One of the (welcome) additions to version 2 of Safari, included with Tiger, is the ability to save an entire Web page (text, images, and all) for offline viewing. You perform this task by viewing the desired Web page, choosing Save As from Safari's File menu, and then choosing Web Archive from the Format pop-up menu in the Save dialog. Open the resulting .webarchive file in Safari and it will (roughly) look as if you were viewing the page normally via the Internet.

This is a great feature; however, it has two downsides. The first is that these Web archives can be viewed only in Safari; you can't open them in another browser. (An inconvenience if you, or someone to whom you send an archive, prefers to use a browser other than Safari; a show-stopper if you send an archive to someone using anything other than Tiger, including Windows.) The second is that if you ever need to get at any of the content of an archive (images or text, for example) you must use Safari to first open the archive, then grab the content from there. (See downside #1.)

I'm a big fan of the Safari webarchive format, and I often find it better than converting to PDF. I use webarchive as a way to read documents… but also to archive. I know that QuickLook, Safari and (in a limited way) TextEdit can read this binary plist file. I also know that Spotlight, mdfind and maybe other tools can search inside this format.

I do not like having to import the webarchive into Safari just to extract or copy text, so I was thinking about using Apple's textutil command to convert or cat to txt format and then do string matching. I also found that doc, docx and wordml give very good output in TextEdit. Very interesting if I need to edit and later… print. These formats are closer to rtf. If I do it with textutil, everything is done in the background, and that is great.

Here is a quick AppleScript…

set thePath to POSIX path of (path to desktop as alias) & "myArchive.webarchive"
set out to do shell script "textutil -cat txt " & quoted form of thePath & space & "-stdout " & "|" & "pbcopy"

The result in Script Editor is not the same as when I run it directly on the command line… I do understand I have to clean the code somehow… hmmm

What would be the best approach to search in a webarchive, find matches, and extract text from it?

Mark, textutil has a cat function, and if there are multiple files it will concatenate them and… That could be a good idea before doing any search. I realize textutil is a very powerful tool…

The thing that cost me hours was the text output from textutil: it looked fine in Terminal, but it was not correct when I used pbcopy to pipe the text back to Script Editor. The thing was… the Terminal shell uses utf-8 by default, but the pbcopy command does not. If I include this line before the textutil command, the text is the same as in Terminal:

export __CF_USER_TEXT_ENCODING=0x1F5:0x8000100:0x8000100

So what does that line do… it changes pbcopy to use utf-8 encoding. This may help if anyone has the same problem when using pbcopy… or uses pbcopy to pipe a bash script to AppleScript.

Update of my script, now everything looks okay in Script Editor at least.

set thePath to POSIX path of (path to desktop as alias) & "myArchive.webarchive"
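The textutil and pbcopy steps discussed in the post can be sketched as two small shell functions. This is only a sketch under stated assumptions, not the poster's final script: the function names `webarchive_to_clipboard` and `webarchive_grep` are hypothetical, the environment variable is spelled with two leading underscores (the spelling CoreFoundation is generally reported to read), and the value 0x1F5 appears to be the user ID 501 in hex, so it may need adjusting on another account. Nothing runs until one of the functions is called, and both need macOS, where textutil and pbcopy are available.

```shell
#!/bin/sh
# Hypothetical helpers; neither name is part of textutil or pbcopy.

# Convert one webarchive to plain text and put it on the clipboard as UTF-8.
webarchive_to_clipboard() {
    (
        # Run in a subshell so the export does not leak into the caller.
        # Assumption: 0x1F5 is uid 501 in hex; 0x8000100 is the
        # CoreFoundation constant for UTF-8.
        export __CF_USER_TEXT_ENCODING=0x1F5:0x8000100:0x8000100
        textutil -cat txt "$1" -stdout | pbcopy
    )
}

# Search one or more webarchives for a pattern (case-insensitive).
# textutil -cat concatenates all given files into a single text stream,
# so the line numbers printed by grep refer to that combined stream.
webarchive_grep() {
    pattern="$1"
    shift
    textutil -cat txt "$@" -stdout | grep -i -n -- "$pattern"
}

# Usage (macOS only):
#   webarchive_to_clipboard "$HOME/Desktop/myArchive.webarchive"
#   webarchive_grep "safari" "$HOME/Desktop/myArchive.webarchive"
```

From AppleScript, the same pipeline could be passed as a single string to do shell script, which is why the export has to be part of the command itself rather than something set earlier in Terminal.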