Archive for the 'Scripting' category

NYT: 24 hours to convert 11 million images to PDF

Thursday, November 1, 2007

Derek Gottfrid of the New York Times explains how they converted 11 million TIFF images from the NYT archives to PDF using the Amazon EC2 and S3 services.

I had been using Amazon S3 service for some time and was quite impressed. And in late 2006 I had begun playing with Amazon EC2. So the basic idea I had was this: upload 4TB of source data into S3, write some code that would run on numerous EC2 instances to read the source data, create PDFs, and store the results back into S3. S3 would then be used to serve the PDFs to the general public. It all sounded pretty simple, and that is how I got the folks in charge to agree to such an idea, not to mention that Amazon S3/EC2 is pretty easy on the wallet.

Derek Gottfrid, Self-service, Prorated Super Computing Fun!, nytimes.com

Open in Path Finder from the terminal

Wednesday, May 2, 2007

On OS X, tuck this function into your .bashrc file to open a directory in Path Finder from Terminal.app.

# Open a file or directory in Path Finder ("$1" is quoted so paths with spaces work).
function pathfinder {
	open -a "Path Finder" "$1"
}

Then, in Terminal, open the current directory with:
% pathfinder .

You can also open a specific directory:
% pathfinder /Users/me/Desktop

Since Path Finder can either open different types of documents itself or hand them off to the appropriate application, you can also do the following:
% pathfinder ShellScripting.pdf
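
The function above takes exactly one argument. As a rough sketch only, a slightly more defensive variant could default to the current directory and accept several targets at once. The name pf is hypothetical (it is not part of Path Finder), and the sketch assumes the macOS open command and an installed application named "Path Finder".

```shell
# Hypothetical helper: open zero or more paths in Path Finder.
# Assumes macOS `open` and an application named "Path Finder".
pf() {
	local target
	# With no arguments, fall back to the current directory.
	if [ "$#" -eq 0 ]; then
		set -- .
	fi
	# Quote each argument so paths containing spaces survive intact.
	for target in "$@"; do
		open -a "Path Finder" "$target"
	done
}

# Usage (not executed here):
#   pf                        # current directory
#   pf ~/Desktop "My Docs"    # several targets at once
```

Looping over "$@" keeps every argument quoted, which is the main difference from the one-argument version.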

Trapping Errors with simplexml for Not Well-Formed XML

Wednesday, February 7, 2007

I discovered the hard way that PHP5 offers no obvious way to detect whether some XML is well-formed, especially if you want to deploy on both Unix and Windows platforms and don’t want to access the shell directly.

Adding to the problem, I also discovered that the DOM and simplexml extensions don’t let you use PHP5 exception handling to trap errors when the XML is not well-formed. When you run either extension against XML that is not well-formed, the errors it generates are not trapped and are displayed immediately.

With the DOM or Tidy extensions, it is possible to load XML that is not well-formed and then repair it on the fly. But what if you need to detect that the XML is not well-formed and provide a message stating the error?

Fortunately, after some research, I found that you can use the libxml functions (PHP 5.1 and later) to test XML well-formedness and trap XML errors. So I whipped up this little function called get_xml_object (see here for the inspiration) that allows me to trap errors when simplexml is used to parse XML. The function is quite simple: by default, you provide the path to an XML file. If you want to parse a string instead, just add a second argument after the first one (it can be anything other than “file”, but here I chose “string” for clarity’s sake). You can also replace the simplexml extension with the DOM extension if you prefer it for parsing XML.

The function get_xml_object returns an array with two keys, errors and xml. In this example, $result = get_xml_object($s, "string"), so $result is an array. If there are no errors, $result['errors'] is null and $result['xml'] contains a simplexml object that you can then manipulate with the simplexml extension. Otherwise, $result['errors'] holds a message describing what went wrong and $result['xml'] stays null.

<?php
// Deliberately not well-formed: the opening "<" is missing.
$s = "tag>hello world</tag>";
// Well-formed alternative:
// $s = "<tag>hello world</tag>";

function get_xml_object ($xml, $xmlFormat = "file") {

  $result = array ("errors" => null, "xml" => null);

  // Collect libxml errors internally instead of letting PHP display them.
  libxml_use_internal_errors (true);
  $xml_object = ($xmlFormat == "file") ? simplexml_load_file ($xml)
                                       : simplexml_load_string ($xml);

  // Compare strictly with false: a valid but empty element also evaluates to false.
  if ($xml_object === false) {
     $error_msg = "";
     foreach (libxml_get_errors() as $error) {
         // Append every error instead of keeping only the last one.
         $error_msg .= "Error: line: " . $error->line
                     . ": column: " . $error->column . ": "
                     . $error->message . "\n";
     }
     libxml_clear_errors();
     $result["errors"] = $error_msg;
  } else {
    $result["xml"] = $xml_object;
  }
  return $result;
}

$result = get_xml_object ($s, "string");

if ($result['errors']) {
  var_dump ($result['errors']);
} else {
  var_dump ($result['xml']);
}

Path Finder: Script to Avoid Warning when Closing a Window with Tabs

Friday, January 5, 2007

With Path Finder 4.6.1, when you close a window with tabs, you get an alert asking whether you really want to close all the tabs. This is very annoying. It has been mentioned on the Path Finder forum, and a fix is supposed to be on the developers’ to-do list. Meanwhile, you can use this AppleScript to get rid of the warning. Just put the script in your AppleScript menu or, better, in your FastScripts menu, set a shortcut (mine is control-w), and there you go. It’s not ideal, and I would rather attach this script to the File menu of Path Finder, but for that you need PreFab UI Actions, because Path Finder is not “AppleScript attachable”. Well, that’s another reason tempting me to buy this app, along with PreFab UI Browser.

tell application "Path Finder"
  activate
  try
    -- Keep a reference to the main window so that only its tabs get closed.
    set allWindows to every finder window
    set mainWindow to item 1 of allWindows
    set go to true
    repeat while go
      try
        -- Asking for the name fails, or returns "", once the window is gone.
        set mainWindowName to name of mainWindow as string
        if mainWindowName is equal to "" then return
        -- Close the current tab with Command-W.
        tell application "System Events"
          keystroke "w" using command down
        end tell
      on error
        return
      end try
    end repeat
  on error
    return
  end try
end tell

The only way I have found is to ask, on each pass, for the name of the main window; if the name no longer exists, there is nothing left to process. Because we took a reference to the main window with set mainWindow to item 1 of allWindows, we can be sure that we won’t close a Path Finder window that sits behind the main Path Finder window.