I recently had to scan a large number of documents dating back almost ten years. Once the scans were complete, Spotlight dutifully re-indexed everything, and suddenly it was impossible to find anything. Most of the files were monthly statements, and for virtually all of them the current statements were superseded in my search results by documents from many years ago.

Fortunately, having gone through this before, I already knew that if I named the files in a standard format, they could easily be stamped with their historic dates:

YYYYMMDD_MY_FILENAME.PDF

If you are familiar with the UNIX touch command, the YYYYMMDD prefix is most of what its -t option needs to seed a timestamp change: append a four-digit time and you have a valid timestamp. For example, if you want a file to match the date we landed on the moon:

touch -t 196907200000 YOUR_FILENAME.TXT
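
To make the format a little more concrete, here is a minimal sketch of how the -t argument is put together. The full form is [[CC]YY]MMDDhhmm[.SS], so the date from the file name comes first, then the time of day, then optional seconds. The file name and the variable names below are made up for illustration.

```shell
# Sketch: compose a touch -t argument as YYYYMMDD + hhmm, with optional .SS.
# The file name and variable names here are made up for illustration.
f="${TMPDIR:-/tmp}/20140131_example_statement.pdf"
day=20140131    # YYYYMMDD, taken from the file name prefix
hhmm=0930       # time of day in local time; 0000 would mean midnight

touch -t "${day}${hhmm}.15" "$f"   # 31 Jan 2014, 09:30:15 (creates the file if missing)
ls -l "$f"                         # the listing should show the 2014 date
```

Checking the result with ls -l (or Get Info in the Finder) before running anything in bulk is a cheap way to confirm the date landed where you expected.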

The last four digits represent the time as hhmm, and I guess if you were motivated enough this could be tuned somewhat. The final result is a single-line script that descends from the current directory looking for PDF files whose names begin with the 8-digit pattern, then turns each prefix into a touch command that sets the file's modification date:

find . -name '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_*.pdf' -print | while IFS= read -r file; do dt="$(basename "$file" | cut -c1-8)0000"; echo "Setting $file to $dt"; touch -t "$dt" "$file"; done
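
If a one-liner is a bit dense, the same idea can be sketched as a small shell function with a dry-run switch, so you can see what would change before touching anything. This is only a sketch under a few assumptions: the names restamp and DRY_RUN are my own inventions, the time of day is fixed at midnight, and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# Sketch: restamp PDFs named YYYYMMDD_something.pdf with the date in their name.
# restamp and DRY_RUN are illustrative names, not part of any standard tool.

restamp() {
    dir="${1:-.}"    # directory to descend; defaults to the current one
    find "$dir" -type f -name '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_*.pdf' |
    while IFS= read -r file; do
        name=${file##*/}          # strip the directory part
        stamp="${name%%_*}0000"   # YYYYMMDD prefix + hhmm (midnight)
        if [ -n "$DRY_RUN" ]; then
            echo "Would set $file to $stamp"
        else
            echo "Setting $file to $stamp"
            # touch rejects impossible dates (e.g. 20141340), so report and move on
            touch -t "$stamp" "$file" || echo "Skipped $file" >&2
        fi
    done
}

# Example (the path is a placeholder):
#   DRY_RUN=1 restamp "$HOME/Documents/Scans"   # preview only
#   restamp "$HOME/Documents/Scans"             # apply the timestamps
```

Quoting "$file" everywhere is the main difference from a quick one-liner: scanned statements tend to pick up spaces in their names, and an unquoted variable would split those into several arguments.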

Not only does this help with searching, it is useful for sorting results when date/time are involved.

Finally, after scanning almost a thousand documents with PDFScanner, I have found this application indispensable. For anyone familiar with paper document capture, scanning is only half the battle: you may need to deskew, clean up a background, OCR and re-duplex double-sided pages. PDFScanner does all of this automatically; all you need to do is feed the documents in and press Save every so often.

Categories: Apple, Linux