11.27.07
The power of little tools
I always come back to the power of Unix tools, and the ideology of using many small tools in stages to achieve a goal rather than crafting a monolithic solution. Tonight was a great example of how a little automation can save a huge amount of time.
My dad has a massive stack of Excel spreadsheets called Book1.xls, Book2.xls etc. Each is representative of a single day, and needs renaming to reflect this. Like any good Windows user, he was opening up each one, finding the line with the date in, closing the spreadsheet, renaming Book43.xls to 07 06 2007.xls, and moving on. Ironically he was working on his Mac for this as it’s more user friendly. So - prime for some automation.
Firstly, all of the books he has renamed already needed renaming in the format YYYY_MM_DD.xls so that they can be ordered easily. Step in a bit of ksh programming.
First things first - copy them all off to one side, and work on the copy. It is very easy to cock up, and cp -pr makes knocking a copy off to one side easy.
I like working from a driver file, so I did an “ls *.xls > list.txt”, then started a shell script that loops through the driver list using “for i in `cat list.txt`”, splits out the three bits of the file name by piping it through awk, and spits them out of the other side in a new order. To do this I used “echo $i | awk -F" " '{print $3 "_" $2 "_" $1}'”. With a little more pipe magic, I used this output to mv all the files around and job done.
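For anyone wanting to try this, here is a rough sketch of that first step rather than the exact script I ran. The function names reorder_name and rename_all are made up for illustration, and it assumes every name in the driver file looks like “DD MM YYYY.xls”. It reads the list with “while read” instead of “for i in `cat list.txt`”, because a plain for loop splits names on the spaces inside them.

```shell
#!/bin/sh
# Sketch only: assumes names of the form "DD MM YYYY.xls".

# Turn "DD MM YYYY.xls" into "YYYY_MM_DD.xls".
reorder_name() {
    # Strip the extension first so it does not stick to the year field.
    base=${1%.xls}
    echo "$base" | awk -F" " '{print $3 "_" $2 "_" $1 ".xls"}'
}

# Rename every file named in the driver file (one name per line).
rename_all() {
    while IFS= read -r old; do
        # Skip blank lines in the driver file.
        [ -n "$old" ] || continue
        new=$(reorder_name "$old")
        mv -- "$old" "$new"
    done < "$1"
}
```

Run it against the copy, not the originals: do the “ls *.xls > list.txt” in the copied directory, then call “rename_all list.txt”.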
Next, and more interesting, renaming all those Book*.xls to YYYY_MM_DD. The Unix command strings pulls ASCII out of binary files, so doing “strings Book1.xls” gives back a whole page of text. One of the lines has “rubbish rubbish Date: 9 September 2007”. This is excellent news, because we can use awk again, splitting by “:”, and then select out column 2 with “awk -F":" '{print $2}'”. This spits back “9 September 2007” which we can then use with awk again, splitting by spaces this time with “awk -F" "”, and use the same trick as the first set of files to rename the Book1.xls to $3_$2_$1 (2007_september_9.xls in this case).
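Again as a sketch rather than the script I actually used, the second step could look something like this. The function names date_to_name and rename_book are hypothetical. It assumes the first line of strings output containing “Date:” is the one we want and that it has no other colon in it. The tr step that lower-cases the month is my assumption, there to match the 2007_september_9.xls example.

```shell
#!/bin/sh
# Sketch only: assumes one "Date: D Month YYYY" line per spreadsheet.

# Turn "rubbish rubbish Date: 9 September 2007" into "2007_september_9.xls".
date_to_name() {
    # First awk keeps what follows the colon, second awk reorders
    # day, month and year, and tr lower-cases the month name.
    echo "$1" |
        awk -F":" '{print $2}' |
        awk -F" " '{print $3 "_" $2 "_" $1 ".xls"}' |
        tr '[:upper:]' '[:lower:]'
}

# Pull the date line out of one spreadsheet and rename the file.
rename_book() {
    line=$(strings "$1" | grep 'Date:' | head -n 1)
    if [ -z "$line" ]; then
        echo "no Date: line found in $1" >&2
        return 1
    fi
    mv -- "$1" "$(date_to_name "$line")"
}
```

Something like “for f in Book*.xls; do rename_book "$f"; done” would then work through the whole stack. Note that month names do not sort in calendar order, so these files will not line up as neatly as the numeric ones.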
OK, so far too much detail in there for the casual user, and I imagine there are a million and one neater ways to do these things if you ask a really good hacker, but the point I was going to make was that this is only possible because of the Unix philosophy of being able to plug together little tools in a million and one combinations. All I used here was ls (to list the files), mv (to move them), strings (to pull the text out of the spreadsheets), awk (to separate things out by different characters) and cat (to read out files). The whole job took about an hour (including all the time to work out how), and saved at least 5 hours. It would scale too, saving 500 hours for the same investment of 1, and it’s just not possible in today’s dumbed down Windows world. Shame.

O said,
November 28, 2007 at 10:05 am
Well…. and it is a bit gash…. and a blatant Perl rip-off, but…. Windows PowerShell can sort of do the same job for you in Windowsland. Not very prettily, but worth a try for interest’s sake alone….
coldclimate said,
November 28, 2007 at 11:40 am
Hmm, news to me, but definitely worth a look. Ironic that Microshite feel the need to add “Power” to the name of what is just called a shell in every other OS.
Ben F-W said,
December 4, 2007 at 5:36 pm
Assume you’ve seen http://www.itwire.com/content/view/15449/1141/? Similar topic, more of a historical context.
Ben F-W said,
December 4, 2007 at 5:40 pm
Hmm. Remove the question mark - it seems to send it to an entirely new link.
http://www.itwire.com/content/view/15449/1141/