Datamining with Devonthink Pro Office

Last semester I was once again extremely grateful to have Devonthink to help me out with a research paper.
For those who don’t know, Devonthink is a personal database you keep on your Mac. You throw your pdfs, documents, links and e-mails into it and leverage its powerful AI and search technologies to help with research and writing. It looks a bit clunky, and kills my Powerbook’s CPU, but it is an invaluable program if you do any amount of research.
I had a public policy research paper that required me to investigate the Australia-United States Free Trade Agreement and determine whether policy networks were at work in relation to the Pharmaceutical Benefits Scheme (they were, but only weakly).
There was a Senate Select Committee and Joint Standing Committee report on the FTA, the submissions of which would have been an absolute gold mine for researching the positions of various players and their connections. The problem? There were over 700 separate submissions, some of which were just scans of letters.
Thankfully, I had Devonthink to help me out.
First, Devonthink lets you give it a URL, and it will automatically find and download everything linked from that page.

So I fed it the Committee site that had links to all the submissions. Devonthink then went to work downloading over 700 pdfs for me.
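For anyone without Devonthink, the link-gathering step can be roughly sketched in Python. This is only an illustration of the idea, not how Devonthink does it internally; the page content and the example.org address are made up, and `LinkCollector` and `pdf_links` are names I have invented for the sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def pdf_links(html, base_url):
    """Return absolute URLs of all links on the page that end in .pdf."""
    collector = LinkCollector()
    collector.feed(html)
    # Resolve relative links against the address of the page itself.
    absolute = [urljoin(base_url, href) for href in collector.hrefs]
    return [url for url in absolute if url.lower().endswith(".pdf")]


# Hypothetical page content and address, for illustration only.
sample_page = """
<a href="subs/sub001.pdf">Submission 1</a>
<a href="/about.html">About the committee</a>
<a href="http://example.org/subs/sub002.PDF">Submission 2</a>
"""
links = pdf_links(sample_page, "http://example.org/committee/index.html")
print(links)
```

Each address in `links` could then be fetched with something like `urllib.request.urlretrieve`, ideally with a short pause between requests so the server is not hammered.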

Second, Devonthink Pro Office has built-in OCR software, so despite the best efforts of Parliament to make their documents unsearchable, after running the OCR I had a massive searchable database of submissions.

Third, I used Devonthink’s search function to limit my enquiry to only those submissions that mention the PBS. This cut back the number of pdfs to only around 200 - down from 700! Quite a time saver.
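The filtering step boils down to keeping only the documents whose text mentions a term. A minimal Python sketch of that idea, assuming the text of each submission has already been extracted into a dictionary (the file names and sentences here are invented, and `mentioning` is a hypothetical helper, not a Devonthink feature):

```python
import re


def mentioning(docs, terms):
    """Return the names of documents whose text contains any of the terms."""
    # Whole-word, case-insensitive match on any of the search terms.
    pattern = re.compile(
        "|".join(r"\b" + re.escape(term) + r"\b" for term in terms),
        re.IGNORECASE,
    )
    return [name for name, text in docs.items() if pattern.search(text)]


# Invented sample texts standing in for the OCR'd submissions.
docs = {
    "sub001": "We are concerned about the impact on the PBS and medicine prices.",
    "sub002": "Our submission addresses sugar quotas only.",
    "sub003": "The Pharmaceutical Benefits Scheme must be protected.",
}
relevant = mentioning(docs, ["PBS", "Pharmaceutical Benefits Scheme"])
print(relevant)
```

Searching for both the abbreviation and the full name matters, since not every submission will use the same one.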
Finally, I also used the search function to see if any of the players were mentioning each other in their submissions, which allowed me to establish whether there were any links between the different groups.
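The cross-referencing idea can also be sketched in Python: for each submission, check whether it names any of the other players. This is a simplified stand-in for the searches I ran by hand; the organisation names and texts below are invented, and `cross_mentions` is a hypothetical helper.

```python
def cross_mentions(submissions):
    """Return (author, mentioned) pairs where one player's text names another."""
    links = []
    for author, text in submissions.items():
        lowered = text.lower()
        for other in submissions:
            # Plain case-insensitive substring match on the player's name.
            if other != author and other.lower() in lowered:
                links.append((author, other))
    return links


# Invented players and texts, keyed by the organisation that wrote them.
submissions = {
    "Alpha Council": "We support the position put forward by the Beta Institute.",
    "Beta Institute": "Our analysis stands on its own.",
    "Gamma Union": "Like the Alpha Council and the Beta Institute, we oppose the changes.",
}
pairs = cross_mentions(submissions)
print(pairs)
```

A plain substring match like this will miss abbreviations and alternative spellings of a name, so the results are a starting point for reading rather than a finished map of the network.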
All of this cut down on the amount of time I had to waste trawling through submissions manually, giving me more time to actually think about and write the paper. I ended up getting a good mark thanks to being able to quickly find all the relevant documents and linkages between them.
Now if only I could find a good referencing program (one that actually saves time rather than making more work!), my student life would be a lot easier.

