[This guest post reprints, with permission, Jason Heppler’s post from Saturday. Jason is a Ph.D. candidate in history at the University of Nebraska, Lincoln, where he is also project manager for the William F. Cody Archive. He’s also the author of The Rubyist Historian. You can follow him on Twitter at @jaheppler.--JBJ]
On a recent episode of his Generational podcast, Gabe Weatherhead spoke with Walton Jones, professor of Behavioral Neurobiology at Korea Advanced Institute of Science and Technology. About twenty minutes into the podcast, they discuss Professor Jones’s system for annotating and summarizing academic papers. Jones has further detailed his academic workflow on his blog, so be sure to give his explanation a read.
I’ve noted before how I manage my PDFs using the filesystem and Open Meta tagging. I’ve tended to maintain my notes in plain text written directly into DEVONthink, but after listening to Weatherhead’s talk with Jones and reading his post I’ve decided to adopt part of his system.
As a scientist, Jones spends much of his time synthesizing the latest research, which normally comes to him as PDFs from journals. Two parts of his system caught my interest: 1) his color-coded annotations and 2) his method of extracting those annotations to plain text. His system uses a different color for each kind of note: green for references, red for summaries, and so on. What really inspired me was his AppleScript, which scans a PDF he has marked up (in either Skim or iAnnotate) and extracts the notes according to his categorization, using Markdown syntax. He then dumps the notes into VoodooPad. Be sure to read his explanation of his system, as my summary doesn’t do it complete justice. [Update: I was remiss in not pointing out that the original AppleScript adapted by Walton Jones came from John Sidiropoulos at his blog OrganoGnosi. His blog has lots of advice on using digital tools for academic research.]
The system relies on an AppleScript that looks for annotations in the PDF and extracts the text into Markdown-formatted plain text. I modified the script slightly for my own needs, and made it available here via GitHub’s Gist service. [JBJ: We can’t yet embed Gists directly, alas! Apologies for any inconvenience.] For the curious, here’s a partial screenshot of the code:
I take my notes in Skim, which would result in something like:
When the script is run on a PDF, it results in a note formatted in Markdown that looks similar to this:
That’s where the other half of the magic in Jones’s system comes in. The note includes not only the text I wanted but also a hyperlink to the page of a particular reference. Once the Markdown is rendered, I can click on the reference and be taken back to the source. My notes used to look similar, often taking a form such as:
That is, I noted the page number in brackets, followed by my notes, thoughts, direct quotes, and so on from a PDF or book.
As I mentioned, my notes were previously entered directly into DEVONthink. But with this new system I’ll be keeping my notes in the same directory as the document I’m taking notes on. From there, DEVONthink will index the directory for easy searching and organizing.
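To make the shape of the output concrete, here is a minimal sketch in Python of the kind of transformation the script performs. It is not the AppleScript itself: the function names and the sample notes are hypothetical, and only the green and red meanings come from Jones’s scheme as described above. Each note becomes a Markdown list item that links back to its page, and the notes file is placed beside the PDF.

```python
from pathlib import Path

# Only these two color meanings come from the post; any other
# color falls back to a generic "Note" label (an assumption).
COLOR_LABELS = {"green": "Reference", "red": "Summary"}


def notes_to_markdown(pdf_name, notes):
    """Build Markdown from (page, color, text) tuples.

    Each note is rendered as a list item whose page number links
    to pdf_name#page=N, so the link resolves relative to the notes file.
    """
    lines = [f"# Notes on {pdf_name}", ""]
    for page, color, text in sorted(notes, key=lambda n: n[0]):
        label = COLOR_LABELS.get(color, "Note")
        lines.append(f"- **{label}** [p. {page}]({pdf_name}#page={page}): {text}")
    return "\n".join(lines) + "\n"


def notes_path_for(pdf_path):
    """Return a notes path beside the PDF, sharing its base name."""
    return Path(pdf_path).with_suffix(".md")
```

Calling `notes_to_markdown("example-article.pdf", [(3, "green", "A source to follow up")])` would yield a single list item linking to `example-article.pdf#page=3`. Keeping the link relative is what ties the notes file to the PDF’s directory.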
Update on 11/29: Readers have pointed out that the hyperlinking to specific pages isn’t working the way it should. The solution, as near as I’ve been able to replicate the problem, shows just how fragile this system is, unfortunately.
Walton Jones had to work around the problem by writing his own custom URL scheme. You may need to adopt his system to get everything working. But Skim seems to handle page numbers without any problems, at least for me. There are a few things to bear in mind when using the script: 1) the notes must sit in the same directory as the PDF, and 2) the PDF’s filename must exactly match the filename in the note’s link (which the script should handle for you). So, for example, example-article.pdf#page=3 must correspond exactly to example-article.pdf, and that file must reside in the same directory as the notes; otherwise the link has no way of locating it. Also, be sure that the filename contains no spaces, or the Markdown linking will not work.
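These constraints are easy to check mechanically. The following is a rough sketch, not part of the original script: `check_links` and its regular expression are my own hypothetical helpers, and they assume links of the form `](name.pdf#page=N)` as in the example above. It reports any link whose filename contains a space, points outside the notes directory, or names a PDF that isn’t there.

```python
import re
from pathlib import Path

# Matches the target of links such as [p. 3](example-article.pdf#page=3).
LINK = re.compile(r"\]\(([^)#]+\.pdf)#page=(\d+)\)")


def check_links(markdown_text, notes_dir):
    """Return a list of human-readable problems with PDF links in the notes."""
    problems = []
    for target, _page in LINK.findall(markdown_text):
        if " " in target:
            problems.append(f"{target}: filename contains a space")
        if "/" in target:
            problems.append(f"{target}: link points outside the notes directory")
        elif not (Path(notes_dir) / target).is_file():
            problems.append(f"{target}: no such PDF beside the notes")
    return problems
```

An empty list means every link it recognized satisfies the conditions; it says nothing about whether a given PDF viewer will actually honor the `#page=` fragment.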
The other area that makes the system tricky to use is the way I’m using it. When I transform the text notes into Markdown, I save the resulting markup as a PDF (either transformed in Marked.app or fed through wkpdf). The PDF file of my notes is opened in Skim, which can handle linking back to the article because all of these actions are happening within the same application. In other words, if you are planning on using the hyperlinking system as I use it, you will need to contain all activity in Skim. Otherwise, you may need to look into Walton Jones’s custom URL scheme. Relatedly, you might check out John Sidiropoulos’s brilliant post on DEVONthink and hyperlinks.