Over the last weekend I reworked how I handle PDFs, taking into account the many new capabilities of the Zotero literature manager.
As they come in, PDFs usually have neither unique nor meaningful names. Renaming them to their creation date is not very informative. Using the title is a much better option, but titles often contain strange characters or produce overly long file names. The PubMed ID is also nice (and is what I used over the last 5 years), but it is time-consuming to look up and many documents do not have a PubMed ID. Finally I came up with the idea of using the MD5 hash of the title, as it yields a file name that is valid on every operating system and can be computed for nearly every paper.
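To make the hashing idea concrete, here is a minimal sketch in Python. It is only an illustration, not the code behind the tools described in this post; the function name `md5_filename` and the whitespace and case normalization are my own assumptions, added so that small differences in how a title is copied do not change the resulting name.

```python
import hashlib


def md5_filename(title: str) -> str:
    """Return a file name derived from the MD5 hash of a paper title.

    Assumption for this sketch: runs of whitespace are collapsed and the
    title is lowercased before hashing.
    """
    normalized = " ".join(title.split()).lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
    # The hex digest is 32 characters drawn from 0-9 and a-f, so it is
    # a valid file name on every common file system.
    return f"{digest}.pdf"


if __name__ == "__main__":
    print(md5_filename("An Example  Paper Title"))
```

The trade-off is that the name is safe and fixed-length but no longer human-readable, so the mapping from hash back to title has to live somewhere else, for example in the literature database.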
When I discussed this with the BMC people, they suggested using an Author-JournalAbbreviation-Year.pdf name, although this may not always be unique. At least this name can be constructed directly from the Zotero entry.
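A rough sketch of how such a name could be assembled is shown below. It assumes the author, journal abbreviation, and year have already been read from the literature database; the helper names `citation_filename` and `make_unique` are hypothetical, and appending a letter to resolve collisions is just one possible way to deal with the uniqueness problem.

```python
import re
import unicodedata


def _clean(part: str) -> str:
    """Strip accents and drop everything except ASCII letters and digits."""
    ascii_text = (
        unicodedata.normalize("NFKD", part)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^A-Za-z0-9]+", "", ascii_text)


def citation_filename(first_author: str, journal_abbrev: str, year: int) -> str:
    """Build an Author-JournalAbbreviation-Year.pdf style file name."""
    return f"{_clean(first_author)}-{_clean(journal_abbrev)}-{year}.pdf"


def make_unique(name: str, taken: set[str]) -> str:
    """Append a letter (a, b, c, ...) if the name is already in use."""
    if name not in taken:
        return name
    stem = name[: -len(".pdf")]
    for suffix in "abcdefghijklmnopqrstuvwxyz":
        candidate = f"{stem}{suffix}.pdf"
        if candidate not in taken:
            return candidate
    raise ValueError(f"no free variant of {name}")


if __name__ == "__main__":
    name = citation_filename("Müller", "J. Exp. Med.", 2008)
    print(name)                       # Muller-JExpMed-2008.pdf
    print(make_unique(name, {name}))  # Muller-JExpMed-2008a.pdf
```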
While I haven’t made a final decision on what to use as a name for my PDFs, I am now working on two tools:
- The first one renames a PDF immediately after it is downloaded, using the metadata stored in the Zotero database. Test it at PDF rename based on Zotero entry
- The second one will rename a PDF based on its content and is intended for PDFs I already have. It is basically an AJAX page that browses through an archive of PDFs and checks the extracted keywords directly against PubMed, which leads to an entry suitable for Zotero. Test it at PDF rename based on Pubmed entry
I have now renamed about 300 PDFs, which worked pretty well, so there is a good chance that both tools will remain in my toolbox.