I mostly read scientific papers or technical articles, which I often download as PDFs from arXiv or journal websites. Most PDF files are not descriptively named, so I always end up with folders full of cryptically named files.
To automatically rename the PDF files according to their BibTeX information (authors, year, title, journal, …), I tried to use the Emacs org-ref functionality in the past and also wrote my own “bibtex-helper (hbib)” tool in Haskell. However, I always wanted a simpler command-line tool for this job. One day I found the Python program pdf-renamer, which uses two other packages by the same author, pdf2doi and pdf2bib, to query reference information online and create BibTeX entries.
The options for the renaming format were not as freely customizable as I would have liked, but the underlying libraries are easy to use in Python scripts, so one can quickly write the desired functionality oneself.
I simply installed the package via pip:
pip install pdf2bib
… then made the following script executable ($ chmod +x …) and copied it to ~/.local/bin/pdfrename so that it is on the PATH.
#!/usr/bin/env ipython
import argparse
from pathlib import Path

import pdf2bib

parser = argparse.ArgumentParser(
    prog="pdfrename",
    description='Auto rename PDF files into "{FirstAuthorLastname}{Year}_{Title}.pdf"',
    epilog="Based on Michele Cotrufo's pdf2doi, pdf2bib, and pdf-renamer",
)
parser.add_argument("filename")


def main():
    args = parser.parse_args()
    filename = args.filename

    # Look up the bibliographic metadata for the PDF online
    result = pdf2bib.pdf2bib_singlefile(filename)
    metadata = result["metadata"]

    # First author's last name, year, and the title words joined by "-"
    new_filename = f"{metadata['author'][0]['family'].lower()}{metadata['year']}_{'-'.join([w.lower() for w in metadata['title'].split()])}.pdf"

    print(filename)
    print("-->")
    print(new_filename)

    # Note: the new name is relative to the current working directory
    Path(filename).rename(new_filename)
    return


if __name__ == "__main__":
    main()
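The script builds the new name by splitting the title on whitespace, so characters such as "/" or ":" in a title end up in the filename, where "/" would be read as a directory separator. One possible variation, sketched below, moves the name-building into a small function that keeps only word characters. The function name build_filename and the example metadata are made up for illustration; the only assumption about pdf2bib is the metadata shape the script already relies on (an 'author' list of dicts with a 'family' key, plus 'year' and 'title').

```python
import re


def build_filename(metadata: dict) -> str:
    """Build '{firstauthorlastname}{year}_{title-words}.pdf' from a metadata dict.

    Hypothetical helper: assumes metadata['author'] is a list of dicts with a
    'family' key and that 'year' and 'title' are present, as in the script above.
    """
    # Lowercase the family name and drop non-word characters (spaces, apostrophes).
    family = re.sub(r"\W", "", metadata["author"][0]["family"].lower())
    year = metadata["year"]
    # Keep only runs of word characters from the title, so "/" and ":" vanish.
    words = re.findall(r"\w+", metadata["title"].lower())
    return f"{family}{year}_{'-'.join(words)}.pdf"


# Made-up example entry, not a real reference.
example = {
    "author": [{"family": "Doe", "given": "Jane"}],
    "year": 2020,
    "title": "Spin/orbit coupling: a review",
}
print(build_filename(example))
```

In the script, the long f-string could then be replaced by new_filename = build_filename(metadata). Because the function needs no network access, it can be tried out on hand-written dictionaries before renaming any real files.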