Renaming scientific paper PDFs from one name pattern to another name pattern

Time:10-20

I am trying to automate the renaming of PDFs of scientific papers from one name pattern to another using Python.

The PDFs are currently named like this:

Cresswell, K., Worth, A., & Sheikh, A. (2011). Implementing and adopting electronic health record systems. Clinical governance- an international journal.

i.e. "LastName1, FirstLetterGivenName1., LastName2, FirstLetterGivenName2., [...]. (Year). Title. Journal."

This example should be renamed to look like this:

Cresswell_K_2011_Implementing and adopting

i.e. "LastName1_FirstLetterGivenName1_Year_First3WordsOfTitle"

Sadly I was unable to apply the solutions to similar problems to this specific one, as I am just starting to code.

CodePudding user response:

You can use a regular expression, for example:

import re

s = "Cresswell, K., Worth, A., & Sheikh, A. (2011). Implementing and adopting electronic health record systems. Clinical governance- an international journal."

# Named groups: first author's last name and given-name initial, the year in
# parentheses, and the first three words of the title.
p = re.compile(r'(?P<LastName1>[A-Za-z]+),\s+(?P<GivenName1>[A-Za-z]+)\.?,.+\((?P<Year>\d+)\)\.\s+(?P<Title1>\w+)\s(?P<Title2>\w+)\s(?P<Title3>\w+)')
m = p.search(s)
if m is not None:
    d = m.groupdict()
    # Join the captured pieces into the new name.
    result = d['LastName1'] + '_' + d['GivenName1'][0] + '_' + d['Year'] + '_' + d['Title1'] + ' ' + d['Title2'] + ' ' + d['Title3']
    print(result)

This gives the output:

Cresswell_K_2011_Implementing and adopting
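The snippet above only transforms a single string, so here is a rough sketch of how the same idea might be applied to the actual files. The helper names `new_stem` and `rename_pdfs`, the `dry_run` flag, and the slightly relaxed pattern are my own assumptions, not part of the answer above. The pattern assumes the title contains no period and that the year has four digits; it also accepts single-author names and titles shorter than three words.

```python
import re
from pathlib import Path

# Assumed, slightly relaxed variant of the pattern above.
PATTERN = re.compile(
    r"^(?P<last>[^,]+),\s*"          # first author's last name
    r"(?P<initial>\w)"               # first letter of the first given name
    r".*?\((?P<year>\d{4})\)\.?\s*"  # skip the other authors up to "(Year)."
    r"(?P<title>[^.]+)"              # title, up to its first period
)


def new_stem(old_stem):
    """Return the new file name (without extension), or None if no match."""
    m = PATTERN.match(old_stem)
    if m is None:
        return None
    title = " ".join(m["title"].split()[:3])  # keep at most three title words
    return f"{m['last'].strip()}_{m['initial']}_{m['year']}_{title}"


def rename_pdfs(folder, dry_run=True):
    """Rename matching PDFs in `folder`; return (old name, new name) pairs."""
    renamed = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        stem = new_stem(pdf.stem)
        if stem is None:
            continue  # file name does not follow the expected pattern
        target = pdf.with_name(stem + ".pdf")
        if target.exists():
            continue  # do not overwrite an existing file
        if not dry_run:
            pdf.rename(target)
        renamed.append((pdf.name, target.name))
    return renamed
```

Calling `rename_pdfs("papers")` with the default `dry_run=True` only returns the planned renames, so you can check them before running it again with `dry_run=False`. The folder name `papers` is just a placeholder.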
