Writing a function to shorten the DataFrame code

Time:12-16

I want to use a function to shorten my pandas code. How should I write such a function and call it for an individual country?

Scotland = 'Scotland'
Finland = 'Finland'
sc = df.query("Area == @Scotland and Element == 'Import Quantity'")
fin = df.query("Area == @Finland and Element == 'Import Quantity'")
# etc.

sc_v = df.query("Area == @Scotland and Element == 'Import Value'")
fin_v = df.query("Area == @Finland and Element == 'Import Value'")
# etc.

There are a bunch of tutorials on using functions (and classes), but none of them show how to use them with a dataframe (df). In this case, I just want to avoid repeating the "and Element == 'Import Quantity'" and "and Element == 'Import Value'" strings, and I need a cleaner way of writing the code, since there are plenty of countries and I will have to analyze each of them and, at the end, merge and visualize the results.

My code is not advanced.

CodePudding user response:

What you want exactly is unclear, but a programmatic option could be to use groupby and a dictionary comprehension:

import pandas as pd

# example input
df = pd.DataFrame({'Area': ['Scotland', 'Finland', 'Sweden', 'Scotland', 'Finland', 'Sweden'],
                   'Element': ['Import Quantity', 'Import Quantity', 'Import Quantity',
                               'Import Quantity', 'Import Value', 'Other'],
                   'Other': [1, 2, 3, 4, 5, 6]
                  })
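# map each Element to a key suffix; unmapped values (here 'Other') become NaN
# and those rows are dropped by groupby by default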
elem_grp = df['Element'].map({'Import Quantity': '', 'Import Value': '_v'})

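# split the dataframe into one sub-frame per (area, suffix) pair;
# keys use the first three letters of the area, so areas sharing a
# prefix (e.g. Austria and Australia) would overwrite each other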
out = {f'{a.lower()[:3]}{e}': g for (a,e), g in
       df.groupby(['Area', elem_grp])}

Then:

out['fin_v']
      Area       Element  Other
4  Finland  Import Value      5
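
If a plain function is closer to what the question asks for, a minimal sketch could look like the following. The name select_rows and its parameters are hypothetical (they are not from the question or from pandas), and the example reuses the same input as above. It uses a boolean mask instead of query, which is a swapped-in technique that should give the same rows here.

```python
import pandas as pd


def select_rows(df, area, element="Import Quantity"):
    """Hypothetical helper: rows of df for one area and one element."""
    mask = (df["Area"] == area) & (df["Element"] == element)
    return df.loc[mask]


# same example input as above
df = pd.DataFrame({
    "Area": ["Scotland", "Finland", "Sweden", "Scotland", "Finland", "Sweden"],
    "Element": ["Import Quantity", "Import Quantity", "Import Quantity",
                "Import Quantity", "Import Value", "Other"],
    "Other": [1, 2, 3, 4, 5, 6],
})

# one call per country, no repeated filter strings
sc = select_rows(df, "Scotland")
fin_v = select_rows(df, "Finland", "Import Value")

# loop over several countries and stack the results for later analysis
countries = ["Scotland", "Finland"]
quantities = pd.concat([select_rows(df, c) for c in countries])
```

With many countries, the list in countries could instead come from df['Area'].unique(), and the stacked frame can then be passed on to whatever merge or plotting step comes next.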