Home > database >  Sum columns based on multiple lists with whitespace replacement in Pandas
Sum columns based on multiple lists with whitespace replacement in Pandas

Time:09-25

I want to create three sum columns based on the items from each list. The process is to replace the whitespace with underscore before summing the columns. I was trying to do a loop instead of doing a list comprehension one by one, but I might have missed out something in the loop. How can I achieve my expected result?

import pandas as pd
fruits = ['apple pie', 'watermelon pie', 'banana pie']
places = ['Hong Kong', 'Boston', 'New York']

df = pd.DataFrame({
    'apple_pie': [3, 4, 5],
    'watermelon_pie': [3, 4, 5],
    'New_York': [6, 7, 8]
})

xup = ['fruits', 'places', 'persons']
yup = [fruits, places, persons]

for y in yup:
  for x in xup:
    try:
        df[x]= df[[y.replace(" ", "_") for y in yup]].sum(axis = 1)
    except:
        continue

Expected output:

   apple_pie  watermelon_pie  New_York fruits  places
0          3               3         6      6       6
1          4               4         7      8       7
2          5               5         8     10       8

CodePudding user response:

You need to loop through xup and yup in parallel using zip instead of nesting them:

for sum_col, cols in zip(xup, yup):
    cols = [x.replace(' ', '_') for x in cols]
    df[sum_col] = df[df.columns.intersection(cols)].sum(1)

df    
   apple_pie  watermelon_pie  New_York  fruits  places  persons
0          3               3         6       6       6      0.0
1          4               4         7       8       7      0.0
2          5               5         8      10       8      0.0
  • Related