Home > Enterprise >  Sum columns to one value
Sum columns to one value

Time:05-21

I have a dataset with data of a set of bands who play in the same two-city festival. I aim to create a pandas dataframe of two colums, Band and total attendance. but I am unable to use the df.groupby argument as the same bands are mentioned in two different columns. Any tips? Below is an example of how the dataset looks.

F.ex
Time   Band_in_London    Band_in_Reading   Attendance_London   Attendance_Reading
 12    Kasabian          Queen             3000                4000
 13    Foo Fighters      Beatles           1000                5000
 14    U2                A-ha              2500                2500
 15    Queen             Kasabian          2200                1000
 16    Beatles           Foo Fighters      1300                4700
 17    A-Ha              U2                4000                3100
 18

CodePudding user response:

Using pd.wide_to_long:

import pandas as pd

# Notice there is a typo for the A-ha.
df = df.replace({'A-ha':'A-Ha'})

attendance = (pd.wide_to_long(df, ['Attendance', 'Band_in'], 'Time', 'City', sep='_', suffix=r'\w ').
              groupby('Band_in').sum()
             )
print(attendance)
              Attendance
Band_in                 
A-Ha                6500
Beatles             6300
Foo Fighters        5700
Kasabian            4000
Queen               6200
U2                  5600

CodePudding user response:

df = original_dataframe[['Band_in_London', 'Attendance_London']]
df = df.rename({'Band_in_London': 'Band', 'Attendance_London': 'Attendance'}, axis=1)
df = df.append(
     original_dataframe[['Band_in_Reading', 'Attendance_Reading']].rename(
     {'Band_in_Reading': 'Band', 'Attendance_Reading': 'Attendance'}, axis=1))
df = df.groupby(['Band']).sum()
  • Related