Home > OS >  python - transpose rows to columns, and calculate new column value as sum of another column
python - transpose rows to columns, and calculate new column value as sum of another column

Time:04-21

I have a dataframe below:

df = pd.DataFrame({'channels' : ['EMAIL','FAX','MAIL','PHONE','Marketing','SMS','VISIT','Profiling','Approved_Email','EMAIL','FAX','MAIL','PHONE','Marketing','SMS','VISIT','Profiling','Approved_Email_vod'],
                   'ID' : [1001, 1002, 1003, 1004, 1005, 1006, 1001, 1002, 1003, 1004, 1005, 1006, 1001, 1002, 1003, 1004, 1005, 1006],
                   'INTR_COUNT' : [1,1,1,1,1,1,1,2,3,4,5,6,1,2,3,4,5,6],
                   'PERSONA' : ['A','B','C','A','B','C','A','B','C','A','B','C','A','B','C','A','B','C']})

What i want to do is define a function that will take a df such as the above and create new columns based on the unique categories in 'channels'.

ID EMAIL FAX MAIL PHONE SMS VISIT Marketing Approved_email Persona
1001 1 0 0 1 0 1 0 0 A

And so on for the rest of the IDs

I've written this function so far:

def channel_pivot(df: pd.DataFrame):
    #where df is the df stated above
    x = df
    #subsetting to pivot only on focus cols
    y = df[['channels', 'INTR', 'HCP']] 
    #pivot operation
    y = set_index('ID')
    y1 = y.pivot(columns='channels', values = sum('INTR')).apply(lambda x: pd.Series(x.dropna().values))
    df1 = y1.merge(x, left_index=True, right_on='ID')
    return df1

I can't get the sum function to work inside the pivot function though, so how can I sum up the count of interactions for a given ID for each channel?

CodePudding user response:

You can use .groupby() with .agg():

x = (
    df.groupby(["ID", "channels"])
    .agg({"INTR_COUNT": "sum", "PERSONA": "first"})
    .set_index("PERSONA", append=True)
    .unstack(level=1)
    .droplevel(0, axis=1)
    .fillna(0)
    .astype(int)
    .reset_index()
)
x.columns.name = None
print(x)

Prints:

ID PERSONA Approved_Email Approved_Email_vod EMAIL FAX MAIL Marketing PHONE Profiling SMS VISIT
0 1001 A 0 0 1 0 0 0 1 0 0 1
1 1002 B 0 0 0 1 0 2 0 2 0 0
2 1003 C 3 0 0 0 1 0 0 0 3 0
3 1004 A 0 0 4 0 0 0 1 0 0 4
4 1005 B 0 0 0 5 0 1 0 5 0 0
5 1006 C 0 6 0 0 6 0 0 0 1 0
  • Related