Home > Mobile >  Pivoting/Transposing Certain Col's in Pandas DataFrame
Pivoting/Transposing Certain Col's in Pandas DataFrame

Time:04-29


I'm currently working with large .csv files with dates assigned to column headers, repeat metrics types, and client ID's.

Here is a small sample of data similar to what I am working with...

data = {'Client Id': {0: 123987, 1: 123987, 2: 123987, 3: 9871234, 4: 9871234, 5: 9871234}, 
    'Metrics': {0: 'Spend', 1: 'Views', 2: 'Clicks', 3: 'Spend', 4: 'Views', 5: 'Clicks'},
    '3/1/22': {0: '$0.00', 1: '6', 2: '1', 3: '$20.00', 4: '3', 5: '1'},
    '3/2/22': {0: '$0.00', 1: '3', 2: '2', 3: '$10.00', 4: '2', 5: '2'},
    '3/3/22': {0: '$0.00', 1: '3', 2: '3', 3: '$3.00', 4: '5', 5: '4'}}

df = pd.DataFrame(data)

Currently, my dataframe looks like this...


Client Id Metrics 3/1/22 3/2/22 3/3/22
123987 Spend $0.00 $0.00 $0.00
123987 Views 6 3 3
123987 Clicks 1 2 3
9871234 Spend $20.00 $10.00 $3.00
9871234 Views 3 2 5
9871234 Clicks 1 2 4

I need to pivot the date columns and set them as rows in the first col. I also need to break out the unique 'Metrics' types and display them as col headers (with each row displaying their corresponding values).


My goal is to have my final DF look something like this...


Date Client Id Spend Views Clicks
3/1/2022 123987 $0.00 6 1
3/2/2022 1234567 $0.00 3 2
3/3/2022 1234567 $0.00 3 3
3/1/2022 9871234 $20.00 3 1
3/2/2022 9871234 $10.00 2 2
3/3/2022 9871234 $3.00 5 4

I tried using df.melt() to no avail, but possibly I am missing something?

Any help is appreciated!

CodePudding user response:

First melt the date columns, then pivot:

new_df = (
    df
    .melt(id_vars=['Client Id', 'Metrics'], var_name='Date')
    .pivot(columns='Metrics', index=['Client Id', 'Date'], values='value')
    .reset_index()
    .rename_axis(None, axis=1)
)

Output:

>>> new_df
   Client Id    Date Clicks   Spend Views
0     123987  3/1/22      1   $0.00     6
1     123987  3/2/22      2   $0.00     3
2     123987  3/3/22      3   $0.00     3
3    9871234  3/1/22      1  $20.00     3
4    9871234  3/2/22      2  $10.00     2
5    9871234  3/3/22      4   $3.00     5

CodePudding user response:

You can use:

(df.pivot(index='Client Id', columns='Metrics')
   .stack(0)
   .reset_index()
   .rename({'level_1': 'Date'})
   .rename_axis(None, axis=1)
)

output:

   Client Id level_1 Clicks   Spend Views
0     123987  3/1/22      1   $0.00     6
1     123987  3/2/22      2   $0.00     3
2     123987  3/3/22      3   $0.00     3
3    9871234  3/1/22      1  $20.00     3
4    9871234  3/2/22      2  $10.00     2
5    9871234  3/3/22      4   $3.00     5
  • Related