Home > OS >  Use rows values from a pandas dataframe as new columns label
Use rows values from a pandas dataframe as new columns label

Time:05-07

If I have a pandas dataframe it's possible to get values from a row and use it as a label for a new column?

I have something like this:

| Team| DateTime| Score
| Red| 2021/03/19 | 5
| Red| 2021/03/20 | 10
| Blue| 2022/04/10 | 20

I would like to write this data on a new dataframe that has:

Team Column

Year/Month SumScore Column

So I would have a row per team with multiple new columns for a month in a year that contains the sum of the score for a specific month. It should be like this:

Team 2021/03 2022/04
Red 15 0
Blue 0 20

The date format time is YYYY/MM/DD

I hope I was clear

CodePudding user response:

You can use

df = (df.assign(YM=df['DateTime'].str.rsplit('/', 1).str[0])
      .pivot_table(index='Team', columns='YM', values='Score', aggfunc='sum', fill_value=0)
      .reset_index())
print(df)

YM  Team  2021/03  2022/04
0   Blue        0       20
1    Red       15        0

CodePudding user response:

We can use pd.crosstab which allows us to

Compute a simple cross tabulation of two (or more) factors

Below I've changed df['DateTime'] to contain year/month only.

df['DateTime'] = pd.to_datetime(df['DateTime']).dt.strftime('%Y/%m')
pd.crosstab(
    df['Team'],
    df['DateTime'],
    values=df['Score'],
    aggfunc='sum'
).fillna(0)

If you don't want multiple levels in the index, just use the method call reset_index on your crosstab and then drop DateTime.

  • Related