Home > Enterprise >  How to solve issue with groupby on a dataframe
How to solve issue with groupby on a dataframe

Time:11-04

I wish to have a final dataframe with a index of projects, a list of exchanges and a last column for price.

Here is an example:

data = {'Exchange': ['coinbase', 'binance', 'coinbase', 'ftx','coinbase'], 'Projects': ['Bitcoin', 'Bitcoin', 'Ethereum', 'Ethereum','Doge'],'Price': [10,5,10,2,10]}

df = pd.DataFrame(data)

Output : 
   Exchange  Projects  Price
0  coinbase   Bitcoin     10
1   binance   Bitcoin      5
2  coinbase  Ethereum     10
3       ftx  Ethereum      2
4  coinbase      Doge     10

Here is what I tried

df2 = df.groupby(by=["Projects"]).count()


df2['Price'] = df['Price']

df2['Exchange'] = df['Exchange']

df2

Output:

          Exchange  Price
Projects        
Bitcoin     NaN      NaN
Doge        NaN      NaN
Ethereum    NaN      NaN

What I wish to have:

            Exchange          Price
Projects        
Bitcoin     coinbase,binance  10
Doge        coinbase,ftx      2
Ethereum    ftx               5

CodePudding user response:

Use groupby_agg:

>>> df.groupby('Projects').agg({'Exchange': ','.join, 'Price': 'last'})

                  Exchange  Price
Projects                         
Bitcoin   coinbase,binance      5
Doge              coinbase     10
Ethereum      coinbase,ftx      2

You can replace 'last' by another function like 'max', 'mean', 'min' or a custom function.

CodePudding user response:

In your case

out = df.groupby('Projects').agg({'Exchange': ','.join,'Price':'last'})
Out[35]: 
                  Exchange  Price
Projects                         
Bitcoin   coinbase,binance      5
Doge              coinbase     10
Ethereum      coinbase,ftx      2
  • Related