I wish to have a final dataframe with a index of projects, a list of exchanges and a last column for price.
Here is an example:
data = {'Exchange': ['coinbase', 'binance', 'coinbase', 'ftx','coinbase'], 'Projects': ['Bitcoin', 'Bitcoin', 'Ethereum', 'Ethereum','Doge'],'Price': [10,5,10,2,10]}
df = pd.DataFrame(data)
Output :
Exchange Projects Price
0 coinbase Bitcoin 10
1 binance Bitcoin 5
2 coinbase Ethereum 10
3 ftx Ethereum 2
4 coinbase Doge 10
Here is what I tried
df2 = df.groupby(by=["Projects"]).count()
df2['Price'] = df['Price']
df2['Exchange'] = df['Exchange']
df2
Output:
Exchange Price
Projects
Bitcoin NaN NaN
Doge NaN NaN
Ethereum NaN NaN
What I wish to have:
Exchange Price
Projects
Bitcoin coinbase,binance 10
Doge coinbase,ftx 2
Ethereum ftx 5
CodePudding user response:
Use groupby_agg
:
>>> df.groupby('Projects').agg({'Exchange': ','.join, 'Price': 'last'})
Exchange Price
Projects
Bitcoin coinbase,binance 5
Doge coinbase 10
Ethereum coinbase,ftx 2
You can replace 'last'
by another function like 'max'
, 'mean'
, 'min'
or a custom function.
CodePudding user response:
In your case
out = df.groupby('Projects').agg({'Exchange': ','.join,'Price':'last'})
Out[35]:
Exchange Price
Projects
Bitcoin coinbase,binance 5
Doge coinbase 10
Ethereum coinbase,ftx 2