I have a simple dataframe:
import pandas as pd
import numpy as np
# Build the frame from a list of row dicts (DataFrame.append was removed in pandas 2.0)
rows = [
    {'name': 'name_a', 'last': 'last_a', 'test_num': 1, 'grade': 90},
    {'name': 'name_a', 'last': 'last_a', 'test_num': 2, 'grade': 100},
    {'name': 'name_a', 'last': 'last_a', 'test_num': 3, 'grade': 95},
    {'name': 'name_a', 'last': 'last_b', 'test_num': 1, 'grade': 50},
    {'name': 'name_a', 'last': 'last_b', 'test_num': 2, 'grade': 55},
    {'name': 'name_b', 'last': 'last_b', 'test_num': 1, 'grade': 90},
    {'name': 'name_b', 'last': 'last_b', 'test_num': 2, 'grade': 100},
]
df = pd.DataFrame(rows, columns=['name', 'last', 'test_num', 'grade'])
df.head(10)
output:
     name    last  test_num  grade
0  name_a  last_a         1     90
1  name_a  last_a         2    100
2  name_a  last_a         3     95
3  name_a  last_b         1     50
4  name_a  last_b         2     55
5  name_b  last_b         1     90
6  name_b  last_b         2    100
I want to create a new dataframe with the following values:
     name    last
0  name_a  last_a
1  name_a  last_b
2  name_b  last_b
I have tried to use groupby:
df2 = df.groupby(['name', 'last'])['name', 'last']
but the result is a pandas.core.groupby.generic.DataFrameGroupBy object.
How can I get the output I want as a pandas.core.frame.DataFrame?
CodePudding user response:
You can use nth(0), head(1), tail(1), first() or last() to get one row per group from a groupby object:
df2 = df.groupby(['name', 'last'], as_index=False)[['name', 'last']].nth(0)
print(df2)
     name    last
0  name_a  last_a
3  name_a  last_b
5  name_b  last_b
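Note that the index above is 0, 3, 5 rather than the 0, 1, 2 asked for in the question. A minimal sketch of one way to close that gap, using head(1) from the list above and then renumbering with reset_index(drop=True). The frame is rebuilt here so the snippet runs on its own, and the drop_duplicates variant at the end is not from this answer; it is shown only as a possible alternative that skips groupby:

```python
import pandas as pd

# Same data as in the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    'name': ['name_a'] * 5 + ['name_b'] * 2,
    'last': ['last_a'] * 3 + ['last_b'] * 4,
    'test_num': [1, 2, 3, 1, 2, 1, 2],
    'grade': [90, 100, 95, 50, 55, 90, 100],
})

# head(1) keeps the first row of every (name, last) group and
# preserves the original row labels (0, 3, 5 here).
first_rows = df.groupby(['name', 'last']).head(1)[['name', 'last']]

# Renumber the rows 0..n-1 to match the layout asked for in the question.
result = first_rows.reset_index(drop=True)
print(result)

# Alternative (an assumption, not part of the answer above):
# take the two columns and drop repeated pairs directly.
alt = df[['name', 'last']].drop_duplicates().reset_index(drop=True)
print(alt)
```

Both variables hold an ordinary pandas DataFrame, so they can be passed on wherever a DataFrame is expected.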
CodePudding user response:
You can try to concat the groups of your grouped dataframe to convert it back to a DataFrame:
df3 = pd.concat(dict(iter(df2)).values())
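A small sketch of what this approach yields, under the assumption that df2 is a groupby object over the question's data. Iterating a groupby gives (key, sub-frame) pairs, so concatenating the sub-frames produces a regular DataFrame, but one that still contains every original row. The final drop_duplicates step is an addition for illustration, not part of the answer above:

```python
import pandas as pd

# Same data as in the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame({
    'name': ['name_a'] * 5 + ['name_b'] * 2,
    'last': ['last_a'] * 3 + ['last_b'] * 4,
    'test_num': [1, 2, 3, 1, 2, 1, 2],
    'grade': [90, 100, 95, 50, 55, 90, 100],
})

grouped = df.groupby(['name', 'last'])

# Each iteration yields the group key and that group's rows as a DataFrame.
pieces = [group[['name', 'last']] for _, group in grouped]

# Concatenating the pieces gives back a DataFrame with all 7 rows.
df3 = pd.concat(pieces)
print(type(df3), len(df3))

# To end up with one row per (name, last) pair, duplicates still have to go.
unique_pairs = df3.drop_duplicates().reset_index(drop=True)
print(unique_pairs)
```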