Given
df = pd.DataFrame({'group': [1, 1, 2, 1, 1], 'value':['a','b','c','d','e']})
I need to treat a and b as one group, c as a second group, and d and e as a third group. How do I get the first element of every group? The expected result is:
pd.DataFrame({'group': [1, 2, 1], 'value':['a','c','d']})
CodePudding user response:
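Compare each row's group with the previous row's group and keep only the rows where it changes, i.e. the first row of every consecutive run: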
df1 = df[df['group'].ne(df['group'].shift())]
Check this answer for more details
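To see why the mask works, here is a minimal sketch that spells out the intermediate steps. The names starts, run_id and 'run' are only illustrative, and the groupby variant is an alternative I'm adding, not part of the one-liner above:

```python
import pandas as pd

df = pd.DataFrame({'group': [1, 1, 2, 1, 1], 'value': ['a', 'b', 'c', 'd', 'e']})

# True wherever the group differs from the previous row. The first row is
# always True because shift() puts NaN there and 1 != NaN.
starts = df['group'].ne(df['group'].shift())

# The cumulative sum of the run starts labels each consecutive run: 1, 1, 2, 3, 3.
# It is renamed so it does not clash with the existing 'group' column.
run_id = starts.cumsum().rename('run')

# First row of every run, keeping the original index (0, 2, 3).
first_rows = df[starts]

# The same rows via groupby, useful if you also need other per-run aggregates.
first_by_run = df.groupby(run_id).first()

print(first_rows)
print(first_by_run)
```

The boolean mask keeps the original row labels, while the groupby version is indexed by the run id.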
CodePudding user response:
You haven't specified whether the group column determines which values belong to the same group. So I'm assuming it has no connection, and that you specify your groups in the groups list:
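import numpy as np

groups = [['a', 'b'], ['c'], ['d', 'e']]
# One boolean mask per group: True where the value belongs to that group
condlist = [df['value'].isin(group) for group in groups]
# Group i is labelled with the integer i
choicelist = list(range(len(groups)))
group_idx = np.select(condlist, choicelist)
df.groupby(group_idx).first()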
Result:
   group value
0      1     a
1      2     c
2      1     d
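One thing to watch with np.select: its default for rows that match no condition is 0, which is also the label of the first group. Here is a minimal sketch of guarding against that, assuming a hypothetical extra value 'f' (not in the question) that belongs to no group, and using -1 as an arbitrary sentinel:

```python
import numpy as np
import pandas as pd

# Same data as the question plus one hypothetical row ('f') that is in no group.
df = pd.DataFrame({'group': [1, 1, 2, 1, 1, 3],
                   'value': ['a', 'b', 'c', 'd', 'e', 'f']})
groups = [['a', 'b'], ['c'], ['d', 'e']]

condlist = [df['value'].isin(group).to_numpy() for group in groups]
choicelist = list(range(len(groups)))

# default=-1 marks rows that match no group; without it they would get 0
# and be merged into the first group.
group_idx = np.select(condlist, choicelist, default=-1)

# Drop the unmatched rows before grouping.
matched = group_idx != -1
result = df[matched].groupby(group_idx[matched]).first()
print(result)
```

If every value is guaranteed to be in some group, the default never comes into play and the shorter version is fine.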
CodePudding user response:
You can define your groups, map each value to the index of its group, and group by that mapping:
df = pd.DataFrame({'group': [1, 1, 2, 1, 1], 'value':['a','b','c','d','e']})
groups = [['a', 'b'], ['c'], ['d', 'e']]
mappings = {k: i for i, gr in enumerate(groups) for k in gr}
print(
    df.groupby(df['value'].map(mappings)).first()
)
       group value
value
0          1     a
1          2     c
2          1     d
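If you want exactly the flat frame from the question rather than one indexed by the group key, a sketch along these lines should work. The name 'group_id' is just an illustrative label I picked so the key does not share a name with the 'value' column:

```python
import pandas as pd

df = pd.DataFrame({'group': [1, 1, 2, 1, 1], 'value': ['a', 'b', 'c', 'd', 'e']})
groups = [['a', 'b'], ['c'], ['d', 'e']]

# Value -> index of the group it belongs to:
# {'a': 0, 'b': 0, 'c': 1, 'd': 2, 'e': 2}
mappings = {k: i for i, gr in enumerate(groups) for k in gr}

# Series of group indices aligned with df, renamed to avoid the 'value' clash.
keys = df['value'].map(mappings).rename('group_id')

# Take the first row per group, then discard the group key to get a plain 0..n-1 index.
out = df.groupby(keys).first().reset_index(drop=True)
print(out)
```

Note that map() returns NaN for any value missing from mappings, and groupby skips NaN keys by default, so unmapped rows are silently left out.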