I want to add a column to my dataframe (with repetitive values) which contains the name of previous group (float and sortable). any idea?
Before:
Group | value |
---|---|
1.1 | value1 |
1.1 | value2 |
1.1 | value3 |
1.1 | value3 |
1.2 | value4 |
1.2 | value4 |
1.2 | value5 |
After:
Group | value | Previous Group |
---|---|---|
1.1 | value1 | Nan |
1.1 | value2 | value1 |
1.1 | value3 | value2 |
1.1 | value3 | value2 |
1.2 | value4 | Nan |
1.2 | value4 | Nan |
1.2 | value5 | Value4 |
CodePudding user response:
you can use groupby
with shift
which shifts the Group 1 place for each group:
df['Previous Group'] = df.groupby("value")['Group'].shift()
print(df)
Group value Previous Group
0 1.1 value1 NaN
1 1.1 value2 NaN
2 1.2 value1 1.1
3 1.2 value2 1.1
EDIT:
We can use a helper column with groupby ngroup and then use it as a mapping column and filter out rank = 1:
a = df.assign(helper=df.groupby(["Group","value"]).ngroup())
rnk = a.groupby("Group")['helper'].rank(method='dense')
df['Previous Group'] = (a['helper'].sub(1).map(dict(zip(a['helper'],a['value'])))
.mask(rnk.eq(1)))
print(df)
Group value Previous Group
0 1.1 value1 NaN
1 1.1 value2 value1
2 1.1 value3 value2
3 1.1 value3 value2
4 1.2 value4 NaN
5 1.2 value4 NaN
6 1.2 value5 value4