I want to create a new column that, for each row, copies the value from column "Value" of the immediately preceding row with the same "Name". When a person has no previous row, I want to leave an empty string.
I tried to use this code, but it doesn't work:
previous_value = []
col_list = df['Name'].values.tolist()
for idx in df.index:
    last_name = df['Name'].loc[idx]
    last_value = df['Value'].loc[idx]
    for i in range(len(col_list) - 1):
        actual_name = col_list[i + 1]
        if last_name == actual_name:
            previous_value.append(last_value)
        else:
            previous_followers.append("")
My idea was to later turn the previous_value list into a data frame and then add it to the original data frame.
This is how it should look:
    Name     Value  Previous_value
1   Andrew   12
2   Marco    10
3   Philips  9
4   Andrew   8      12
5   Oscar    7
6   Peter    15
7   Maria    25
8   Marco    3      10
9   Andrew   7      8
10  Oscar    19     7
11  Oscar    21     19
12  Maria    2      25
Thank you
CodePudding user response:
For loops often don't mix well with pandas. In this case, you want to group by name and then shift the values down by one to create the previous value column. This should do the trick:
>>> df['previous_value'] = df.groupby('Name')['Value'].shift()
>>> df
       Name  Value  previous_value
0    Andrew     12             NaN
1     Marco     10             NaN
2   Philips      9             NaN
3    Andrew      8            12.0
4     Oscar      7             NaN
5     Peter     15             NaN
6     Maria     25             NaN
7     Marco      3            10.0
8    Andrew      7             8.0
9     Oscar     19             7.0
10    Oscar     21            19.0
11    Maria      2            25.0
You can then use fillna('') on the new column to replace the NaNs with an empty string if desired. Note that the column then mixes strings and numbers, so its dtype becomes object.
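As a minimal sketch of both steps together, the snippet below rebuilds the question's data, shifts the values within each name, and then fills the gaps. The frame is typed in by hand from the table in the question, so treat it as an assumption, and the cast to object before fillna is an added precaution rather than part of the answer above.

```python
import pandas as pd

# Data copied by hand from the question's table (index 1..12 as shown there).
df = pd.DataFrame(
    {
        "Name": ["Andrew", "Marco", "Philips", "Andrew", "Oscar", "Peter",
                 "Maria", "Marco", "Andrew", "Oscar", "Oscar", "Maria"],
        "Value": [12, 10, 9, 8, 7, 15, 25, 3, 7, 19, 21, 2],
    },
    index=range(1, 13),
)

# Previous value within each Name group; the first row of each group is NaN.
prev = df.groupby("Name")["Value"].shift()

# Cast to object first so strings and numbers can share the column,
# then replace the NaNs with empty strings.
df["Previous_value"] = prev.astype(object).fillna("")

print(df)
```

The filled column is fine for display or export, but it is no longer numeric, so it may be better to keep the NaNs if you still need to do arithmetic on it.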
CodePudding user response:
This question was answered previously here. You can use groupby and shift to achieve this (although by default you will get NaNs for the first entry, not an empty string).
import pandas as pd
df = pd.DataFrame({'Name': [1, 2, 3, 1, 2, 3, 1, 2, 3], 'Value': [0, 1, 2, 3, 4, 5, 6, 7, 8]})
df['Previous_Value'] = df.groupby('Name')['Value'].shift()
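Because shift introduces NaNs, the new column comes back as float64. If whole numbers matter, one possible variation (not taken from either answer) is to cast the result to pandas' nullable integer dtype "Int64", which keeps integers and shows the missing entries as <NA>. A sketch on the same toy frame:

```python
import pandas as pd

df = pd.DataFrame({'Name': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'Value': [0, 1, 2, 3, 4, 5, 6, 7, 8]})

# shift() yields float64 because of the NaNs; "Int64" (capital I) is the
# nullable integer dtype, so missing entries become <NA> instead.
df['Previous_Value'] = df.groupby('Name')['Value'].shift().astype('Int64')

print(df)
```

This keeps the column numeric, unlike filling with empty strings, at the cost of showing <NA> rather than a blank.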