How do I create a column with a previous value (in column B) when the names (in column A) in current

Time:12-08

I want to create a new column that, for each row, copies the value in column "Value" from the immediately preceding row that has the same person in column "Name". I want to leave an empty string when there is no previous value for that person.

I tried to use this code, but it doesn't work:

previous_value= []
col_list = df['Name'].values.tolist()

for idx in df.index:
    last_name= df['Name'].loc[idx] 
    last_value= df['Value'].loc[idx]
    
    for i in range(len(col_list)-1):
        actual_name= col_list[i + 1]
        if last_name == actual_name:
            previous_value.append(last_value)
        else:
            previous_followers.append("")

My idea was to turn the previous_value list into a data frame later and then add it to the original data frame.

This is how it should look:


        Name    Value   Previous_value
1      Andrew   12         
2      Marco    10          
3      Philips   9 
4      Andrew    8           12
5      Oscar     7     
6      Peter    15    
7      Maria    25
8      Marco     3           10
9      Andrew    7           8
10     Oscar    19           7
11     Oscar    21          19
12     Maria     2          25

Thank you

CodePudding user response:

For loops often don't mix well with pandas. In this case, you want to group by name and then shift the values down by one to create the previous value column. This should do the trick:

>>> df['previous_value'] = df.groupby('Name')['Value'].shift()
>>> df
       Name  Value  previous_value
0    Andrew     12             NaN
1     Marco     10             NaN
2   Philips      9             NaN
3    Andrew      8            12.0
4     Oscar      7             NaN
5     Peter     15             NaN
6     Maria     25             NaN
7     Marco      3            10.0
8    Andrew      9             8.0
9     Oscar     19             7.0
10    Oscar     21            19.0
11    Maria      2            25.0

You can then use fillna('') on the new column to replace the NaNs with an empty string if desired.
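
Putting the two steps together, here is a minimal sketch that rebuilds the data from the question's table and fills the gaps with empty strings. The column name Previous_value is taken from the question's expected output, and the cast to object before fillna is my own choice (not something the answer above requires) to make the mixed number/string column explicit:

```python
import pandas as pd

# Data copied from the expected-output table in the question.
df = pd.DataFrame({
    "Name": ["Andrew", "Marco", "Philips", "Andrew", "Oscar", "Peter",
             "Maria", "Marco", "Andrew", "Oscar", "Oscar", "Maria"],
    "Value": [12, 10, 9, 8, 7, 15, 25, 3, 7, 19, 21, 2],
})

# Within each name, shift Value down by one row so that every row sees the
# previous value recorded for the same person (NaN on a first appearance).
previous = df.groupby("Name")["Value"].shift()

# Assumption: cast to object first so the column can hold both numbers and
# empty strings, then replace the NaNs with "".
df["Previous_value"] = previous.astype(object).fillna("")

print(df)
```

Note that the resulting column holds a mix of numbers and strings, so it is no longer suitable for arithmetic. If you still need to compute with it, keeping the NaNs may be the better option.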

CodePudding user response:

This question was answered previously here. You can use groupby and shift to achieve this (although by default you will get NaN for the first entry of each name, not an empty string).

import pandas as pd

df = pd.DataFrame({'Name':[1,2,3,1,2,3,1,2,3],'Value':[0,1,2,3,4,5,6,7,8]})
df['Previous_Value'] = df.groupby('Name')['Value'].shift()