Create a new column evaluating values in different rows

Time:07-31

Starting from a DataFrame imported from Excel that looks like this:

Code Time Rev
AAA 5 3
AAA 3 2
AAA 6 1
BBB 10 2
BBB 5 1

I want to add a new column that flags the last revision of each code:

Code Time Rev Last
AAA 5 3 OK
AAA 3 2 NOK
AAA 6 1 NOK
BBB 10 2 OK
BBB 5 1 NOK

The df is already sorted by 'Code' (ascending) and 'Rev' (descending):

df = df.sort_values(['Code', 'Rev'], ascending=[True, False])

My idea is to check the 'Code' column: if a row's Code equals the Code in the row above, the new column should be NOK; otherwise it should be OK.

Unfortunately, I don't know how to write this in Python.

CodePudding user response:

You can do:

# Create a column called 'Last' filled with 'NOK'
df['Last'] = 'NOK'

# Sorting is skipped because you say df is already sorted.

# Locate the first row in each group and change its value to 'OK'
df.loc[df.groupby('Code', as_index=False).nth(0).index, 'Last'] = 'OK'
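
This approach depends on the sort order, so here is a minimal self-contained sketch that includes the sort. The rows are the question's sample data, shuffled here as an assumption to show that the sort decides which row is marked; it is an illustration, not the only way to do it.

```python
import pandas as pd

# The question's sample rows, deliberately shuffled (assumption for illustration)
df = pd.DataFrame({
    'Code': ['BBB', 'AAA', 'AAA', 'BBB', 'AAA'],
    'Time': [5, 3, 5, 10, 6],
    'Rev': [1, 2, 3, 2, 1],
})

# Put the highest revision first within each code
df = df.sort_values(['Code', 'Rev'], ascending=[True, False])

# Default every row to 'NOK', then flag the first row of each group
df['Last'] = 'NOK'
first_rows = df.groupby('Code', as_index=False).nth(0).index
df.loc[first_rows, 'Last'] = 'OK'

print(df)
```

After the sort, the first row of each group is the one with the highest 'Rev', so that is the row that ends up with 'OK'.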

CodePudding user response:

You can use the pandas GroupBy method cumcount, which numbers the rows within each group starting from 0, and set the first row of every group to 'OK'.

import pandas as pd

dict_ = {
    'Code': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB'],
    'Time': [5, 3, 6, 10, 5],
    'Rev': [3, 2, 1, 2, 1],
}
df = pd.DataFrame(dict_)

# Default to 'NOK', then mark the row numbered 0 in each group
df['Last'] = 'NOK'
df.loc[df.groupby('Code').cumcount() == 0, 'Last'] = 'OK'

This gives us the expected output:

df
  Code  Time  Rev Last
0  AAA     5    3   OK
1  AAA     3    2  NOK
2  AAA     6    1  NOK
3  BBB    10    2   OK
4  BBB     5    1  NOK

Alternatively, you can take the head of each group and set its value to 'OK':

df.loc[df.groupby('Code').head(1).index, 'Last'] = 'OK'

This gives the same result:

df
  Code  Time  Rev Last
0  AAA     5    3   OK
1  AAA     3    2  NOK
2  AAA     6    1  NOK
3  BBB    10    2   OK
4  BBB     5    1  NOK
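
The idea from the question, comparing each row's 'Code' with the row above, can also be written directly. This is a minimal sketch that assumes the frame is already sorted as described; using shift and numpy.where is one possible choice here, not something taken from the answers above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Code': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB'],
    'Time': [5, 3, 6, 10, 5],
    'Rev': [3, 2, 1, 2, 1],
})

# shift() moves the column down by one row, so each row sees the Code above it.
# The first row has nothing above it (NaN), which compares as different.
starts_new_code = df['Code'] != df['Code'].shift()

# 'OK' where the Code changes, 'NOK' where it repeats the row above
df['Last'] = np.where(starts_new_code, 'OK', 'NOK')

print(df)
```

Unlike the groupby versions, this only looks at adjacent rows, so it gives the intended result only when rows with the same 'Code' are contiguous.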