Home > Software engineering >  Change column B values for all subsequent occurences of column A values in pandas dataframe
Change column B values for all subsequent occurences of column A values in pandas dataframe

Time:10-14

Take this table with two colums:

Fruit First Occurence?
"Apple" "Yes"
"Orange" "Yes"
"Apple" "Yes"
"Apple" "Yes"
"Kiwi" "Yes"
"Banana" "Yes"
"Apple" "Yes"
"Orange" "Yes"
"Orange" "Yes"
"Banana" "Yes"

Made using this:

import pandas as pd

df = pd.DataFrame({
    'Fruit': ["Apple", "Orange", "Apple", "Apple", "Kiwi", "Banana", "Apple", "Orange", "Orange", "Banana"],
    'First Occurence?': ["Yes"]*10
})

How can I efficiently change values of the second column where the value in the first column has already appeared at least once? The desired result being this:

Fruit First Occurence?
"Apple" "Yes"
"Orange" "Yes"
"Apple" "No"
"Apple" "No"
"Kiwi" "Yes"
"Banana" "Yes"
"Apple" "No"
"Orange" "No"
"Orange" "No"
"Banana" "No"
  • It can't be a loop as this takes too long
  • The rows must end up in the same order they started

Thanks in advance!

CodePudding user response:

you can use duplicated to identify the duplicate values and then using mask, assign a NO

# duplicated Fruits will come up as False, and First_occurrence be made to No
df['First Occurence?']= df['First Occurence?'].mask(df.duplicated('Fruit'),'No')
    Fruit   First Occurence?
0   Apple   Yes
1   Orange  Yes
2   Apple   No
3   Apple   No
4   Kiwi    Yes
5   Banana  Yes
6   Apple   No
7   Orange  No
8   Orange  No
9   Banana  No

CodePudding user response:

Use duplicated and numpy.where:

df['First Occurence?'] = np.where(df['Fruit'].duplicated(), 'No', 'Yes')

More generic alternative with groupby.cumcount that enable to select any nth item (0-based):

df['First Occurence?'] = np.where(df.groupby('Fruit').cumcount().eq(0), 'Yes', 'No')

output:

    Fruit First Occurence?
0   Apple              Yes
1  Orange              Yes
2   Apple               No
3   Apple               No
4    Kiwi              Yes
5  Banana              Yes
6   Apple               No
7  Orange               No
8  Orange               No
9  Banana               No
  • Related