Using contain or isin for comparing 2 values in a data frame

Time:09-27

Given a data frame (df), I would like to determine whether the Last Name in the current row contains the Last Name value from the previous row. I tried a for loop with an if statement like this:

if df['Last Name'][index].str.contains(df['Last Name'][index-1]):
    print('yes')

The code above is incorrect. What is the best way to find out whether a value at one index of a data frame contains the value at a different index? I am not checking whether the value appears anywhere in the column; I only want to know whether it appears in the previous row of that column.

Edit: I am not checking whether the two values are equal; I want to find out whether the current row's value contains the previous row's value.

CodePudding user response:

You can use pandas' shift to move Last Name down by one row and check whether the values are equal:

df['Last Name shifted'] = df['Last Name'].shift(periods=1)
equals_df = (df['Last Name'] == df['Last Name shifted'])

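Now equals_df is a boolean Series of True / False values, and you can do whatever you want with it (find the indices that are True, count how many are True, etc.). Note that the first row has no previous row, so its shifted value is NaN and the comparison there is False.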
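
As a rough sketch of what working with that boolean Series could look like: the sample names below are made up for illustration, and matching_index and match_count are hypothetical variable names, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"Last Name": ["Smith", "Smith", "Jones", "Jones", "Jones"]})

# Shift down by one row so each row sees the previous row's value.
previous = df["Last Name"].shift(periods=1)

# Elementwise equality; the first row compares against NaN and is False.
equals_prev = df["Last Name"] == previous

# Index labels of the rows that equal their previous row.
matching_index = equals_prev[equals_prev].index.tolist()

# Number of True values (True counts as 1 when summed).
match_count = int(equals_prev.sum())

print(matching_index, match_count)
```

This only tests equality, as the answer describes; it does not test whether one name is a substring of the other.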

CodePudding user response:

Solved: if df['Last Name'][index] in df['Last Name'][index-1]:
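
For reference, the same substring test can probably be done for every row at once, without an explicit index loop. This is only a sketch under assumptions: the sample names are invented, contains_prev is a hypothetical name, and the operands are ordered to match the wording of the question (the current value contains the previous one). With Python's in operator, `a in b` is True when a is a substring of b, so swapping the operands checks the opposite direction.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"Last Name": ["Smith", "Smithson", "Jones", "Jon", "Jonsson"]})

# Previous row's value for each row; NaN for the first row.
previous = df["Last Name"].shift(1)

# True where the current name contains the previous row's name.
# isinstance() skips the NaN produced by shift() for the first row.
contains_prev = pd.Series(
    [
        isinstance(prev, str) and prev in cur
        for cur, prev in zip(df["Last Name"], previous)
    ],
    index=df.index,
)

print(contains_prev.tolist())
```

Plain `in` is used here rather than Series.str.contains because str.contains takes a single pattern for the whole column, not a different value per row.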
