Given a DataFrame (df), I would like to determine whether the current row's Last Name contains the previous row's Last Name value. I tried a for loop with an if statement like this:
if df['Last Name'][index].str.contains(df['Last Name'][index-1]):
    print('yes')
The code above is incorrect. What is the best way to find out whether a DataFrame value at a certain index contains the value at a different index? I am not asking whether the value appears anywhere in the column; I only want to know whether it matches the previous row of that column.
Edit: I am not checking whether the two values are equal. I want to find out whether the current row's value contains the previous row's value.
CodePudding user response:
You can use pandas' shift to move Last Name down by one row and check whether the values are equal:
df['Last Name shifted'] = df['Last Name'].shift(periods=1)
equals_df = (df['Last Name'] == df['Last Name shifted'])
Now equals_df is a boolean Series of True / False values. You can do whatever you want with it (find the indices where it is True, count how many values are True, etc.).
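Since the question asks about containment rather than equality, the same shift idea can probably be adapted to a substring test. Here is a minimal sketch, assuming every Last Name is a plain string; the sample data and the names prev and contains_prev are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; only the 'Last Name' column name comes from the question.
df = pd.DataFrame({"Last Name": ["Smith", "Smithson", "Jones", "Jones", "Lee"]})

# Previous row's value aligned with the current row; NaN for the first row.
prev = df["Last Name"].shift(periods=1)

# True where the current value contains the previous value as a substring.
# The isinstance checks skip the NaN produced by shift (and any missing names).
contains_prev = pd.Series(
    [
        isinstance(p, str) and isinstance(cur, str) and p in cur
        for cur, p in zip(df["Last Name"], prev)
    ],
    index=df.index,
)

print(contains_prev)
```

The first row has no previous row, so it comes out as False here; whether that is the right convention depends on what you do with the result.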
CodePudding user response:
Solved: if df['Last Name'][index] in df['Last Name'][index-1]:
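For reference, with Python strings `a in b` tests whether a is a substring of b, so a check that the current row contains the previous row's value puts the previous value on the left. A small runnable sketch of the loop form, assuming string values and made-up sample data (names and matches are illustrative):

```python
import pandas as pd

# Hypothetical sample data; only the 'Last Name' column name comes from the question.
df = pd.DataFrame({"Last Name": ["Smith", "Smithson", "Jones"]})
names = df["Last Name"]

matches = []
# Start at 1 so that index - 1 never wraps around to the last row.
for index in range(1, len(names)):
    # Is the previous row's value a substring of the current row's value?
    if names.iloc[index - 1] in names.iloc[index]:
        matches.append(index)

print(matches)
```

Using .iloc keeps the lookup positional, which matters if the DataFrame's index is not a simple 0..n-1 range.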