I have a dataframe and a list:
import pandas as pd
df = pd.DataFrame()
df['Orig'] = ['CA', 'OH', 'WA','CA','CA','IN','FL']
df['Dest'] = ['CA', 'CA', 'OH','WA','MI','WA','MA']
lst = ['WA','CA','OH']
I would like to do the following:
If the values of both Orig and Dest are in lst, then set the column category to 'T', else 'F'. How can I accomplish that? Thank you.
Orig Dest
0 CA CA
1 OH CA
2 WA OH
3 CA WA
4 CA MI
5 IN WA
6 FL MA
Desired output:
Orig Dest category
0 CA CA T
1 OH CA T
2 WA OH T
3 CA WA T
4 CA MI F
5 IN WA F
6 FL MA F
CodePudding user response:
You can use isin and all to generate a boolean Series for numpy.where:
import numpy as np
df['category'] = np.where(df.isin(lst).all(axis=1), 'T', 'F')
Or, if the input has more columns, restrict the check to the chosen ones by selecting them:
df['category'] = np.where(df[['Orig', 'Dest']].isin(lst).all(axis=1), 'T', 'F')
Alternatively, you can build the same mask with df['Orig'].isin(lst) & df['Dest'].isin(lst).
Output:
Orig Dest category
0 CA CA T
1 OH CA T
2 WA OH T
3 CA WA T
4 CA MI F
5 IN WA F
6 FL MA F
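For reference, here is a minimal self-contained sketch that puts the pieces above together, assuming the column names Orig and Dest from the question's setup code. The variable names mask_all and mask_and are only illustrative. It builds the boolean mask both ways and checks that they agree before filling the new column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Orig': ['CA', 'OH', 'WA', 'CA', 'CA', 'IN', 'FL'],
    'Dest': ['CA', 'CA', 'OH', 'WA', 'MI', 'WA', 'MA'],
})
lst = ['WA', 'CA', 'OH']

# Variant 1: test both columns at once, then require every column to match
mask_all = df[['Orig', 'Dest']].isin(lst).all(axis=1)

# Variant 2: test each column separately and combine with a logical AND
mask_and = df['Orig'].isin(lst) & df['Dest'].isin(lst)

# Both masks should be identical row for row
assert mask_all.equals(mask_and)

# Map True/False to 'T'/'F'
df['category'] = np.where(mask_and, 'T', 'F')
print(df)
```

Selecting the columns explicitly, rather than calling df.isin(lst) on the whole frame, keeps the result correct if other columns are added later, including the category column itself on a re-run.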