I have a dataframe like this:
| A | B | C |
|-------|---|---|
| ['1'] | 1 | 1 |
|['1,2']| 2 | |
| ['2'] | 3 | 0 |
|['1,3']| 2 | |
If the value of B appears among the comma-separated numbers inside the quotes in A, then C should be 1. If it is not present in A, C should be 0. The expected output is:
| A | B | C |
|-------|---|---|
| ['1'] | 1 | 1 |
|['1,2']| 2 | 1 |
| ['2'] | 3 | 0 |
|['1,3']| 2 | 0 |
I want to apply this to a dataframe with many rows. How can I write this in Python to get this kind of dataframe?
CodePudding user response:
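import numpy as np
df['C'] = np.where([str(b) in a.strip("[]'").split(',') for a, b in zip(df.A, df.B)], 1, 0)
Basically, you need to convert column B to string, since column A holds strings, and then look for each B value among the numbers in the A value of the same row.
The result will be as you described.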
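As a rough illustration of why the check has to be done row by row, here is a minimal sketch. It assumes column A holds strings such as "['1,2']"; the names `whole_column` and `row_wise` are made up for this example.

```python
import pandas as pd

# Sample data from the question, assuming A holds strings such as "['1,2']".
df = pd.DataFrame({"A": ["['1']", "['1,2']", "['2']", "['1,3']"],
                   "B": [1, 2, 3, 2]})

# Series.isin compares each B against the complete values found in A
# ("['1']", "['1,2']", ...), so the bare strings "1", "2", "3" never match.
whole_column = df["B"].astype(str).isin(df["A"]).astype(int).tolist()

# Row-wise test: split each A string into its numbers, then look for that row's B.
row_wise = [int(str(b) in a.strip("[]'").split(","))
            for a, b in zip(df["A"], df["B"])]

print(whole_column)
print(row_wise)
```

Only the row-wise version looks inside each A value, which is what the expected output requires.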
CodePudding user response:
If the values in A are strings, use:
print (df.A.tolist())
["['1']", "['1,2']", "['2']", "['1,3']"]
df['C'] = [int(str(b) in a.strip("[]'").split(',')) for a, b in zip(df.A, df.B)]
print (df)
A B C
0 ['1'] 1 1
1 ['1,2'] 2 1
2 ['2'] 3 0
3 ['1,3'] 2 0
Or, if the values are one-element lists, use:
print (df.A.tolist())
[['1'], ['1,2'], ['2'], ['1,3']]
df['C'] = [int(str(b) in a[0].split(',')) for a, b in zip(df.A, df.B)]
print (df)
A B C
0 [1] 1 1
1 [1,2] 2 1
2 [2] 3 0
3 [1,3] 2 0
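If it is unclear which of the two cases applies, both could be folded into one helper. This is only a sketch; `b_in_a` is a hypothetical name, and it assumes A is either a string like "['1,2']" or a list like ['1,2'].

```python
import pandas as pd

def b_in_a(a, b):
    """Return 1 if b is one of the comma-separated numbers in a, else 0."""
    # Lists are joined into one comma-separated string; strings are used as they are.
    if isinstance(a, (list, tuple)):
        text = ",".join(map(str, a))
    else:
        text = str(a)
    # Drop brackets, quotes and spaces left around each number.
    tokens = [t.strip(" []'\"") for t in text.split(",")]
    return int(str(b) in tokens)

as_strings = pd.DataFrame({"A": ["['1']", "['1,2']", "['2']", "['1,3']"],
                           "B": [1, 2, 3, 2]})
as_lists = pd.DataFrame({"A": [['1'], ['1,2'], ['2'], ['1,3']],
                         "B": [1, 2, 3, 2]})

for frame in (as_strings, as_lists):
    frame["C"] = [b_in_a(a, b) for a, b in zip(frame["A"], frame["B"])]

print(as_strings)
print(as_lists)
```

Both frames should end up with the same C column as in the expected output.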
CodePudding user response:
My code:
df = pd.read_clipboard()
df
'''
A B
0 ['1'] 1
1 ['1,2'] 2
2 ['2'] 3
3 ['1,3'] 2
'''
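import ast

(
    df.assign(A=df.A.str.replace("'", '').map(ast.literal_eval))  # "['1,2']" -> [1, 2]; safer than eval
      .assign(C=lambda d: d.apply(lambda s: s.B in s.A, axis=1))  # row-wise membership test
      .assign(C=lambda d: d.C.astype(int))                        # True/False -> 1/0
)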
'''
A B C
0 [1] 1 1
1 [1, 2] 2 1
2 [2] 3 0
3 [1, 3] 2 0
'''
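For larger frames, the same check could also be written without `apply`, using pandas string methods and `explode`. This is a sketch under the assumption that A holds strings like "['1,2']"; the intermediate names `tokens`, `b_repeated` and `match` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["['1']", "['1,2']", "['2']", "['1,3']"],
                   "B": [1, 2, 3, 2]})

# One row per number in A; the original row label is repeated in the index.
tokens = df["A"].str.strip("[]'").str.split(",").explode()

# Repeat each B (as a string) so it lines up with the exploded tokens.
b_repeated = df["B"].astype(str).loc[tokens.index]

# Compare position by position, then ask whether any token of a row matched.
match = pd.Series(tokens.to_numpy() == b_repeated.to_numpy(), index=tokens.index)
df["C"] = match.groupby(level=0).any().astype(int)

print(df)
```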