df1:
0 1 2 3 4 5 6 7 8 9
0 1 Биир биир NUM num NumType=Card _ _ _ _
1 2 паартаҕа паарта NOUN n Case=Dat|Number=Sing _ _ _ _
2 3 киһи киһи NOUN n Case=Nom|Number=Sing _ _ _ _
3 4 олорор олор VERB v Person=3|Tense=Pres _ _ _ _
4 5 . . PUNCT punct _ _ _ _ _
df2:
0 1 2 3 4 5 6 7 8 9
0 1 Биир _ _ _ _ _ _ _ _
1 2 уол _ _ _ _ _ _ _ _
2 3 турар _ _ _ _ _ _ _ _
3 4 уонна _ _ _ _ _ _ _ _
4 5 ааҕар _ _ _ _ _ _ _ _
5 6 . _ _ _ _ _ _ _ _
How do I replace specific columns if a value from the second df is in the first one?
df2[1].isin(df1[1])
0 True
1 False
2 False
3 False
4 False
5 True
For every row where this is True, replace columns 2, 3, 4, 5 of df2 with the values from the matching row of df1.
The output should be this:
0 1 2 3 4 5 6 7 8 9
0 1 Биир биир NUM num NumType=Card _ _ _ _
1 2 уол _ _ _ _ _ _ _ _
2 3 турар _ _ _ _ _ _ _ _
3 4 уонна _ _ _ _ _ _ _ _
4 5 ааҕар _ _ _ _ _ _ _ _
5 6 . . PUNCT punct _ _ _ _ _
I tried using where, but it raises an error because the two DataFrames have different lengths.
df2[[2, 3, 4, 5]].where(df2[1].isin(df1[1]), df1[[2, 3, 4, 5]].values)
Is there any other way to replace multiple columns by a specific condition?
CodePudding user response:
One way is to (1) concatenate, (2) drop duplicates, (3) filter, and (4) sort:
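import pandas as pd
# Stack df2 over df1; keep='last' lets the df1 row win for each repeated value in column 1.
df = pd.concat([df2, df1]).drop_duplicates(subset=[1], keep='last')
df = df[df[1].isin(df2[1])].sort_values(0)  # keep only values present in df2, ordered by column 0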
df:
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Биир | биир | NUM | num | NumType=Card | _ | _ | _ | _ |
| 1 | 2 | уол | _ | _ | _ | _ | _ | _ | _ | _ |
| 2 | 3 | турар | _ | _ | _ | _ | _ | _ | _ | _ |
| 3 | 4 | уонна | _ | _ | _ | _ | _ | _ | _ | _ |
| 4 | 5 | ааҕар | _ | _ | _ | _ | _ | _ | _ | _ |
| 4 | 5 | . | . | PUNCT | punct | _ | _ | _ | _ | _ |
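Note that the concat approach takes matched rows whole from df1, which is why the last row shows 4 / 5 rather than the 5 / 6 in the expected output. If df2's index and column 0 need to stay untouched, a boolean mask with .loc may be a better fit. The following is only a sketch under some assumptions: the two frames are rebuilt by hand from the question with integer column labels, values in column 1 of df1 are treated as unique (the first occurrence is kept otherwise), and the names target, lookup, mask and result are made up for illustration.

```python
import pandas as pd

cols = list(range(10))

# Frames rebuilt by hand from the question (assumption: integer column labels).
df1 = pd.DataFrame(
    [
        [1, "Биир", "биир", "NUM", "num", "NumType=Card", "_", "_", "_", "_"],
        [2, "паартаҕа", "паарта", "NOUN", "n", "Case=Dat|Number=Sing", "_", "_", "_", "_"],
        [3, "киһи", "киһи", "NOUN", "n", "Case=Nom|Number=Sing", "_", "_", "_", "_"],
        [4, "олорор", "олор", "VERB", "v", "Person=3|Tense=Pres", "_", "_", "_", "_"],
        [5, ".", ".", "PUNCT", "punct", "_", "_", "_", "_", "_"],
    ],
    columns=cols,
)
forms = ["Биир", "уол", "турар", "уонна", "ааҕар", "."]
df2 = pd.DataFrame(
    [[i + 1, form] + ["_"] * 8 for i, form in enumerate(forms)],
    columns=cols,
)

target = [2, 3, 4, 5]  # columns to overwrite

# One row per distinct value of column 1, indexed by that value.
# Assumption: if df1 repeats a value, the first occurrence wins.
lookup = df1.drop_duplicates(subset=[1]).set_index(1)[target]

# True for df2 rows whose column 1 value also occurs in df1.
mask = df2[1].isin(lookup.index)

result = df2.copy()
keys = result.loc[mask, 1].tolist()
# Assign a plain array so values are written by position, not aligned by label.
result.loc[mask, target] = lookup.loc[keys].to_numpy()

print(result)
```

Because only the masked cells are written, the unmatched rows, the index, and column 0 of df2 are left as they were. Passing a NumPy array on the right-hand side sidesteps the length mismatch from the where attempt, since the array has exactly one row per True in the mask.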