I am trying to find a way to represent conditions for np.where()
other than from within the code. In my example below,
import pandas as pd
import numpy as np
file='insert path'
df = pd.read_csv(file)
df.loc[:, ['col_a','col_b']] = df.loc[:, ['col_a','col_b']].astype(str)
dfseg=df['col_a'] '-' df['col_b'] '-' df['col_c']
df['col_d'] = np.where((df['col_a']=='101')|(df['col_a']=='337')|(df['col_a']=='524'),dfseg,df['col_a'])
df
the conditions are that
a. df['col_a']=='101' or
b. df['col_a']=='337' or
c. df['col_a']=='524'
Is there any way these conditions could be represented external to the code? Or through another python method/function? Can tkinter
extract this portion of the code and edit these conditions? Hoping to find a way these conditions could be reviewed and managed, etc. externally rather than within the script.
I have tried np.isin()
and np.in1d
but received False results:
print(np.in1d(df1['col_a'],df2['col_a']))
print(np.isin(df1['col_a'],df2['col_a']))
Output:
[False False False False False False]
[False False False False False False]
From the following:
df2:
col_a
0 101
1 524
2 337
df1:
col_a col_b col_c
0 101 104 AAB
1 337 103 CAD
2 524 204 DER
3 129 404 EEH
4 842 108 HHR
5 337 108 HHE
df2[col_a].isin df1[col_a]
0 False
1 False
2 False
3 False
4 False
5 False
Name: col_a, dtype: bool
df1[col_a].isin df2[col_a]
0 True
1 True
2 True
Name: col_a, dtype: bool
Converting df to numpy with
df1numpy = df1.to_numpy()
Similarly yielded all "False" matches with np.isin()
. Thanks for your thoughts.
CodePudding user response:
Turns out, after converting df to numpy, the array needs to be transposed as well:
df1numpy1stcol = np.transpose(df1numpy)[0]
(df2 does not need to be converted to numpy, updated below).
Then, change datatype from object:
df1numpy1stcol.astype(np.int32)
Finally, the np.where()
could be used with np.isin()
:
df1['col_d'] = np.where(np.isin(df1numpy1stcol.astype(np.int32),df2),
dfseg,df1['col_a'])
thus, df1 [output] as:
col_a | col_b | col_c | col_d | |
---|---|---|---|---|
0 | 101 | 104 | AAB | 101-104-AAB |
1 | 337 | 103 | CAD | 337-103-CAD |
2 | 524 | 204 | DER | 524-204-DER |
3 | 129 | 404 | EEH | 129 |
4 | 842 | 108 | HHR | 842 |
5 | 337 | 108 | HHE | 337-108-HHE |
The exceptions list is now represented as an editable list named "df2".