I'm trying to filter a DataFrame on two columns: in each column, I want to drop the values that are less than 1 or NaN.
df = pd.DataFrame({'col_1':[0,4,6,0,0,7,9,3,1],'col_2':[0,0,0,2,1,14,3,4,5]})
This is what I have tried, but I'm not getting the result I want:
df.sort_values(by = ['col_1', 'col_2'], ascending = False)
cond = df[(df['col_1']>1.0) & (df['col_2']>1.0)].index
df.drop(cond, inplace= True)
Not sure if I should split the columns first, filter each one, and then join the table back together.
I want the output to look like this:
df = pd.DataFrame({'col_1':[4,6,7,9,3,1],'col_2':[2,1,14,3,4,5]})
CodePudding user response:
Use concat with filtered columns and a default index from Series.reset_index:
df = pd.concat([df.loc[df['col_1'] >= 1.0, 'col_1'].dropna().reset_index(drop=True),
                df.loc[df['col_2'] >= 1.0, 'col_2'].dropna().reset_index(drop=True)],
               axis=1)
print(df)
col_1 col_2
0 4 2
1 6 1
2 7 14
3 9 3
4 3 4
5 1 5
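The same per-column filtering can be generalized to any number of columns with a list comprehension; a minimal sketch using the column names and the >= 1 threshold from the question:

```python
import pandas as pd

df = pd.DataFrame({'col_1': [0, 4, 6, 0, 0, 7, 9, 3, 1],
                   'col_2': [0, 0, 0, 2, 1, 14, 3, 4, 5]})

# Filter each column independently (keep values >= 1, drop NaN),
# then realign everything on a fresh 0..n index before concatenating.
out = pd.concat(
    [df.loc[df[c] >= 1.0, c].dropna().reset_index(drop=True) for c in df.columns],
    axis=1)
print(out)
```

This produces the same result as the two-column version above, but works unchanged if more columns are added.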
CodePudding user response:
You will need to split the two columns first, then drop the NaN values and the values less than 1 from each, and then merge the columns back together. You can do it using the code below.
df = pd.DataFrame({'col_1':[0,4,6,0,0,7,9,3,1,np.nan],'col_2':[0,0,0,2,1,14,3,4,5, np.nan]})
df1 = df[['col_1']]
df2 = df[['col_2']]
df11 = df1.loc[(df1.col_1 >= 1) & df1.col_1.notna()].reset_index(drop=True)
df21 = df2.loc[(df2.col_2 >= 1) & df2.col_2.notna()].reset_index(drop=True)
df_fin = pd.concat([df11, df21], axis=1)
df_fin.head(6)
This is how the output will look:
col_1 | col_2
---|---
4.0 | 2.0
6.0 | 1.0
7.0 | 14.0
9.0 | 3.0
3.0 | 4.0
1.0 | 5.0
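One caveat with both answers: concat aligns on the reset index, so if the two filtered columns end up with different lengths, the shorter one is padded with NaN at the bottom. A minimal illustration (toy data, not from the question):

```python
import pandas as pd

# Two filtered columns of unequal length.
a = pd.Series([1, 2, 3], name='col_1')
b = pd.Series([10, 20], name='col_2')

# concat keeps all three index positions; col_2 gets NaN in the last row.
out = pd.concat([a, b], axis=1)
print(out)
```

Here the question's two columns happen to keep the same number of values, so no padding appears, but it is worth knowing when the filters differ.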