I have a problem. I would like to filter the rows of my dataframe and, in the same step, output only certain columns. I can't find anything on this (maybe I'm missing the right search terms). Is there a way in pandas to filter rows and select specific columns in a single line?
import pandas as pd
d = {'id': [1, 2, 3, 4, 5, 1],
     'name': ['Max Power', 'Jessica', 'Xi', 'Jackson', 'Frank', 'Miller'],
     'secondName': ['Full', 'Miller', 'Hu', 'Johnny', 'High', 'Joachim'],
     }
df = pd.DataFrame(data=d)
display(df)
df_new = df[df['id'] == 1]
display(df_new[['id', 'name']])
# This attempt fails:
# df_new = df[['id', 'name'], df['id'] == 1]  # TypeError: '(... Name: id, dtype: bool)' is an invalid key
[OUT]
   id       name secondName
0   1  Max Power       Full
1   2    Jessica     Miller
2   3         Xi         Hu
3   4    Jackson     Johnny
4   5      Frank       High
5   1     Miller    Joachim

   id       name
0   1  Max Power
5   1     Miller
What I want:
df[df['id'] == 1, ['id','name']]
   id       name
0   1  Max Power
5   1     Miller
CodePudding user response:
Use DataFrame.loc. Pass the boolean mask as the first indexer (rows) and the list of column names as the second (columns):
print(df.loc[df['id'] == 1, ['id','name']])
   id       name
0   1  Max Power
5   1     Miller
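For completeness, here is a minimal self-contained sketch of the same idea, rebuilt from the data in the question. The variable names result and result_chain are only illustrative. It also shows a callable passed to .loc, which can be handy when the dataframe comes out of a longer method chain and has no name of its own; treat that variant as an optional alternative, not a requirement.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 1],
    'name': ['Max Power', 'Jessica', 'Xi', 'Jackson', 'Frank', 'Miller'],
    'secondName': ['Full', 'Miller', 'Hu', 'Johnny', 'High', 'Joachim'],
})

# Row selector (boolean mask) first, column list second, in one .loc call.
result = df.loc[df['id'] == 1, ['id', 'name']]

# Same selection with a callable as the row selector; .loc calls it
# with the dataframe, so no intermediate variable is needed in a chain.
result_chain = df.loc[lambda d: d['id'] == 1, ['id', 'name']]

print(result)
```

Plain df[...] accepts only one indexer, which is why df[['id', 'name'], df['id'] == 1] is rejected as an invalid key, while .loc accepts a row indexer and a column indexer separated by a comma.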