Hello so i have a dataframe that looks like this:
df= {'country_name':['Albania', 'Algeria', 'Andorra', 'Angola'],'commodity_code':['55','55','55','55'],'year':[2000,2000,2000,2000],'trade_value':[10000,12000,'NaN',105]}
Essentially, this is a long dataframe in which I have many countries from 2000 to 2020 and the trade value for commodity "55" and commodity "73". What I need is to eliminate those countries that never exported commodity 55 and commodity 73. I need to eliminate those countries that for every year for each commodity the column trade value equals 0, not NaN, thus they never exported the commodity.
Thanks.
CodePudding user response:
You can use isin()
and negate it.
import pandas as pd
data = {
'country_name': ['Albania', 'Algeria', 'Andorra', 'Angola'],
'commodity_code': ['55', '55', '55', '55'],
'year': [2000, 2000, 2000, 2000],
'trade_value': [10000, 12000, 'NaN', 105],
}
df = pd.DataFrame(data)
df = df[~df['commodity_code'].isin(['55', '73'])]
Output:
country_name commodity_code year trade_value
CodePudding user response:
In [1]: import pandas as pd
...:
...: data = {
...: 'country_name': ['Albania', 'Algeria', 'Andorra', 'Angola'],
...: 'commodity_code': ['55', '55', '55', '55'],
...: 'year': [2000, 2000, 2000, 2000],
...: 'trade_value': [10000, 12000, 'NaN', 0],
...: }
...: df = pd.DataFrame(data)
In [2]: cond = df["commodity_code"].isin(["55", "73"]) & df["trade_value"].isin([0, "NaN"])
In [3]: df.drop(df[cond].index)
Out[3]:
country_name commodity_code year trade_value
0 Albania 55 2000 10000
1 Algeria 55 2000 12000
In [4]: