How to eliminate certain rows from a dataframe


Hello, I have a DataFrame built from the following data:

df = pd.DataFrame({'country_name': ['Albania', 'Algeria', 'Andorra', 'Angola'], 'commodity_code': ['55', '55', '55', '55'], 'year': [2000, 2000, 2000, 2000], 'trade_value': [10000, 12000, 'NaN', 105]})

Essentially, this is a long-format DataFrame with many countries from 2000 to 2020 and the trade value for commodity "55" and commodity "73". I need to eliminate the countries that never exported commodity 55 or commodity 73, i.e. those whose trade_value column equals 0 (not NaN) for every year of each commodity.

Thanks.

CodePudding user response:

You can build a mask with isin() and negate it with ~:

import pandas as pd
import numpy as np

data = {
    'country_name': ['Albania', 'Algeria', 'Andorra', 'Angola'],
    'commodity_code': ['55', '55', '55', '55'],
    'year': [2000, 2000, 2000, 2000],
    'trade_value': [10000, 12000, np.nan, 105],  # a real NaN, not the string 'NaN'
}
df = pd.DataFrame(data)
df = df[~df['commodity_code'].isin(['55', '73'])]

Output (every sample row has commodity code '55', so all rows are removed):

Empty DataFrame
Columns: [country_name, commodity_code, year, trade_value]
Index: []
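Note that this removes every row whose commodity code is 55 or 73, regardless of its trade value. To drop only the countries whose trade_value is 0 in every year, one option is a groupby/filter pass per country. A sketch with hypothetical sample data (the country "Liberia" and the years are invented for illustration):

```python
import pandas as pd

# Hypothetical long-format data: Liberia's trade_value is 0 in every
# year, so the whole country should be dropped.
df = pd.DataFrame({
    'country_name': ['Albania', 'Albania', 'Liberia', 'Liberia'],
    'commodity_code': ['55', '55', '55', '55'],
    'year': [2000, 2001, 2000, 2001],
    'trade_value': [10000, 12000, 0, 0],
})

# Keep a country only if at least one of its rows has a non-zero trade value.
result = df.groupby('country_name').filter(
    lambda g: g['trade_value'].ne(0).any()
)
```

With more than one commodity per country, grouping by `['country_name', 'commodity_code']` instead would drop each never-exported commodity separately.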

CodePudding user response:

In [1]: import pandas as pd
   ...:
   ...: data = {
   ...:     'country_name': ['Albania', 'Algeria', 'Andorra', 'Angola'],
   ...:     'commodity_code': ['55', '55', '55', '55'],
   ...:     'year': [2000, 2000, 2000, 2000],
   ...:     'trade_value': [10000, 12000, 'NaN', 0],
   ...:     }
   ...: df = pd.DataFrame(data)

In [2]: cond = df["commodity_code"].isin(["55", "73"]) & df["trade_value"].isin([0, "NaN"])

In [3]: df.drop(df[cond].index)
Out[3]:
  country_name commodity_code  year trade_value
0      Albania             55  2000       10000
1      Algeria             55  2000       12000
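If trade_value held real missing values (np.nan) rather than the string 'NaN', the condition above could use isna() instead of matching the string, and the drop can equivalently be written as boolean indexing. A sketch under that assumption:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'country_name': ['Albania', 'Algeria', 'Andorra', 'Angola'],
    'commodity_code': ['55', '55', '55', '55'],
    'year': [2000, 2000, 2000, 2000],
    'trade_value': [10000, 12000, np.nan, 0],
})

# Rows to drop: commodity 55/73 with a zero or missing trade value.
cond = df['commodity_code'].isin(['55', '73']) & (
    df['trade_value'].isna() | df['trade_value'].eq(0)
)
kept = df[~cond]  # same rows as df.drop(df[cond].index)
```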
