Home > database >  Drop columns based on a row value in Pandas
Drop columns based on a row value in Pandas

Time:11-03

I'm working on a dataframe in pandas and I would like to drop some columns based on the sum of the column values. If the total is 0, then I drop the column, otherwise I keep it. I've been trying to do it but I haven't found the way. Any tips?

CodePudding user response:

Simpliest is filter if sum is not equal 0, so it means are rmeoved columns with sum=0:

df.loc[:, df.sum().ne(0)]

If also non numeric columns use DataFrame.select_dtypes with add non numeric by DataFrame.reindex:

df = pd.DataFrame(np.random.randint(0,100,size=(3, 4)), columns=list('ABCD'))
df['E'] = 0
df['G'] = 'aa'


df = df.loc[:, df.select_dtypes(np.number).sum().ne(0).reindex(df.columns, fill_value=True)]
print (df)
    A   B   C   D   G
0  47  42  16   2  aa
1  81  72  19  55  aa
2  28  70  79  29  aa

CodePudding user response:

So what I have done is the following:

1st: sum all the columns with numerical value

df_pivot_2_sum = df_pivot_2.sum(numeric_only = True)

2nd: filter only the zeroes

df_pivot_2_sum = df_pivot_2_sum[df_pivot_2_sum == 0]

3rd: get the list of columns

columns_zero_value = list(df_pivot_2_sum.index.values)

4th: drop the columns from that list

df_pivot_2 = df_pivot_2.drop(columns_zero_value, axis=1)

Thanks all for your support!

  • Related