Indexing pandas dataframe with array as iloc input

I have a pandas dataframe (df) that I want to index so that it only displays columns whose column sum is not zero. I am using .to_numpy().nonzero() to create a tuple of non-zero indexes. I checked the pandas.DataFrame.iloc documentation and found that only arrays / lists of int are accepted for indexing, so I convert this tuple of non-zero indexes to a list:

import pandas as pd
#...
df = pd.read_table(f)
df_sum = df.sum(axis = 0)
df_no_0_tuple = df_sum.to_numpy().nonzero() #--> prints (array([ 7,  8,  9, 10, 11, 25, 26, 27, 28, 29, 31, 32, 36], dtype=int64),)
print(type(df_no_0_tuple)) #--> prints "<class 'tuple'>"
df_no_0 = list(df_no_0_tuple)
print(type(df_no_0)) #--> prints "<class 'list'>"
print(df_no_0) #--> prints [array([ 7,  8,  9, 10, 11, 25, 26, 27, 28, 29, 31, 32, 36], dtype=int64)]
df_final = df.iloc[:, df_no_0]
print(df_final)

If I use df_no_0 as the iloc input, I get the error: "ValueError: Buffer has wrong number of dimensions (expected 1, got 2)". My question is: does the df_no_0 list also include the "dtype=int64" part, which then causes the dimension error (expected 1, got 2)? If so, is there a way to remove the type information, or avoid creating it, when converting to a list?

If I use the tuple directly from to_numpy().nonzero(), I get the error: "pandas.core.indexing.IndexingError: Too many indexers". I think the error differs here because the tuple, unlike the list, now contains three indexers separated by commas. The question remains: how can I index the dataframe correctly using the output of to_numpy().nonzero(), or how can I transform that output into a valid input for .iloc indexing?

BTW: If I just enter the output of to_numpy().nonzero() in list format manually, the indexing works as expected. This will just be tedious for indexing multiple files without previously knowing their non-zero columns.

Any help is greatly appreciated!

Thank you in advance!

CodePudding user response:

Could something like this work for you?

Example of dataframe where some columns sum up to zero:

df = pd.DataFrame([[1, 3, 0, 2, 0],
                   [3, 4, 0, 4, 0],
                   [5, 1, 0, 3, 0]],
                   columns = ['a', 'b', 'c', 'd', 'e'])

To remove the columns that only add up to zero:

df.loc[:, (df.sum() != 0)]
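
If you would rather stay with .iloc and nonzero(): as far as I know, numpy's nonzero() returns a tuple holding one index array per dimension, so for a 1-D column sum the positions you want are in element [0] of that tuple. Calling list() on the tuple only wraps that same array in a list, which gives iloc something 2-D; the "dtype=int64" is just part of how the array is printed, not an extra entry. A minimal sketch using the example dataframe above (the names col_sums, nonzero_positions, by_position and by_mask are only for illustration):

```python
import pandas as pd

df = pd.DataFrame([[1, 3, 0, 2, 0],
                   [3, 4, 0, 4, 0],
                   [5, 1, 0, 3, 0]],
                  columns=['a', 'b', 'c', 'd', 'e'])

col_sums = df.sum(axis=0)

# nonzero() returns a tuple with one array per dimension;
# the column sums are 1-D, so the tuple has exactly one element.
nonzero_tuple = col_sums.to_numpy().nonzero()
nonzero_positions = nonzero_tuple[0]  # plain 1-D integer array

# Positional indexing with the 1-D array of column positions
by_position = df.iloc[:, nonzero_positions]

# Boolean-mask indexing from the answer above, for comparison
by_mask = df.loc[:, col_sums != 0]

print(by_position)
print(by_position.equals(by_mask))
```

Both selections should pick the same columns here; the boolean mask is shorter, while the [0] version keeps the integer positions around if you need them elsewhere.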