I am trying to remove ranges of columns in my pandas df. I would prefer to do it in one line but the only method I know is iloc, which doesn't seem to allow multiple references. When I wrote it in separate lines, the columns I don't want remain. Can someone help me with a better way of doing this? Thanks!

import pandas as pd
df = pd.DataFrame({'id': [100,200,300], 'user': ['Bob', 'Jane', 'Alice'], 'income': [50000, 60000, 70000], 'color':['red', 'green', 'blue'], 'state':['GA', 'PA', 'NY'], 'day':['M', 'W', 'Th'], 'color2':['red', 'green', 'blue'], 'state2':['GA', 'PA', 'NY'], 'id2': [100,200,300]})

df.drop(df.iloc[:, 0:2], inplace=True, axis=1)
df.drop(df.iloc[:, 4:5], inplace=True, axis=1)
df.drop(df.iloc[:, 7:9], inplace=True, axis=1)

I'd like the output from the code above to contain columns 'color' and 'color2'

CodePudding user response：

Try np.r_, which combines several slices into a single array of positions.

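Answer based on your question before you edited it:

 import pandas as pd
 import numpy as np

 # Build one array of column positions from all the ranges
 idx = np.r_[0:12, 63:70, 73:78, 92:108]
 # Look the positions up on the same frame that is being modified
 policies.drop(policies.columns[idx], axis=1, inplace=True)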

Answer based on the example you gave:

import pandas as pd
import numpy as np

# Positions 0-1, 4, and 7-8, all resolved against the original column order
idx = np.r_[0:2, 4:5, 7:9]
df.drop(df.columns[idx], axis=1, inplace=True)

PS: slices inside np.r_ exclude the stop value, so with 0:3 the column at position 3 will not be dropped.
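To make the exclusive stop concrete, here is a small sketch using the DataFrame from the question. The names positions, result, and remaining are only illustrative, and it uses the non-inplace form of drop so the original df stays untouched.

```python
import numpy as np
import pandas as pd

# Same columns as the example in the question, in the same order.
df = pd.DataFrame({
    'id': [100, 200, 300], 'user': ['Bob', 'Jane', 'Alice'],
    'income': [50000, 60000, 70000], 'color': ['red', 'green', 'blue'],
    'state': ['GA', 'PA', 'NY'], 'day': ['M', 'W', 'Th'],
    'color2': ['red', 'green', 'blue'], 'state2': ['GA', 'PA', 'NY'],
    'id2': [100, 200, 300],
})

# np.r_ concatenates the slices into one array; each stop value is excluded.
idx = np.r_[0:2, 4:5, 7:9]
positions = idx.tolist()  # [0, 1, 4, 7, 8]

# Every position is resolved against the original column order, then dropped once.
result = df.drop(columns=df.columns[idx])
remaining = list(result.columns)  # ['income', 'color', 'day', 'color2']
```

Because all positions are looked up before anything is removed, none of the ranges shift.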

Hope this helps.

CodePudding user response：

You could do:

df = df.drop(df.columns[[*range(0,2), *range(4,5), *range(7,9)]], axis=1)

Output:

   income  color day color2
0   50000    red   M    red
1   60000  green   W  green
2   70000   blue  Th   blue
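As for why the three separate drops in the question leave unwanted columns behind: each drop shifts the positions of the columns that remain, so the later ranges no longer point where you expect. A rough sketch of that effect, using an empty frame with the same column names (the step1, step2, step3 names are just for illustration):

```python
import pandas as pd

# Empty frame with the question's column names; the data is irrelevant here.
df = pd.DataFrame(columns=['id', 'user', 'income', 'color', 'state',
                           'day', 'color2', 'state2', 'id2'])

# Step 1: positions 0:2 are 'id' and 'user', as intended.
step1 = df.drop(columns=df.columns[0:2])

# Step 2: after the shift, position 4 is 'color2', not 'state'.
dropped_in_step2 = list(step1.columns[4:5])
step2 = step1.drop(columns=step1.columns[4:5])

# Step 3: only 6 columns are left, so 7:9 selects nothing.
dropped_in_step3 = list(step2.columns[7:9])
step3 = step2.drop(columns=step2.columns[7:9])

leftover = list(step3.columns)
```

Computing all positions against the original column order and dropping once, as in the one-liner above, avoids the shift.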