How can I filter a dataframe by columns for the first time a string is not present?

Time:02-17

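I'm trying to subset a dataframe so that it keeps the first column and the "Unnamed" columns that follow it, and stops before the next column whose name doesn't contain "Unnamed".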

Here's an example:

import pandas as pd

data = {'What is your favorite fruit': ['Banana', 'nan', 'Banana', 'nan', 'nan'],
        'Unnamed:12': ['nan', 'Strawberry', 'nan', 'nan', 'nan'],
        'Unnamed:13': ['nan', 'nan', 'nan', 'Blueberry', 'Blueberry'],
        'What is your favorite vegetable?': ['Carrot', 'nan', 'nan', 'nan', 'Carrot']}

df = pd.DataFrame(data)

df

What I want is to take only the first 3 columns and exclude the next question. In my actual file the number of columns between questions varies, so a hard-coded iloc slice won't work.

CodePudding user response:

To get every column up to and including the last column with "Unnamed" in its name, try:

>>> df.iloc[:, :max(i for i, c in enumerate(df.columns) if "Unnamed" in c) + 1]

  What is your favorite fruit  Unnamed:12 Unnamed:13
0                      Banana         nan        nan
1                         nan  Strawberry        nan
2                      Banana         nan        nan
3                         nan         nan  Blueberry
4                         nan         nan  Blueberry
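
Note that slicing up to the last "Unnamed" column gives the first question's block only when the file has exactly two questions. If a later question also has "Unnamed" columns after it, they would be included too. A minimal sketch of an alternative, assuming the first column is always a question column: stop at the first later column whose name lacks "Unnamed". The helper name first_question_block is made up for illustration.

```python
import pandas as pd

data = {
    "What is your favorite fruit": ["Banana", "nan", "Banana", "nan", "nan"],
    "Unnamed:12": ["nan", "Strawberry", "nan", "nan", "nan"],
    "Unnamed:13": ["nan", "nan", "nan", "Blueberry", "Blueberry"],
    "What is your favorite vegetable?": ["Carrot", "nan", "nan", "nan", "Carrot"],
}
df = pd.DataFrame(data)


def first_question_block(frame):
    """Return the first column plus the run of "Unnamed" columns right after it.

    Hypothetical helper; assumes column 0 is the first question.
    """
    # Default: keep everything if no second question column is found.
    stop = len(frame.columns)
    # Skip column 0 and look for the next name that lacks "Unnamed".
    for i, name in enumerate(frame.columns[1:], start=1):
        if "Unnamed" not in str(name):
            stop = i
            break
    return frame.iloc[:, :stop]


first_block = first_question_block(df)
print(list(first_block.columns))
# ['What is your favorite fruit', 'Unnamed:12', 'Unnamed:13']
```

The slice stays positional, so it works however many "Unnamed" columns sit between the two questions.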