Removing columns for which the column names are a float

Time:05-23

I have data as follows:

import pandas as pd

url_cities="https://population.un.org/wup/Download/Files/WUP2018-F12-Cities_Over_300K.xls"
df_cities = pd.read_excel(url_cities)

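# First row whose second cell is non-empty: this is the real header row
i = df_cities.iloc[:, 1].notna().idxmax()
df_cities.columns = df_cities.iloc[i].tolist()
# Keep only the rows below the header row
df_cities = df_cities.iloc[i + 1:]
df_cities = df_cities.rename(columns={2020.0: 'City_pop'})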
print(df_cities.iloc[0:20,])

I want to remove all columns for which the column names (NOT COLUMN VALUES) are floats.

I have looked at a couple of links (A, B, C), but I could not find the answer. Any suggestions?

CodePudding user response:

This will do what your question asks:

df = df[[col for col in df.columns if not isinstance(col, float)]]

Example:

import pandas as pd
df = pd.DataFrame(columns=['a',1.1,'b',2.2,3,True,4.4,'c'],data=[[1,2,3,4,5,6,7,8],[11,12,13,14,15,16,17,18]])
print(df)
df = df[[col for col in df.columns if not isinstance(col, float)]]
print(df)

Initial dataframe:

    a  1.1   b  2.2   3  True  4.4   c
0   1    2   3    4   5     6    7   8
1  11   12  13   14  15    16   17  18

Result:

    a   b   3  True   c
0   1   3   5     6   8
1  11  13  15    16  18

Note that 3 is an int, not a float, so its column has not been removed.
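Applied to a frame shaped like the one in the question, the same filter might look like the sketch below. The two-row `df_demo` is a made-up stand-in (not the real UN file), and the boolean-mask form with `.loc` is just one way to write it:

```python
import pandas as pd

# Made-up stand-in for the question's frame: headers mix strings and floats.
df_demo = pd.DataFrame(
    [[1, "Afghanistan", 466.703, 605.575],
     [2, "Afghanistan", 3723.543, 4221.532]],
    columns=["Index", "Country or area", 2015.0, 2020.0],
)

# Rename the one year column to keep, as in the question.
df_demo = df_demo.rename(columns={2020.0: "City_pop"})

# Boolean mask: True for every column whose label is NOT a float.
keep = [not isinstance(col, float) for col in df_demo.columns]
df_demo = df_demo.loc[:, keep]

print(list(df_demo.columns))
```

Because the rename happens before the filter, City_pop has a string label by then and survives, while 2015.0 is dropped.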

CodePudding user response:

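my_list = list(df_cities.columns)
for i in my_list:
    if type(i) != str:
        df_cities = df_cities.drop(columns=[i])

Please try this code. Note that it drops every column whose name is not a string, so int-named columns are removed along with the float-named ones.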
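For comparison, here is a loop-free sketch of the same "keep only string-named columns" idea, run on the small example frame from the first answer. The names `df_demo`, `non_str` and `only_str` are made up for illustration:

```python
import pandas as pd

# Same toy frame as in the first answer: labels are str, float, int and bool.
df_demo = pd.DataFrame(
    columns=["a", 1.1, "b", 2.2, 3, True, 4.4, "c"],
    data=[[1, 2, 3, 4, 5, 6, 7, 8], [11, 12, 13, 14, 15, 16, 17, 18]],
)

# Collect every label that is not a string, then drop them in one call.
non_str = [col for col in df_demo.columns if not isinstance(col, str)]
only_str = df_demo.drop(columns=non_str)

print(list(only_str.columns))
```

Unlike the `isinstance(col, float)` filter, this also removes the columns named 3 and True, so it only fits if nothing but string headers should remain.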

CodePudding user response:

I think your basic problem is the call to read the excel file.

If you skip the early rows and define the index correctly, you avoid having to remove float column headers altogether. Change your call to read the Excel file to the following:

df_cities = pd.read_excel(url_cities, skiprows=16, index_col=0)

This yields a DataFrame like the following:

       Country Code  Country or area  City Code  Urban Agglomeration  Note   Latitude  Longitude     1950     1955     1960  ...      1990      1995      2000      2005      2010      2015      2020      2025      2030      2035
Index
1                 4      Afghanistan      20001                Herat   NaN  34.348170  62.199670   82.468   85.751   89.166  ...   183.465   207.190   233.991   275.678   358.691   466.703   605.575   752.910   897.041  1057.573
2                 4      Afghanistan      20002                Kabul   NaN  34.528887  69.172460  170.784  220.749  285.352  ...  1549.320  1928.694  2401.109  2905.178  3289.005  3723.543  4221.532  4877.024  5737.138  6760.500
3                 4      Afghanistan      20003             Kandahar   NaN  31.613320  65.710130   82.199   89.785   98.074  ...   233.243   263.395   297.456   336.746   383.498   436.741   498.002   577.128   679.278   800.461
4                 4      Afghanistan      20004       Mazar-e Sharif   NaN  36.709040  67.110870   30.000   37.139   45.979  ...   135.153   152.629   172.372   206.403   283.532   389.483   532.689   681.531   816.040   962.262
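The effect of skiprows and index_col can be tried offline with a small in-memory CSV. This is only a sketch: the two preamble lines, the trimmed columns and the value skiprows=2 are made up for the demo, and `read_csv` stands in for `read_excel`. Unlike Excel, CSV headers always come back as strings:

```python
import io
import pandas as pd

# Made-up file: two title lines above the real header row.
raw = (
    "Some report title,,,\n"
    "Some subtitle,,,\n"
    "Index,Country or area,2015,2020\n"
    "1,Afghanistan,466.703,605.575\n"
    "2,Afghanistan,3723.543,4221.532\n"
)

# Skip the two title lines so the third line becomes the header,
# and use the first column ("Index") as the row index.
df_demo = pd.read_csv(io.StringIO(raw), skiprows=2, index_col=0)

print(df_demo)
```

Once the header row is read as the header, no row of column names ends up inside the data, so there is nothing to promote or filter afterwards.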