I have data as follows:
import pandas as pd
url_cities="https://population.un.org/wup/Download/Files/WUP2018-F12-Cities_Over_300K.xls"
df_cities = pd.read_excel(url_cities)
i = df_cities.iloc[:, 1].notna().idxmax()
df_cities.columns = df_cities.iloc[i].tolist()
df_cities = df_cities.iloc[i 1:]
df_cities = df_cities.rename(columns={2020.0: 'City_pop'})
print(df_cities.iloc[0:20,])
I want to remove all columns for which the column names (NOT COLUMN VALUES) are floats.
I have looked at a couple of links (A, B, C), but I could not find the answer. Any suggestions?
CodePudding user response:
This will do what your question asks:
df = df[[col for col in df.columns if not isinstance(col, float)]]
Example:
import pandas as pd
df = pd.DataFrame(columns=['a',1.1,'b',2.2,3,True,4.4,'c'],data=[[1,2,3,4,5,6,7,8],[11,12,13,14,15,16,17,18]])
print(df)
df = df[[col for col in df.columns if not isinstance(col, float)]]
print(df)
Initial dataframe:
a 1.1 b 2.2 3 True 4.4 c
0 1 2 3 4 5 6 7 8
1 11 12 13 14 15 16 17 18
Result:
a b 3 True c
0 1 3 5 6 8
1 11 13 15 16 18
Note that 3
is an int, not a float, so its column has not been removed.
CodePudding user response:
my_list=list(df_cities.columns)
for i in my_list:
if type(i)!=str:
df_cities=df_cities.drop(columns=[i],axis=1)
please, try this code
CodePudding user response:
I think your basic problem is the call to read the excel file.
If you skip early rows and define the index correctly6, you avoid the issue of having to remove float column headers altogether. so change your call to open the excel file to the following:
df_cities = pd.read_excel(url_cities, skiprows=16, index_col=0)
Which yields a df like the following:
Country Code Country or area City Code Urban Agglomeration Note Latitude Longitude 1950 1955 1960 ... 1990 1995 2000 2005 2010 2015 2020 2025 2030 2035
Index
1 4 Afghanistan 20001 Herat NaN 34.348170 62.199670 82.468 85.751 89.166 ... 183.465 207.190 233.991 275.678 358.691 466.703 605.575 752.910 897.041 1057.573
2 4 Afghanistan 20002 Kabul NaN 34.528887 69.172460 170.784 220.749 285.352 ... 1549.320 1928.694 2401.109 2905.178 3289.005 3723.543 4221.532 4877.024 5737.138 6760.500
3 4 Afghanistan 20003 Kandahar NaN 31.613320 65.710130 82.199 89.785 98.074 ... 233.243 263.395 297.456 336.746 383.498 436.741 498.002 577.128 679.278 800.461
4 4 Afghanistan 20004 Mazar-e Sharif NaN 36.709040 67.110870 30.000 37.139 45.979 ... 135.153 152.629 172.372 206.403 283.532 389.483 532.689 681.531 816.040 962.262