Home > Software engineering >  Pyhton pandas data analyzing for if elif structure doesnt work properly
Pyhton pandas data analyzing for if elif structure doesnt work properly

Time:12-07

I chose the subject of data analysis for my graduation project, for this I downloaded the database showing the country vaccination data from kaggle. There were only countries in the main database in csv form, but I wanted to divide the countries into continents and show it in a more stylish way in the data visualization, but the algorithm  does not work properly, some data remains blank, some data is Asian and Asian. It is added in the form of Europe, I used different IDEs, but it did not help in the photo that i upload ı have a zwe iso code last of africa list but it didnt add into the continent some of them are correct some of them not...[enter image description here](https://i.stack.imgur.com/wGDUV.png)

HERE İS THE SOURCE CODE

import pandas as pd
asya1 =['AFG','ARM','AZE','BHR','BGD','BTN','BRN','KHM','CHN','CXR','CCK','IOT','GEO','HKG','IND','IDN','IRN','IRQ','ISR','JPN','JOR','KAZ','KWT','KGZ','LAO','LBN','MAC','MYS','MDV','MNG','MMR','NPL','PRK','OMN','PAK','PSE','PHL','QAT','SAU','SGP','KOR','LKA','SYR','TWN','TJK','THA','TUR','TKM','ARE','UZB','VNM','YEM']
avrupa = ['ALB','AND','AUT','BLR','BEL','BIH','BGR','HRV','CYP','CZE','DNK','EST','FRO','FIN','FRA','DEU','GIB','GRC','HUN','ISL','IRL','IMN','ITA','XKX','LVA','LIE','LTU','LUX','MKD','MLT','MDA','MCO','MNE','NLD','NOR','POL','PRT','ROU','RUS','SMR','SRB','SVK','SVN','ESP','SWE','CHE','UKR','GBR','VAT','RSB','OWID_ENG']
guneyamerika = ['ARG','BOL','BRA','CHL','COL','ECU','FLK','GUF','GUF','GUY','PRY','PER','SUR','URY','VEN']
kuzeyamerika =  ['AIA','ATG','ABW','BHS','BRB','BLZ','BMU','BES','VGB','CAN','CYM','CRI','CUB','CUW','DMA','DOM','SLV','GRL','GRD','GLP','GTM','HTI','HND','JAM','MTQ','MEX','SPM','MSR','ANT','KNA','NIC','PAN','PRI','BES','BES','SXM','KNA','LCA','SPM','VCT','TTO','TCA','USA','VIR']
afrika = ['DZA','AGO','SHN','BEN','BWA','BFA','BDI','CMR','CPV','CAF','TCD','COM','COG','COD','DJI','EGY','GNQ','ERI','SWZ','ETH','GAB','GMB','GHA','GIN','GNB','CIV','KEN','LSO','LBR','LBY','MDG','MWI','MLI','MRT','MUS','MYT','MAR','MOZ','NAM','NER','NGA','STP','REU','RWA','STP','SEN','SYC','SLE','SOM','ZAF','SSD','SHN','SDN','SWZ','TZA','TGO','TUN','UGA','COD','ZMB','TZA','ZWE']
avusturalya = ['ASM','AUS','NZL','COK','TLS','FSM','FJI','PYF','GUM','KIR','MNP','MHL','UMI','NRU','NCL','NZL','NIU','NFK','PLW','PNG','MNP','WSM','SLB','TKL','TON','TUV','VUT','UMI','WLF']
sasya1 =pd.Series(asya1)
savrupa=pd.Series(avrupa)
sguneyamerika=pd.Series(guneyamerika)
skuzeyamerika=pd.Series(kuzeyamerika)
safrika=pd.Series(afrika)
savusturalya=pd.Series(avusturalya)
dataset = pd.read_csv("country_vaccinations.csv")
y = 0
pd.options.mode.chained_assignment = None
dataset.insert(loc = 3,column = 'continent',value = '')
for i in dataset["iso_code"]:
    if sasya1.str.contains(i).any():
        dataset["continent"][y] = 'Asia'
        y =1
    elif savrupa.str.contains(i).any():
        dataset["continent"][y] = 'Europe'
        y =1
    elif sguneyamerika.str.contains(i).any():
        dataset["continent"][y] = 'South America'
        y =1
    elif skuzeyamerika.str.contains(i).any():
        dataset["continent"][y] = 'North America'
        y =1
    elif safrika.str.contains(i).any():
        dataset["continent"][y] = 'Africa'
        y =1
    elif savusturalya.str.contains(i).any():
        dataset["continent"][y] = 'Ocenia'
        y =1
with pd.ExcelWriter("deneme.xlsx") as writer:
    dataset.to_excel(writer,sheet_name = "Sayfa1")

enter image description here enter image description here as you see some data entries are correct but some of them wrong or blank i dont know what is causing this error maybe some of you have faced this error before.Thank you

CodePudding user response:

Please see pandas.DataFrame.apply function (http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html) to call function on every item in DF.

For example:

def GetContinent(iso_code):
    if sasya1.str.contains(iso_code).any():
        return 'Asia'
    #..... 

#call function on each row and set results into 'continent' column
dataset['continent'] = dataset.apply(lambda x: GetContinent(x['iso_code'], axis=1)
  • Related