Using keys from a dict to look for the same "keys" in a pandas dataframe, then assign valu

I have a pandas dataframe with zip codes. I also have a dictionary whose keys are zip codes and whose values are region names.

The dictionary

my_regions = {8361: 'Central region', 8381: 'Central region', 8462: 'North region', 8520: 'South region', 8530: 'Central region', 8541: 'South region'}

The dataframe has a column named zipcode

df["zipcode"] = [8462, 8361, 8381, 8660, 8530, 8530]

I want to add a new column to the dataframe df that holds the dict value (the region name) wherever the zip code in the dataframe equals a key in the dict.

I have tried this

my_regions_list = []

for keyname in my_regions:
    for zipcode in df.zipcode:
        if zipcode == my_regions.keys():
            my_regions_list.append(my_regions.values())
            # df["region"] = df.append(my_regions.values())
            df =df.insert(column="region", value = my_r)

The list stays empty, and no new column is added to the existing dataframe.

I also tried converting the dictionary to a dataframe, but that got me nowhere

df1 = pd.DataFrame(list(my_regions.items()),columns = ['ZIPCODE','REGIONNAME'])

CodePudding user response：

You can use .map():

import pandas as pd

df = pd.DataFrame({"zipcode": [8462, 8361, 8381, 8660, 8530, 8530]})

my_regions = {
    8361: "Central region",
    8381: "Central region",
    8462: "North region",
    8520: "South region",
    8530: "Central region",
    8541: "South region",
}


df["name"] = df["zipcode"].map(my_regions)
print(df)

Prints:

   zipcode            name
0     8462    North region
1     8361  Central region
2     8381  Central region
3     8660             NaN
4     8530  Central region
5     8530  Central region
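
As for why the loop in the question leaves the list empty: `zipcode == my_regions.keys()` compares a single integer with the whole keys view, which is never equal, so the `if` body never runs. Separately, `DataFrame.insert` modifies the frame in place and returns None, so `df = df.insert(...)` would discard the dataframe. A minimal sketch of what the loop seems to be aiming for, one lookup per row (the column name "region" is taken from the question's attempt):

```python
import pandas as pd

df = pd.DataFrame({"zipcode": [8462, 8361, 8381, 8660, 8530, 8530]})

my_regions = {
    8361: "Central region",
    8381: "Central region",
    8462: "North region",
    8520: "South region",
    8530: "Central region",
    8541: "South region",
}

# An int compared with a dict keys view is always False,
# which is why the original `if` never fired.
print(8462 == my_regions.keys())
# A membership test is what checks whether a zip code is a key.
print(8462 in my_regions)

# One lookup per row; dict.get returns None when the zip code has no entry.
my_regions_list = [my_regions.get(zipcode) for zipcode in df["zipcode"]]

# Plain column assignment; no loop over the dict is needed.
df["region"] = my_regions_list
print(df)
```

This should fill the same rows as .map(); .map() is the more idiomatic form of the same lookup.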

CodePudding user response：

I'm leaving this answer up even though .map() is supposedly faster than .replace(), because the two give different results and one or the other may suit a given use case better.

Note that the main difference is that .replace() will leave the original value intact if no mapping was found, whereas .map() produces NaN for mappings that don't exist.

df["regions"] = df["zipcode"].replace(my_regions)

Demo:

In [5]: df
Out[5]:
   zipcode
0     8462
1     8361
2     8381
3     8660
4     8530
5     8530

In [6]: df["regions"] = df["zipcode"].replace(my_regions)

In [7]: df
Out[7]:
   zipcode         regions
0     8462    North region
1     8361  Central region
2     8381  Central region
3     8660            8660
4     8530  Central region
5     8530  Central region
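
To see the two behaviours next to each other, here is a small sketch that runs both on the same column. It also shows a third option, .map() followed by .fillna(), for when neither NaN nor the raw zip code is wanted in unmatched rows. The "Unknown" label and the column names of the comparison frame are placeholders chosen for illustration, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({"zipcode": [8462, 8361, 8381, 8660, 8530, 8530]})

my_regions = {
    8361: "Central region",
    8381: "Central region",
    8462: "North region",
    8520: "South region",
    8530: "Central region",
    8541: "South region",
}

# .map(): rows whose zip code is not a key become NaN.
mapped = df["zipcode"].map(my_regions)

# .replace(): rows whose zip code is not a key keep the original value.
replaced = df["zipcode"].replace(my_regions)

# .map() plus .fillna(): unmatched rows get a label of your choosing.
# "Unknown" is a hypothetical placeholder.
filled = mapped.fillna("Unknown")

comparison = pd.DataFrame(
    {
        "zipcode": df["zipcode"],
        "map": mapped,
        "replace": replaced,
        "map_fillna": filled,
    }
)
print(comparison)
```

Only row 3 (zip code 8660, which is not in the dict) differs between the three result columns.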