I have a list of tuples. The first element of each tuple is a city and state name separated by a comma, and the second element is the name of the county:
print(county_lookup)
[('Normal,Alabama', 'Madison County'), ('Birmingham,Alabama', 'Jefferson County'), ('Montgomery,Alabama', 'Montgomery County'), ('Huntsville,Alabama', 'Madison County'), ('Tuscaloosa,Alabama', 'Tuscaloosa County'), ('Alexander City,Alabama', 'Tallapoosa County'), ('Athens,Alabama', 'Limestone County')]
I was hoping to use the list to create a new 'county' column in an existing dataframe, using the values already present in the list of tuples.
df_schools['county'] = [x[1] for x in county_lookup]
However, I soon realized that df_schools already has a city_state column containing values that match the first element of each tuple in county_lookup.
df_schools.city_state
0 Normal,Alabama
1 Birmingham,Alabama
2 Montgomery,Alabama
3 Huntsville,Alabama
4 Montgomery,Alabama
...
7698 Overland Park,Kansas
7699 Highland Heights,Ohio
7700 San Jose,California
7701 Lancaster,California
7702 San Antonio,Texas
Is there a way to compare the first element of each tuple in the list against the city_state column of df_schools, and create a new 'county' column holding the corresponding second element of each tuple from county_lookup?
CodePudding user response:
You can use the DataFrame.merge method (the method form of pd.merge):
df = pd.DataFrame(county_lookup, columns=['city_state', 'county'])
df_schools = df_schools.merge(df, how='left', on='city_state')
Now df_schools has a new 'county' column (which will contain NaN wherever the lookup found no match).
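For illustration, here is a minimal sketch of the merge approach on made-up data. The `name` column and the three sample rows are hypothetical; only the `city_state` column mirrors the question. It also drops repeated keys from the lookup first, because a left merge would otherwise repeat any df_schools row whose key appears more than once in the lookup.

```python
import pandas as pd

# Hypothetical sample data; only the city_state column mirrors the question.
df_schools = pd.DataFrame({
    "name": ["School A", "School B", "School C"],
    "city_state": ["Normal,Alabama", "Montgomery,Alabama", "San Jose,California"],
})
county_lookup = [
    ("Normal,Alabama", "Madison County"),
    ("Montgomery,Alabama", "Montgomery County"),
]

lookup_df = pd.DataFrame(county_lookup, columns=["city_state", "county"])
# Drop repeated keys so the left merge cannot duplicate rows of df_schools.
lookup_df = lookup_df.drop_duplicates(subset="city_state")

# how="left" keeps every school; unmatched keys get NaN in the county column.
merged = df_schools.merge(lookup_df, how="left", on="city_state")
print(merged)
```

Passing `indicator=True` to `merge` adds a `_merge` column, which can help when checking which rows found no match.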
CodePudding user response:
You can turn the list into a dict and map it onto the column (a Series):
df_schools['county'] = df_schools['city_state'].map(dict(county_lookup))
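A small sketch of the dict-and-map approach follows, again on made-up rows (the sample values and the name `county_by_city` are assumptions for illustration). Keys missing from the dict come back as NaN, and if the list contains the same city/state twice, `dict()` keeps the last county listed for it.

```python
import pandas as pd

# Hypothetical sample data; only the city_state column mirrors the question.
df_schools = pd.DataFrame({
    "city_state": ["Normal,Alabama", "Athens,Alabama", "San Antonio,Texas"],
})
county_lookup = [
    ("Normal,Alabama", "Madison County"),
    ("Athens,Alabama", "Limestone County"),
]

# Build a city_state -> county mapping from the list of 2-tuples.
county_by_city = dict(county_lookup)

# Series.map looks up each value; keys not in the dict become NaN.
df_schools["county"] = df_schools["city_state"].map(county_by_city)

# Count rows that found no county, as a quick sanity check.
missing = df_schools["county"].isna().sum()
print(df_schools)
print("rows without a county:", missing)
```

Unlike the merge, this never changes the number of rows, so it is a reasonable default when only one column needs to be looked up.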