I have already sorted the two dataframes:
city_future:
City Future_50
7 Atlanta 1
9 Bal Harbour 1
1 Chicago 8
6 Coalinga 1
independents_future:
City independents_100
14 Amarillo 1
10 Atlanta 2
18 Atlantic City 1
20 Austin 1
This is what I have so far:
city_future = future.loc[:,"City"].value_counts().rename_axis('City').reset_index(name='Future_50').sort_values('City')
city_independents = independents.loc[:,"City"].value_counts().rename_axis('City').reset_index(name='independents_100').sort_values('City')
hot_cities = pd.merge(city_independents,city_future)
hot_cities
I need to show all the cities from both dataframes, which have different lengths, and mark the cities that are missing from the other dataframe with 0. I have no idea why my current output only shows 20 rows, in the form of:
City independents_100 Future_50
0 Atlanta 2 1
1 Bal Harbour 1 1
2 Chicago 15 8
Thank you for helping!
CodePudding user response:
I believe you can do this with the merge method, without creating the two helper dataframes.
Setting how='outer' keeps the cities from both dataframes, and setting indicator=True creates a new column (_merge) in the resulting dataframe that tells you whether each row appears in the left dataframe only (city_future), the right dataframe only (independents_future), or both.
merged_df = city_future.merge(right=independents_future,
                              left_on='City',
                              right_on='City',
                              how='outer',
                              indicator=True
                              )
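The question's pd.merge call shows only the shared cities because merge defaults to how='inner'. An outer merge leaves NaN where a city is missing from one side, so the counts still need to be filled with 0. Below is a minimal sketch of that step; the sample rows are made up to mirror the frames in the question, and the column names are taken from the question's code.

```python
import pandas as pd

# Hypothetical sample data shaped like the two frames in the question.
city_future = pd.DataFrame({
    "City": ["Atlanta", "Bal Harbour", "Chicago", "Coalinga"],
    "Future_50": [1, 1, 8, 1],
})
city_independents = pd.DataFrame({
    "City": ["Amarillo", "Atlanta", "Atlantic City", "Austin"],
    "independents_100": [1, 2, 1, 1],
})

hot_cities = (
    city_independents
    # how="outer" keeps every city from both frames (the default is "inner").
    .merge(city_future, on="City", how="outer")
    # Cities missing from one frame get NaN in that frame's column; use 0 instead.
    .fillna({"independents_100": 0, "Future_50": 0})
    # NaN forced the count columns to float, so cast them back to integers.
    .astype({"independents_100": int, "Future_50": int})
    .sort_values("City")
    .reset_index(drop=True)
)

print(hot_cities)
```

If you also pass indicator=True, drop the _merge column (or leave it out of the fillna/astype step) once you no longer need it.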
See the pandas.DataFrame.merge reference page for details.
hope this helps :)