I have a data frame with about 1000 rows that contains multiple city names along with other columns.
I want to make another data frame that contains only the rows for the 10 most frequent cities, keeping their other columns. I tried this:
df_temp = df_city[df_city['city_name']].sort_values('city_name')['city_name'].head(10)
but it didn't work. Can anyone tell me how to do this?
CodePudding user response:
This should do the job:
most_freq_cities = df_city['city_name'].value_counts().nlargest(10).index  # count occurrences; sorting the names alone would only order them alphabetically
df_temp = df_city.loc[df_city['city_name'].isin(most_freq_cities)]
CodePudding user response:
You can get the most repeated unique cities with:
most_repeated_unique_cities = df_city['city_name'].value_counts()[:10].index.tolist()
You can then get a data frame containing only the rows for those cities, with all other columns kept, with:
df_city[df_city["city_name"].isin(most_repeated_unique_cities)]
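As a minimal end-to-end sketch of the approach above, here is the same idea on a small made-up data frame. The sample data, the `visits` column, and the helper name `top_n_cities` are assumptions for illustration only; the column name `city_name` is taken from the question.

```python
import pandas as pd

# Hypothetical sample data; only the 'city_name' column comes from the question.
df_city = pd.DataFrame({
    "city_name": ["Paris", "Rome", "Paris", "Oslo", "Rome", "Paris", "Lima"],
    "visits": [3, 5, 2, 7, 1, 4, 6],
})

def top_n_cities(df, column="city_name", n=10):
    """Return the rows of df whose value in `column` is among the n most frequent."""
    # value_counts() sorts by frequency (descending), so head(n) keeps the top n names.
    top = df[column].value_counts().head(n).index
    # isin() builds a boolean mask; all other columns are kept unchanged.
    return df[df[column].isin(top)]

# With n=2, only the rows for the two most frequent cities remain.
df_temp = top_n_cities(df_city, n=2)
print(df_temp)
```

If several cities are tied at the cutoff, `head(n)` simply keeps the first `n` entries of the counts, so which tied city is included may need separate handling depending on your data.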