I compiled a list of the top artists for every year across 14 years, and I want to find the top 7 artists for those years combined. My idea was to collect them all in a DataFrame and then pick the artists that appear most often, but it didn't work out.
# Collecting the top 7 artists across the 14 years
artists = []
year = 2020
while year >= 2006:
    TAChart = billboard.ChartData('Top-Artists', year=year)
    artists.append(str(TAChart))
    year -= 1
len(artists)
Artists = pd.DataFrame(artists)
n = 7
Artists.value_counts().index.tolist()[:n]
Answer:
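You're very close. Append each year's artist names as a list rather than str(TAChart), for example artists.append([e.artist for e in TAChart.entries]) (this assumes billboard.py's ChartEntry.artist attribute holds the name on artist charts). Then flatten that list of lists into a single list and call value_counts: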
artists_flat = [a for lst in artists for a in lst]
pd.Series(artists_flat).value_counts().head(n)
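Your current code counts each year's whole chart (converted to one long string) rather than individual artists, so every value appears exactly once. Also, note that I used head(n) rather than indexing so that the counts are returned together with the names. Like slicing, head(n) cuts off at exactly n rows; if there may be ties for the nth place, pd.Series(artists_flat).value_counts().nlargest(n, keep='all') keeps all of the tied artists.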
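For reference, here is a minimal sketch of the same flatten-then-count idea without pandas, using collections.Counter from the standard library. The per-year lists are made-up stand-in data (not real Billboard results), and top_artists is a hypothetical helper name; with billboard.py the names would come from the chart entries instead.

```python
from collections import Counter
from itertools import chain

# Hypothetical stand-in data: one list of artist names per year.
artists = [
    ["Drake", "Taylor Swift", "Adele"],
    ["Taylor Swift", "Drake", "Rihanna"],
    ["Adele", "Taylor Swift", "Bruno Mars"],
]


def top_artists(per_year, n):
    """Return (artist, count) pairs for the n most frequent artists,
    plus any artists tied with the nth one."""
    # Flatten the list of lists, then count each individual name.
    counts = Counter(chain.from_iterable(per_year))
    ranked = counts.most_common()  # sorted by count, highest first
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]  # count of the artist in nth place
    return [(name, c) for name, c in ranked if c >= cutoff]


top = top_artists(artists, 2)
print(top)
```

Because the helper keeps everyone whose count matches the nth place, it can return more than n rows when there is a tie at the cutoff.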