how to convert [('US', 144), ('CA', 37)] to ['US', 'CA']

Time:03-01

I have a DataFrame with customer and country data. I want to count the countries so I can find the top 5, and then use them as a filter elsewhere.

This gives me the counts:

countries = collections.Counter(responses_2021['country'].dropna())

and this gives me the top 5:

countries_top5 = countries.most_common(5)

which yields this:

[('US', 144), ('CA', 37), ('GB', 15), ('FR', 15), ('AU', 12)]
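
For readers unfamiliar with the two objects involved, here is a minimal sketch of what Counter and most_common return. The sample list is hypothetical and only stands in for responses_2021['country'].dropna(); the counts are not the ones from the question.

```python
from collections import Counter

# Hypothetical sample standing in for responses_2021['country'].dropna()
sample = ["US", "CA", "US", "GB", "US", "CA"]

countries = Counter(sample)      # dict-like mapping: value -> count
top2 = countries.most_common(2)  # list of (value, count) tuples, highest count first

print(countries["US"])  # 3
print(top2)             # [('US', 3), ('CA', 2)]
```

So the Counter itself is a mapping, and it is most_common that produces the list of (value, count) tuples.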

Now I need to transform it into a simpler structure so I can apply my filter (here I'm just typing the list manually, because that's the only way I could move forward):

options = ['US', 'CA', 'GB', 'FR', 'AU'] 
rslt_df = df[df['country'].isin(options)] 

So, to get from this

[('US', 144), ('CA', 37), ('GB', 15), ('FR', 15), ('AU', 12)]

to this

['US', 'CA', 'GB', 'FR', 'AU'] 

I started by trying to remove the counts

countries_top5_names = np.delete(countries_top5, 1, 1)

but that yields

[['US'], ['CA'], ['GB'], ['FR'], ['AU']] 

so now I'm trying to flatten that, but I don't know how.

Is there a better way?
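
On the flattening step itself, one possible sketch uses itertools.chain from the standard library. It assumes the nested structure is a plain list of one-element lists, as printed above; the variable names are illustrative only.

```python
from itertools import chain

# The nested shape shown in the question, written as plain Python lists
nested = [['US'], ['CA'], ['GB'], ['FR'], ['AU']]

# chain.from_iterable concatenates the inner lists into one flat sequence
flat = list(chain.from_iterable(nested))

print(flat)  # ['US', 'CA', 'GB', 'FR', 'AU']
```

Note that np.delete actually returns a NumPy array rather than a list; for an array, calling .ravel().tolist() on it should give the same flat list. Either way, this is a detour compared with extracting the names directly from the tuples.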

SOLUTION (thanks to @dan04 below)

countries_top5_names = [x[0] for x in countries_top5] 
rslt_df = df[df['country'].isin(countries_top5_names)] 

CodePudding user response:

Just take element [0] of each tuple.

>>> data = [('US', 144), ('CA', 37), ('GB', 15), ('FR', 15), ('AU', 12)]
>>> countries = [x[0] for x in data]
>>> countries
['US', 'CA', 'GB', 'FR', 'AU']
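
Two equivalent spellings of the same idea, shown here as a sketch: tuple unpacking inside the comprehension, and operator.itemgetter from the standard library. The variable names are illustrative only.

```python
from operator import itemgetter

data = [('US', 144), ('CA', 37), ('GB', 15), ('FR', 15), ('AU', 12)]

# Unpack each (name, count) pair and keep only the name
names_unpacked = [name for name, _count in data]

# itemgetter(0) builds a function that returns element 0 of its argument
names_itemgetter = list(map(itemgetter(0), data))

print(names_unpacked)    # ['US', 'CA', 'GB', 'FR', 'AU']
print(names_itemgetter)  # ['US', 'CA', 'GB', 'FR', 'AU']
```

The unpacking form tends to read better when each tuple has a known, fixed shape.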

CodePudding user response:

You can also use a more general method: zip(*data) transposes the list of tuples, so you get the names and the counts as two separate tuples.

data = [('US', 144), ('CA', 37), ('GB', 15), ('FR', 15), ('AU', 12)]
groups = list(zip(*data))
print(groups[0])
print(groups[1])

Output:
('US', 'CA', 'GB', 'FR', 'AU')
(144, 37, 15, 15, 12)
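
One caveat with this approach is that unpacking zip(*data) into two names fails when data is empty. A small sketch that guards against that case follows; split_pairs is a hypothetical helper name, not something from the question or from any library.

```python
def split_pairs(pairs):
    """Split a list of (name, count) tuples into two lists.

    Hypothetical helper for illustration; returns two empty lists
    when there is nothing to split.
    """
    if not pairs:
        return [], []
    names, counts = zip(*pairs)  # transpose rows into columns
    return list(names), list(counts)


data = [('US', 144), ('CA', 37), ('GB', 15), ('FR', 15), ('AU', 12)]
names, counts = split_pairs(data)

print(names)   # ['US', 'CA', 'GB', 'FR', 'AU']
print(counts)  # [144, 37, 15, 15, 12]
```

Converting to lists is optional; isin in the question's filter should accept a tuple just as well.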
