I have a Population column with numbers and a Region column with locations. I'm only using pandas. How would I go about finding the total population of a specific location (Wellington) within the Region column?
Place = [data['Region'] == 'Wellington']
Place[data['Population']]
An error came up:
TypeError                                 Traceback (most recent call last)
Input In [70], in <cell line: 4>()
      1 #Q1.e
      3 Place = [data['Region']=='Wellington']
----> 4 Place[data['Population']]
TypeError: list indices must be integers or slices, not Series
CodePudding user response:
In your example, "Population" is capitalized in the table, but not capitalized when you attempt to access the column. Also next time please post the error message! Maybe try:
Place[data["Population"]]
CodePudding user response:
Try this:
data_groups = data.groupby("Region")['Population'].sum()
Output:
data_groups
Region
Northland 4750
Wellington 7580
WestCoast 1550
If you want the total for one specific region, you can do:
data_groups.loc['WestCoast'] # 1550
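For a self-contained illustration of the groupby approach, here is a minimal sketch. The DataFrame contents are made up for the example (the question's real table is not shown), so the totals differ from the output above.

```python
import pandas as pd

# Hypothetical sample data; the real table from the question is not shown.
data = pd.DataFrame({
    "Region": ["Wellington", "Northland", "Wellington", "WestCoast"],
    "Population": [100, 50, 200, 25],
})

# Sum Population within each Region; the result is a Series indexed by Region.
data_groups = data.groupby("Region")["Population"].sum()

# Look up a single region's total by its index label.
wellington_total = data_groups.loc["Wellington"]
print(wellington_total)
```

This computes every region's total at once, which is handy if you need more than one region; for a single region a direct filter is usually enough.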
CodePudding user response:
Use DataFrame.loc with sum:
Place = data.loc[data['Region'] == 'Wellington', 'Population'].sum()
print(Place)
7190
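It may help to see why the original attempt fails. Wrapping the comparison in square brackets creates a plain Python list that holds one boolean Series, and a list cannot be indexed with a Series, hence the TypeError. The sketch below reproduces that and then applies the mask through .loc instead. The sample data and the variable names mask, wrapped, and total are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real table from the question is not shown.
data = pd.DataFrame({
    "Region": ["Wellington", "Northland", "Wellington", "WestCoast"],
    "Population": [100, 50, 200, 25],
})

# The comparison alone yields a boolean Series (a mask), one value per row.
mask = data["Region"] == "Wellington"

# Wrapping the mask in [] builds a list containing that Series.
# Indexing a list with a Series raises the TypeError from the question.
wrapped = [mask]
try:
    wrapped[data["Population"]]
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Passing the mask and the column label to .loc selects the matching rows
# of Population, which can then be summed.
total = data.loc[mask, "Population"].sum()
print(error_name, total)
```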
Another idea is to convert Region to the index, select by Series.loc, and then sum:
Place = data.set_index('Region')['Population'].loc['Wellington'].sum()
print(Place)
7190
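One practical difference between the two approaches is how they treat a region that is not in the data. As a rough sketch with invented sample data (and "Auckland" as a hypothetical region that is absent from it): label lookup on the index raises a KeyError, while the boolean-mask filter simply matches no rows and sums to 0.

```python
import pandas as pd

# Hypothetical sample data; the real table from the question is not shown.
data = pd.DataFrame({
    "Region": ["Wellington", "Northland", "Wellington", "WestCoast"],
    "Population": [100, 50, 200, 25],
})

# Index the Population values by Region, then select by label and sum.
population_by_region = data.set_index("Region")["Population"]
total = population_by_region.loc["Wellington"].sum()

# A label that is not in the index raises KeyError.
try:
    population_by_region.loc["Auckland"]
    missing_raises = False
except KeyError:
    missing_raises = True

# The boolean-mask version selects zero rows instead, and the sum is 0.
missing_total = data.loc[data["Region"] == "Auckland", "Population"].sum()
print(total, missing_raises, missing_total)
```

Which behavior is preferable depends on whether a missing region should be treated as an error or as a zero total.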