Please help, I have a dataset in which I have columns Country, Gas and Year from 2019 to 1991. Also attaching the snapshot of the dataset. I want to answer a question that I want to add all the values of a country column wise? For example, for Afghanistan, value should come 56.4 under 2019 (adding 28.79 6.23 16.37 5.01 = 56.4). Now I want it should calculate the result for every year. I have used below code for achieving 2019 data.
df.groupby(by='Country')['2019'].sum()
This is the output of that code:
Country
---------------------
Afghanistan 56.40
Albania 17.31
Algeria 558.67
Andorra 1.18
Angola 256.10
...
Venezuela 588.72
Vietnam 868.40
Yemen 50.05
Zambia 182.08
Zimbabwe 235.06
I have group the data country wise and adding the 2019 column values, but how should I add values of other years in single line of code?
Please help.
I can do the code shown here, to add rows and show multiple columns like this but this will be tedious task to do so write each column name.
df.groupby(by='Country')[['2019','2018','2017']].sum()
CodePudding user response:
If you don't specify the column, it will sum all the numeric column.
df.groupby(by='Country').sum()
2019 2020 ...
Country
Afghanistan 56.40 32.4 ...
Albania 17.31 12.5 ...
Algeria 558.67 241.5 ...
Andorra 1.18 1.5 ...
Angola 256.10 32.1 ...
... ... ...
Venezuela 588.72 247.3 ...
Vietnam 868.40 323.5 ...
Yemen 50.05 55.7 ...
Zambia 182.08 23.4 ...
Zimbabwe 235.06 199.4 ...
Do a reset_index()
to flatten the columns
df.groupby(by='Country').sum().reset_index()
Country 2020 2019 ...
Afghanistan 56.40 32.4 ...
Albania 17.31 12.5 ...
Algeria 558.67 241.5 ...
Andorra 1.18 1.5 ...
Angola 256.10 32.1 ...
... ... ...
Venezuela 588.72 247.3 ...
Vietnam 868.40 323.5 ...
Yemen 50.05 55.7 ...
Zambia 182.08 23.4 ...
Zimbabwe 235.06 199.4 ...
CodePudding user response:
You can select columns keys in your dataframe starting from column 2019
till the last column key in this way:
df.groupby(by='Country')[df.keys()[2:]].sum()
Method df.keys
will return all dataframe columns keys in a list then you can slice it from the index of 2019
key which is 2
till end of columns keys.