Home > Net >  Iterate over specific rows, sum results and store in new row
Iterate over specific rows, sum results and store in new row

Time:04-01

I have a DataFrame in which I have already defined rows to be summed up and store the results in a new row.

For example in Year 1990:

Category A B C D Year
E 147 78 476 531 1990
F 914 356 337 781 1990
G 117 874 15 69 1990
H 45 682 247 65 1990
I 20 255 465 19 1990

Here, the rows G - H should be summed up and the results stored in a new row. The same categories repeat every year from 1990 - 2019

I have already tried it with .iloc e.g. [4:8], [50:54] [96:100] and so on, but with iloc I can not specify multiple index. I can't manage to make a loop over the single years. Is there a way to sum the values in categories (G-H) for each year (1990 -2019)?

CodePudding user response:

If you are only interested in G/H, you can slice with isin combined with boolean indexing, then sum:

df[df['Category'].isin(['G', 'H'])].sum()

output:

Category      GH
A            162
B           1556
C            262
D            134
Year        3980
dtype: object

NB. note here the side effect of sum that combines the two "G"/"H" strings into one "GH".

Or, better, set Category as index and slice with loc:

df.set_index('Category').loc[['G', 'H']].sum()

output:

A        162
B       1556
C        262
D        134
Year    3980
dtype: int64

CodePudding user response:

I'm not sure the multiple index what you mean.

It usually appear after some group and aggregate function.

At your table, it looks just multiple column

So, if I understand correctly.

Here a complete code to show how to use the multiple condition of DataFrame

import io

import pandas as pd

data = """Category  A   B   C   D   Year
E   147 78  476 531 1990
F   914 356 337 781 1990
G   117 874 15  69  1990
H   45  682 247 65  1990
I   20  255 465 19  1990"""

table = pd.read_csv(io.StringIO(data), delimiter="\t")

years = table["Year"].unique()
for year in years:
    row = table[((table["Category"] == "G") | (table["Category"] == "H")) & (table["Year"] == year)]
    row = row[["A", "B", "C", "D"]].sum()
    row["Category"], row["Year"] = "sum", year
    table = table.append(row, ignore_index=True)

  • Related