Home > Back-end >  Sum specific number of columns for each row with Pandas
Sum specific number of columns for each row with Pandas

Time:03-15

I have the dollowing dataframe:

   name  code     1   2  3  4  5  6  7 .........155    days
0  Lari  EH214    0   5  2  1  0  0  0           0      3
1  Suzi  FK362    0   0  0  0  2  3  0           0      108
2  Jil   LM121    0   0  4  2  1  0  0           0      5
...

I want to sum the column between column 1 to column with the number that appears on "days" , for example,
for row 1, I will sum 3 days-> 0 5 2
For row 2 108 days,
for row 3 5 days->0 4 2 1 0

How can I do something like this? Looking for method.

CodePudding user response:

For vectorized solution filter rows by positions first and get mask by compare days in numpy boroadasting, if not match replace 0 in DataFrame.where and last sum:

df1 = df.iloc[:, 2:-1]

m = df1.columns.astype(int).to_numpy() <= df['days'].to_numpy()[:, None]

df['sum'] = df1.where(m, 0).sum(axis=1)
print (df)
   name   code  1  2  3  4  5  6  7  155  days  sum
0  Lari  EH214  0  5  2  1  0  0  0    0     3    7
1  Suzi  FK362  0  0  0  0  2  3  0    0   108    5
2   Jil  LM121  0  0  4  2  1  0  0    0     5    7

CodePudding user response:

IIUC, use:

df['sum'] = df.apply(lambda r: r.loc[1: r['days']].sum(), axis=1)

or, if the column names are strings:

df['sum'] = df.apply(lambda r: r.loc['1': str(r['days'])].sum(), axis=1)

output:

   name   code  1  2  3  4  5  6  7  155  days  sum
0  Lari  EH214  0  5  2  1  0  0  0    0     3    7
1  Suzi  FK362  0  0  0  0  2  3  0    0   108    5
2   Jil  LM121  0  0  4  2  1  0  0    0     5    7
  • Related