I have the dollowing dataframe:
name code 1 2 3 4 5 6 7 .........155 days
0 Lari EH214 0 5 2 1 0 0 0 0 3
1 Suzi FK362 0 0 0 0 2 3 0 0 108
2 Jil LM121 0 0 4 2 1 0 0 0 5
...
I want to sum the column between column 1 to column with the number that appears on "days" , for example,
for row 1, I will sum 3 days-> 0 5 2
For row 2 108 days,
for row 3 5 days->0 4 2 1 0
How can I do something like this? Looking for method.
CodePudding user response:
For vectorized solution filter rows by positions first and get mask by compare days
in numpy boroadasting, if not match replace 0
in DataFrame.where
and last sum
:
df1 = df.iloc[:, 2:-1]
m = df1.columns.astype(int).to_numpy() <= df['days'].to_numpy()[:, None]
df['sum'] = df1.where(m, 0).sum(axis=1)
print (df)
name code 1 2 3 4 5 6 7 155 days sum
0 Lari EH214 0 5 2 1 0 0 0 0 3 7
1 Suzi FK362 0 0 0 0 2 3 0 0 108 5
2 Jil LM121 0 0 4 2 1 0 0 0 5 7
CodePudding user response:
IIUC, use:
df['sum'] = df.apply(lambda r: r.loc[1: r['days']].sum(), axis=1)
or, if the column names are strings:
df['sum'] = df.apply(lambda r: r.loc['1': str(r['days'])].sum(), axis=1)
output:
name code 1 2 3 4 5 6 7 155 days sum
0 Lari EH214 0 5 2 1 0 0 0 0 3 7
1 Suzi FK362 0 0 0 0 2 3 0 0 108 5
2 Jil LM121 0 0 4 2 1 0 0 0 5 7