I want to loop through each element of a pandas dataframe row such that only that element is stressed (ie: it's multiplied by 10%) while the other elements of the row are kept equal.
I'm planning to use this for sensitivity analysis.
Example:
df = pd.DataFrame({'AGE':[5,10],'POP':[100,200]})
AGE | POP |
---|---|
5 | 100 |
10 | 200 |
Final desired output:
AGE | POP |
---|---|
5 | 100 |
10 | 200 |
5*1.1 | 100 |
5 | 100*1.1 |
10*1.1 | 200 |
10 | 200*1.1 |
CodePudding user response:
If you have 2 columns, you can multiply with the [1, stress] and its reverse those columns, concatanate them while sorting to preserve multiplied column order. Lastly, prepend the original frame as well:
stress = 1.1
factor = [stress, 1]
pd.concat([df,
pd.concat([df.mul(factor),
df.mul(factor[::-1])]).sort_index()
], ignore_index=True)
AGE POP
0 5.0 100.0
1 10.0 200.0
2 5.5 100.0
3 5.0 110.0
4 11.0 200.0
5 10.0 220.0
Generalizing to N columns could be via a comprehension:
def gen_factors(stress, N):
for j in range(N):
# make all-1s list, except j'th is `stress`
f = [1] * N
f[j] = stress
yield f
stress = 1.1
N = len(df.columns)
pd.concat([df,
pd.concat(df.mul(factor)
for factor in gen_factors(stress, N)).sort_index()
], ignore_index=True)
Example run for a 3-column frame:
>>> df
AGE POP OTHER
0 5 100 7
1 10 200 8
>>> # output of above:
AGE POP OTHER
0 5.0 100.0 7.0
1 10.0 200.0 8.0
2 5.5 100.0 7.0
3 5.0 110.0 7.0
4 5.0 100.0 7.7
5 11.0 200.0 8.0
6 10.0 220.0 8.0
7 10.0 200.0 8.8
CodePudding user response:
You can use a cross merge
and concat
:
pd.concat([df,
(df.merge(pd.Series([1.1, 1], name='factor'), how='cross')
.pipe(lambda d: d.mul(d.pop('factor'), axis=0))
)], ignore_index=True)
Output:
AGE POP
0 5.0 100.0
1 10.0 200.0
2 5.5 110.0
3 5.0 100.0
4 11.0 220.0
5 10.0 200.0