Home > Blockchain >  Pandas add row for each cell where function is applied to that specific cell while other element of
Pandas add row for each cell where function is applied to that specific cell while other element of

Time:01-23

I want to loop through each element of a pandas dataframe row such that only that element is stressed (ie: it's multiplied by 10%) while the other elements of the row are kept equal.

I'm planning to use this for sensitivity analysis.

Example: df = pd.DataFrame({'AGE':[5,10],'POP':[100,200]})

AGE POP
5 100
10 200

Final desired output:

AGE POP
5 100
10 200
5*1.1 100
5 100*1.1
10*1.1 200
10 200*1.1

CodePudding user response:

If you have 2 columns, you can multiply with the [1, stress] and its reverse those columns, concatanate them while sorting to preserve multiplied column order. Lastly, prepend the original frame as well:

stress = 1.1
factor = [stress, 1]
pd.concat([df,
           pd.concat([df.mul(factor),
                      df.mul(factor[::-1])]).sort_index()
          ], ignore_index=True)

    AGE    POP
0   5.0  100.0
1  10.0  200.0
2   5.5  100.0
3   5.0  110.0
4  11.0  200.0
5  10.0  220.0

Generalizing to N columns could be via a comprehension:

def gen_factors(stress, N):
    for j in range(N):
        # make all-1s list, except j'th is `stress`
        f = [1] * N
        f[j] = stress
        yield f

stress = 1.1
N = len(df.columns)
pd.concat([df,
           pd.concat(df.mul(factor)
                     for factor in gen_factors(stress, N)).sort_index()
          ], ignore_index=True)

Example run for a 3-column frame:

>>> df

   AGE  POP  OTHER
0    5  100      7
1   10  200      8

>>> # output of above:

    AGE    POP  OTHER
0   5.0  100.0    7.0
1  10.0  200.0    8.0
2   5.5  100.0    7.0
3   5.0  110.0    7.0
4   5.0  100.0    7.7
5  11.0  200.0    8.0
6  10.0  220.0    8.0
7  10.0  200.0    8.8

CodePudding user response:

You can use a cross merge and concat:

pd.concat([df, 
           (df.merge(pd.Series([1.1, 1], name='factor'), how='cross')
              .pipe(lambda d: d.mul(d.pop('factor'), axis=0))
            )], ignore_index=True)

Output:

    AGE    POP
0   5.0  100.0
1  10.0  200.0
2   5.5  110.0
3   5.0  100.0
4  11.0  220.0
5  10.0  200.0
  • Related