Home > Software design >  Pandas groupby sum based on conditions of other columns
Pandas groupby sum based on conditions of other columns

Time:09-29

I have the following dataframe:

df = pd.DataFrame({'Position':['Entry','Partial','Partial','Partial','Entry','Partial','Partial','Entry','Partial'],
               'Symbol':['AA','AA','AA','AA','BB','BB','BB','CC','CC'],
               'Action':['Sell','Buy','Buy','Buy','Buy','Sell','Sell','Sell','Buy'],
               'Quantity':['4','2','1','1','2','1','1','1','1'],
               'Price':['2.1','1.5','2.2','1','4.6','5.1','4.5','1','1.1']})

enter image description here

I am attempting to calculate profit/loss in another column for each unique symbol. This value is dependent on which side the entry was on, in order to determine if profit will equal sell - buy value, or buy - sell value. I am having difficulty figuring out how to sum these values up to the next symbol/entry indicator.

My attempts have included using the numpy.where function, as well as the dataframe.assign function. Could someone help me understand how I can go about accomplishing these conditional based summation problems?

I'm attempting an output as follows, where I carry over the unique symbol strings with their relative profit/loss output.

Expected Output

CodePudding user response:

The exact expected output is unclear, but assuming you want a new column summing the Sell-Buy per Symbol:

# ensure we have numeric values and not strings
df[['Quantity', 'Price']] = df[['Quantity', 'Price']].apply(pd.to_numeric)

df['profit/loss'] = (df['Action']
 .map({'Buy': -1, 'Sell': 1})    # make Buy -1 and Sell 1
 .mul(df['Price'])               # make Buy negative values
 .mul(df['Quantity'])            # multiply by Quantity
 .groupby(df['Symbol'])
 .transform('sum')               # sum per group as new column
)

output:

  Position Symbol Action  Quantity  Price  profit/loss
0    Entry     AA   Sell         4    2.1          2.2
1  Partial     AA    Buy         2    1.5          2.2
2  Partial     AA    Buy         1    2.2          2.2
3  Partial     AA    Buy         1    1.0          2.2
4    Entry     BB    Buy         2    4.6          0.4
5  Partial     BB   Sell         1    5.1          0.4
6  Partial     BB   Sell         1    4.5          0.4
7    Entry     CC   Sell         1    1.0         -0.1
8  Partial     CC    Buy         1    1.1         -0.1

If you don't want a new column but rather a summary:

(df['Action']
 .map({'Buy': -1, 'Sell': 1})
 .mul(df['Price']).mul(df['Quantity'])
 .groupby(df['Symbol']).sum()
)

output:

Symbol
AA    2.2
BB    0.4
CC   -0.1
dtype: float64
  • Related