Home > Software design >  how to sum values in a new column, based on conditions and occurrence of the same value in col (valu
how to sum values in a new column, based on conditions and occurrence of the same value in col (valu

Time:04-17

I'm trying to find a way to update values in a new column having written a piece of code that in every step (row by row) displays the sum of buy/sell orders with the best price.

stock_buy_sell = {
    "Id":[1, 2, 3, 4, 3, 5],
    "Order":["Buy", "Sell", "Buy", "Buy", "Buy", "Sell"],
    "Type":["Add", "Add", "Add", "Add", "Remove", "Add"],
    "Price":[21.0, 25.0, 23.0, 23.0, 23.0, 28],
    "Quantity":[100, 200, 50, 70, 50, 100]}

    Id  Order   Type    Price   Quantity
0   1   Buy     Add     21.0    100
1   2   Sell    Add     25.0    200
2   3   Buy     Add     23.0    50
3   4   Buy     Add     23.0    70
4   3   Buy     Remove  23.0    50
5   5   Sell    Add     28.0    100

Because of updates that may occur for a particular id, I need to find a way to use this factor to proper populate new columns: Sum of income and Stock quantity.


    Id  Order   Type    Price   Quantity    Sum Of Income   Stock Quantity  Total Profit
0   1   Buy     Add     21.0    100               0                 0                  0
1   2   Sell    Add     25.0    200               0                 0                  0
2   3   Buy     Add     23.0    50                0                 0                  0
3   4   Buy     Add     23.0    70                0                 0                  0
4   3   Buy     Remove  23.0    50                0                 0                  0
5   5   Sell    Add     28.0    100               0                 0                  0

In this simple example, besides the fact that I need to compute Sum of income and Stock quantity based on buy/sell actions according to previous rows (row after row), the problem occurs in the 4th row where id: 3 should be based on the id in 2nd row. In other words, to properly populate Sum of income and Stock quantity I need to find a way to subtract price and quantity values based on a function that will be triggered when some id existed previously.

I try to find a way to do it with df.apply(), pd.series.apply(). I, also, looked at the possibility of implementing the pd.shift method. But, I can't figure out how to build the logic and with what method.

Expected output (I count it manually):

 Id  Order   Type Price Quantity Sum of Income Stock Quantity Total Profit
1 1   Buy     Add    21      100           -21            100        -2100
2 2  Sell     Add    25      200             4           -100         5000
3 3   Buy     Add    23       50           -19            -50        -1150
4 4   Buy     Add    23       70           -42             20        -1610
5 3   Buy  Remove    23       50           -19            -30         1150
6 5  Sell     Add    28      100             9           -130         2800

====================================================================

The following part of my post is not directly relevant to the question, so it may be omitted by those answering. The following part is a solution to the problem for a situation in which we get input as subsequent dictionary-type objects and - right away - we can build a complete database (the same as in the question).

In other words, at the beginning I have no data, the shareholder performs a buy / sell action (first step), eg

apples_dct1 = {1: [" Buy "," Add ", 20.0, 100]}.

Then comes the next step:

apples_dct2 = {2: ["Sell", "Add", 25.0, 200]}

ect.

import pandas as pd

apples_dct1 = {1:["Buy", "Add", 21.0, 100]}
apples_dct2 = {2:["Sell", "Add", 25.0, 200]}
apples_dct3 = {3:["Buy", "Add", 23.0, 50]}
apples_dct4 = {4:["Buy", "Add", 23.0, 70]}
apples_dct5 = {3:["Buy", "Remove", 23.0, 50]}
apples_dct6 = {5:["Sell", "Add", 28.0, 100]}

engine_dict = {}
def magic_engine(dict_apples):
    """
    creating objects from dict_apples:
    """
    dict_key = list(dict_apples.keys())[0]
    order = dict_apples[dict_key][0]
    type_buy_sell = dict_apples[dict_key][1]
    price = dict_apples[dict_key][2]
    quantity = dict_apples[dict_key][3]
#     print(dict_key)
#     print("dict_key[1] ", dict_apples[dict_key][1]) # test
    """
    First instance of data in a new dict `engine_dict`:
    """
    if (bool(engine_dict) == False and 
        dict_apples[dict_key][1] == "Add" and 
        dict_apples[dict_key][0] == "Buy"):
            
            sum_of_income_extend = -price
            stock_quantity_extended = quantity
            profit_extended = -(price * quantity)
            base_list = [
                order,
                type_buy_sell,
                price,
                quantity,
                sum_of_income_extend,
                stock_quantity_extended,
                profit_extended
            ]
    #         print("base_list ", base_list)
            engine_dict[dict_key] = base_list
    #         print(engine_dict) # Test
            return engine_dict
    
    elif (bool(engine_dict) == False and 
        dict_apples[dict_key][1] == "Add" and 
        dict_apples[dict_key][0] == "Sell"):
            
            sum_of_income_extend = price
            stock_quantity_extended = quantity
            profit_extended = price * quantity
            base_list = [
                order, type_buy_sell,
                price,
                quantity,
                sum_of_income_extend,
                stock_quantity_extended,
                profit_extended
            ]
    #         print("base_list ", base_list)
            engine_dict[dict_key] = base_list
    #         print(engine_dict) # Test
            return engine_dict
    
    """
    Adding new key-value pairs to `engine_dict` 
    where 
    `update_sum_of_income_extend`,
    `stock_quantity_extend`,
    `profit_extended`
    are based on the previous `engine_dict` key. 
    With that, we can update the income, 
    stock quantity and total profit for stock holder.
    """
    if (bool(engine_dict) == True and 
        dict_apples[dict_key][1] == "Add" and 
        dict_apples[dict_key][0] == "Buy"):
        
        update_sum_of_income_extend = (
            engine_dict[list(engine_dict.keys())[-1]][4] - (price)
        )
        stock_quantity_extend = (
        engine_dict[list(engine_dict.keys())[-1]][5]   quantity
        )
        profit_extended = -(price * quantity)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
#         print("base_list ", base_list)
        engine_dict[dict_key] = base_list
#         print(engine_dict) # Test
        return engine_dict

    elif (bool(engine_dict) == True and
          dict_apples[dict_key][1] == "Add" and 
          dict_apples[dict_key][0] == "Sell"):
        
        update_sum_of_income_extend = (
                engine_dict[list(engine_dict.keys())[-1]][4]   (price)
            )
        stock_quantity_extend = (
            engine_dict[list(engine_dict.keys())[-1]][5] - quantity
        )
        profit_extended = price * quantity
#         print("engine_dict[list(engine_dict.keys())[-1]][2] ", engine_dict[list(engine_dict.keys())[-1]][2])
#         print("price ", price)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
        engine_dict[dict_key] = base_list
        return engine_dict
    
    elif (bool(engine_dict) == True and
          dict_apples[dict_key][1] == "Remove" and 
          dict_apples[dict_key][0] == "Buy"):

        
        update_sum_of_income_extend = (
            engine_dict[list(engine_dict.keys())[-1]][4]   (price)
            )
        stock_quantity_extend = (
            engine_dict[list(engine_dict.keys())[-1]][5] - quantity
            )
        profit_extended = price * quantity
#         print("engine_dict[list(engine_dict.keys())[-1]][2] ", engine_dict[list(engine_dict.keys())[-1]][2])
#         print("price ", price)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
        """
        Because a dictionary can have just unique keys, for "removing action"
        I create a new key build: key   instance number of action.
        With that, it will be easy to find all removing actions (they will be floats)
        If there would be more "removing action" instances, then I will have for example:
        main key 3
        first "removing action" with key 3.1
        second "removing action" with key 3.2
        third "removing action" with key 3.3
        ect.
        """
        for i in list(engine_dict.keys())[:]:
            if i == dict_key:
                dict_key = dict_key   0.1
                engine_dict[dict_key] = base_list
        
        return engine_dict
    
"""
Below I have all the steps taken by the shareholder
"""    
magic_engine(apples_dct1)
magic_engine(apples_dct2)
magic_engine(apples_dct3)
magic_engine(apples_dct4)
magic_engine(apples_dct5)
magic_engine(apples_dct6)

"""
Based on a dictionary that includes all shareholder activities, 
I am building a dataframe in Pandas:
"""

df_col = [
    'Order',
    'Type',
    'Price',
    'Quantity',
    'Sum of income',
    'Stock quantity',
    'total profit'
]

new_table_buy_sell = pd.DataFrame(engine_dict)
final_table = new_table_buy_sell.transpose()
final_table.set_index([pd.Index([1,2,3,4,5,6]), list(engine_dict.keys())], inplace=True)
final_table.columns = df_col
final_table.columns = final_table.columns.rename("id")
final_table

Output:

  Id    Order  Type Price Quantity Sum Of Income Stock Quantity Total Profit
1 1.0   Buy     Add    21      100           -21            100        -2100
2 2.0  Sell     Add    25      200             4           -100         5000
3 3.0   Buy     Add    23       50           -19            -50        -1150
4 4.0   Buy     Add    23       70           -42             20        -1610
5 3.1   Buy  Remove    23       50           -19            -30         1150
6 5.0  Sell     Add    28      100             9           -130         2800

CodePudding user response:

We could use a mapping dictionary to use "Order" and "Type" as calculating the cumulative price and quantity (which we calculate using cumsum). Finally, assign the "Total" column by multiplying the "Quantity" by the cumulative price (which is renamed "Sum of income"):

order_mapping = {'Buy': 1, 'Sell': -1}
type_mapping = {'Add': 1, 'Remove': -1}

df = (df.join(df[['Price','Quantity']]
             .mul(df['Order'].map(order_mapping) * df['Type'].map(type_mapping), axis=0)
             .assign(Price=lambda x: -x['Price'])
             .cumsum()
             .rename(columns={'Price':'Sum of income', 'Quantity':'Stock quantity'}))
      .assign(Total=lambda x: x['Quantity']*x['Price']))

Output:

   Id Order    Type  Price  Quantity  Sum of income  Stock quantity   Total
0   1   Buy     Add   21.0       100          -21.0             100  -2100.0
1   2  Sell     Add   25.0       200            4.0            -100   5000.0
2   3   Buy     Add   23.0        50          -19.0             -50   1150.0
3   4   Buy     Add   23.0        70          -42.0              20   1610.0
4   3   Buy  Remove   23.0        50          -19.0             -30   1150.0
5   5  Sell     Add   28.0       100            9.0            -130   2800.0

The general idea is that we want to use the "Order" column to determine whether we want to add or subtract values as we find the cumulative sums of "Price" and "Quantity". That's what we're doing with map mul. Then after we find the cumulative sums of these columns (note that cumulative sum works on a particular column), we find the total by multiplying two columns (this uses two columns).

  • Related