Home > Back-end >  How to calculate the average of the newly added row
How to calculate the average of the newly added row

Time:12-17

import numpy as np
import pandas as pd
import os
import xlsxwriter

original_data = pd.read_excel('purchase_record.xlsx')
original_data.head()
df = pd.DataFrame(original_data)
filter_data = df.set_index(['item', 'Region', 'City Code', 'Area', 'City Name',
                        'price']).apply(lambda x: x.str.split('/').explode()).reset_index()

now my first row has shop name that is marketA/marketB with the same price 100

i want to make it to two rows and the price divided by the number of market in shop name

   Shop Name      item    Region  City Code   Area  City Name  price
MarketA/MarketB   Apple     A       10201     East      B        100
    MarketC       Grape     B       20010     West      C         90
MarketD/MarketE   Orange    C       30132     North     D         84
  1. MarketA Apple A 10201 East B 50
  2. MarketB Apple A 10201 East B 50
  3. MarketC Grape B 20010 West C 90
  4. MarketD Orange C 30132 North D 42
  5. MarketE Orange C 30132 North D 42

CodePudding user response:

Basing on numpy.where (check condition for "jointed" shops) and pandas.DataFrame.explode method:

cond = df['Shop Name'].str.contains('/')
df['price'] = np.where(cond, df['price'] / 2, df['price'])
df['Shop Name'] = np.where(cond, df['Shop Name'].str.split('/'), df['Shop Name'])
df = df.explode('Shop Name')
print(df)

The output:

 Shop Name    item Region  City Code   Area City Name  price
0   MarketA   Apple      A      10201   East         B   50.0
0   MarketB   Apple      A      10201   East         B   50.0
1   MarketC   Grape      B      20010   West         C   90.0
2   MarketD  Orange      C      30132  North         D   42.0
2   MarketE  Orange      C      30132  North         D   42.0
  • Related