I have two tables: 1) "first_day", which contains date, location, barcode (sku) and inventory (quantity). A location is a store and a sku is an item. There are no duplicates among location-sku combinations, and the date column holds only one date, 01/01/2020. 2) "transactions", which contains date, location, sku, transaction type and quantity. Dates run from 01/02/2020 onwards. Transaction types can be "in", "out" and "sale", which I converted to 1, -1 and -1 respectively and then multiplied by the quantity column to create a "total" column.
first_day table
index | date | location | sku | inventory |
---|---|---|---|---|
0 | 01/01/20 | 02 | 2518 | 0 |
1 | 01/01/20 | 105 | 425 | 1 |
2 | 01/01/20 | 02 | 667 | 1 |
transaction table
index | date | location | sku | total |
---|---|---|---|---|
0 | 01/02/20 | 02 | 2518 | 2 |
1 | 01/03/20 | 02 | 2518 | -2 |
2 | 01/04/20 | 02 | 2518 | 8 |
3 | 01/05/20 | 02 | 2518 | -3 |
4 | 01/02/20 | 02 | 667 | 12 |
5 | 01/03/20 | 02 | 667 | 4 |
6 | 01/05/20 | 02 | 667 | -2 |
7 | 01/06/20 | 02 | 667 | 3 |
I use the first_day table for the unique locations and unique skus, and for the initial quantity of each item at each location. My goal is to iterate over the transactions table and add or subtract each quantity to/from the initial quantity from the first_day table. Expected result table, for example:
index | location | sku | date | quantity |
---|---|---|---|---|
0 | 02 | 2518 | 01/02/20 | 2 # 2 |
1 | 02 | 2518 | 01/03/20 | 0 # -2 |
2 | 02 | 2518 | 01/04/20 | 8 # 8 |
3 | 02 | 2518 | 01/05/20 | 5 # -3 |
4 | 02 | 667 | 01/02/20 | 13 # 12 |
5 | 02 | 667 | 01/03/20 | 17 # 4 |
6 | 02 | 667 | 01/05/20 | 15 # -2 |
7 | 02 | 667 | 01/06/20 | 18 # 3 |
My code is not working well: I only get one line, containing the first location from locations, the first sku from skus, a quantity of zero and the date 02/01/2020.
import numpy as np
import pandas as pd

first_day = pd.read_csv(r"C:\Users\Inventory_First_day.csv")
master = pd.read_csv(r"C:\Users\Master_skus.csv")
transactions = pd.read_csv(r"C:\Users\Transactions.csv")
transactions['Transaction Type'] = transactions['Transaction Type'].replace({'SALE': -1, 'OUT': -1, 'IN': 1})
transactions['total'] = transactions['Transaction Type'] * transactions['Quantity']
locations = np.unique(first_day['Stock Location Name'])
skus = np.unique(first_day['SKU'])
def skulocation(first_day, transactions, loc, sk):
    for location in locations:
        for sku in skus:
            data = first_day[first_day['Stock Location Name'] == location]  # filtering for location
            data = data[data['SKU'] == sku]  # filtering for sku
            if data.empty:  # some combination does not exist in first_day
                data.loc[0, 'Date'] = first_day['Date'].iloc[0]
                data.loc[0, 'Stock Location Name'] = location
                data.loc[0, 'SKU'] = skus[0]
                data.loc[0, 'Inventory'] = 0
            data2 = transactions[transactions['Stock Location Name'] == location]  # filtering for location
            data2 = data2[data2['SKU'] == sku][['Date', 'Stock Location Name', 'SKU', 'total']]  # filtering for sku
            if data2.empty:  # some combination does not exist in transactions
                data2.loc[0, 'Date'] = transactions['Date'].iloc[0]
                data2.loc[0, 'Stock Location Name'] = location
                data2.loc[0, 'SKU'] = sku
                data2.loc[0, 'total'] = 0
            new_frame = pd.DataFrame(columns=['location', 'sku', 'date', 'quantity'])  # create new dataframe
            new_frame.loc[0, 'quantity'] = data['Inventory'].iloc[0]  # initializing with value from first_day
            for i in range(len(data2)):
                new_frame.loc[i, 'location'] = data2['Stock Location Name'].iloc[i]
                new_frame.loc[i, 'sku'] = data2['SKU'].iloc[i]
                new_frame.loc[i, 'date'] = data2['Date'].iloc[i]
                new_frame.loc[i, 'quantity'] = new_frame['quantity'].iloc[i] + data2['total'].iloc[i]  # add/subtract quantity
    return new_frame
new_frame = skulocation(first_day, transactions, locations, skus)
Thank you guys
P.S. I edited the result table a little. From the first_day table I only need the quantity of each item at each location. For example, I have a chair (item 1) in the following locations: A, B, D, J, Q. Each location has a different amount of this chair. Then I go to the transactions table and, for each location, follow how the amount of this chair changes over time.
Let's say my starting point is -5, and I have an array of values where each position represents a date: [3, 1, -6, 8]. On the first day I have -5 + 3 = -2, then -2 + 1 = -1, then -1 - 6 = -7, then -7 + 8 = 1, giving [-2, -1, -7, 1].
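The running total described above is just a cumulative sum offset by the starting quantity. As a sanity check of the expected numbers only (a sketch of the arithmetic, not the full per-location logic):

```python
import numpy as np

start = -5                             # initial quantity from first_day
changes = np.array([3, 1, -6, 8])      # daily totals from transactions
running = start + np.cumsum(changes)   # running inventory per date
print(running)  # [-2 -1 -7  1]
```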
CodePudding user response:
Is this what you're looking for? If not, please add more data. I added some data to the transaction table based on what I understood your ask to be.
import io
import numpy as np
import pandas as pd
data1='''index date location sku inventory(quantity)
0 01/01/20 02 2518 0
1 01/01/20 105 425 1'''
df_first_day = pd.read_csv(io.StringIO(data1), sep='\s+', engine='python')
df_first_day
data2='''
index date location sku transaction type quantity
0 01/02/20 12 2518 sale 14
1 01/02/20 145 4579 in 5
2 01/04/20 02 2518 out 1
3 01/05/20 02 2518 in 20
4 01/06/20 02 2518 in 1
5 01/08/20 02 2518 sale 5
6 01/10/20 02 2518 out 1
7 01/12/20 02 2518 sale 3
'''
df_trx = pd.read_csv(io.StringIO(data2), sep='\s+', engine='python')
df_trx['inventory(quantity)'] = np.where(df_trx['transaction type']=='in', df_trx['quantity'], -df_trx['quantity'])
df_trx
df_adj = pd.concat([df_first_day, df_trx[['index', 'date', 'location', 'sku', 'inventory(quantity)']]]).sort_values(['sku', 'location', 'date'])
df_adj['quantity'] = df_adj.groupby(['sku', 'location'])['inventory(quantity)'].cumsum()
df_adj
index date location sku inventory(quantity) quantity
1 1 01/01/20 105 425 1 1
0 0 01/01/20 2 2518 0 0
2 2 01/04/20 2 2518 -1 -1
3 3 01/05/20 2 2518 20 19
4 4 01/06/20 2 2518 1 20
5 5 01/08/20 2 2518 -5 15
6 6 01/10/20 2 2518 -1 14
7 7 01/12/20 2 2518 -3 11
0 0 01/02/20 12 2518 -14 -14
1 1 01/02/20 145 4579 5 5
Extra: If you are looking for just the current inventory, you can do something like this:
df_current_inv = df_adj.groupby(['sku', 'location']).last().reset_index()
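For example, on a small made-up frame (data and column names invented for illustration), sorting by date first is what guarantees that .last() picks the latest row per group:

```python
import pandas as pd

df = pd.DataFrame({
    'sku': [2518, 2518, 667],
    'location': [2, 2, 2],
    'date': ['01/04/20', '01/05/20', '01/02/20'],
    'quantity': [19, 20, 13],
})
# .last() takes the final row per group, so rows must already be
# sorted by date within each (sku, location) group
current = (df.sort_values(['sku', 'location', 'date'])
             .groupby(['sku', 'location'])
             .last()
             .reset_index())
print(current)
```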
CodePudding user response:
I think the problem was specifically the updating of the new_frame variable:
def skulocation(df, df2, loc, sk, new_frame=None, **kwargs):
    for location in locations:
        for sku in skus:
            data = df[df['Stock Location Name'] == location]  # filtering for location
            data = data[data['SKU'] == sku]  # filtering for sku
            if data.empty:  # some combination does not exist in first_day
                data.loc[0, 'Date'] = df['Date'].iloc[0]
                data.loc[0, 'Stock Location Name'] = location
                data.loc[0, 'SKU'] = skus[0]
                data.loc[0, 'Inventory'] = 0
            data2 = df2[df2['Stock Location Name'] == location]  # filtering for location
            data2 = data2[data2['SKU'] == sku][['Date', 'Stock Location Name', 'SKU', 'total']]  # filtering for sku
            if data2.empty:  # some combination does not exist in transactions
                data2.loc[0, 'Date'] = df2['Date'].iloc[0]
                data2.loc[0, 'Stock Location Name'] = location
                data2.loc[0, 'SKU'] = sku
                data2.loc[0, 'total'] = 0
            column_names = ['location', 'sku', 'date', 'quantity']
            if new_frame is None:
                new_frame = pd.DataFrame({column_names[0]: data2['Stock Location Name'].iloc[0],
                                          column_names[1]: data2['SKU'].iloc[0],
                                          column_names[2]: data2['Date'].iloc[0],
                                          column_names[3]: data['Inventory'].iloc[0]})
            else:
                i = len(data2)
                new_frame = pd.DataFrame({column_names[0]: data2['Stock Location Name'].iloc[0:i],
                                          column_names[1]: data2['SKU'].iloc[0:i],
                                          column_names[2]: data2['Date'].iloc[0:i],
                                          column_names[3]: str(int(new_frame[column_names[3]].iloc[i]) + int(data2['total'].iloc[i]))})
    return new_frame
new_frame = skulocation(first_day, transactions, locations, skus)
CodePudding user response:
A very simple solution is to concatenate your two dataframes, adding a new index level, then apply cumsum for each group and keep only the transaction level:
out = pd.concat([df1.rename(columns={'inventory': 'total'}), df2],
                keys=['inventory', 'transaction'])
out = out.assign(total=out.groupby(['location', 'sku'])['total'].cumsum()).loc['transaction']
Output:
date location sku total
0 01/02/20 2 2518 2
1 01/03/20 2 2518 0
2 01/04/20 2 2518 8
3 01/05/20 2 2518 5
4 01/02/20 2 667 13
5 01/03/20 2 667 17
6 01/05/20 2 667 15
7 01/06/20 2 667 18
Output after pd.concat:
>>> out
date location sku total
inventory 0 01/01/20 2 2518 0 # From 1st dataframe
1 01/01/20 105 425 1
2 01/01/20 2 667 1
transaction 0 01/02/20 2 2518 2 # From 2nd dataframe
1 01/03/20 2 2518 -2
2 01/04/20 2 2518 8
3 01/05/20 2 2518 -3
4 01/02/20 2 667 12
5 01/03/20 2 667 4
6 01/05/20 2 667 -2
7 01/06/20 2 667 3
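For completeness, here is a self-contained version of the same idea, with df1 and df2 built inline from the question's sample tables (column names assumed from those tables):

```python
import io
import pandas as pd

df1 = pd.read_csv(io.StringIO('''date,location,sku,inventory
01/01/20,02,2518,0
01/01/20,105,425,1
01/01/20,02,667,1'''))

df2 = pd.read_csv(io.StringIO('''date,location,sku,total
01/02/20,02,2518,2
01/03/20,02,2518,-2
01/04/20,02,2518,8
01/05/20,02,2518,-3
01/02/20,02,667,12
01/03/20,02,667,4
01/05/20,02,667,-2
01/06/20,02,667,3'''))

# Stack inventory rows on top of transaction rows, tagging each source
out = pd.concat([df1.rename(columns={'inventory': 'total'}), df2],
                keys=['inventory', 'transaction'])
# Running total per (location, sku), then keep only the transaction rows
out = out.assign(total=out.groupby(['location', 'sku'])['total'].cumsum())
out = out.loc['transaction']
print(out)
```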