Given a value on the first of a month, add values to it for other days of the month-CodePudding

I have one dataframe df1 that looks like this. The date format is M/D/Y.

date	a
1/1/2001	100
2/1/2001	101

And another dataframe df2 that represents outflows and looks like this:

date	a
1/1/2001	2
1/2/2001	3
1/3/2001	-2
1/4/2001	0

Desired output df:

date	a
1/1/2001	100
1/2/2001	98
1/3/2001	101
1/4/2001	99

I need to take the df1 values from the first of every month as the starting value. For each subsequent day of the month, I want to add the values from df2 from the starting value of df1 and place the results in df3.

CodePudding user response：

OK, this does what you ask. Your own data is inconsistent, however. Your text says you ADD the number to the inventory; on the first day, you SUBTRACT the number, but you add the rest.

So, given "x1.csv":

date,inventory
1/1/2001,100
2/1/2001,101
3/1/2001,103

and "x2.csv":

date,update
1/1/2001,2
1/2/2001,3
1/3/2001,-2
1/4/2001,0
2/1/2001,5
2/2/2001,3
2/3/2001,-10
3/1/2001,0
3/2/2001,0

This Python code:

import csv
import pandas as pd

f1 = csv.reader( open('x1.csv'))
f2 = csv.reader( open('x2.csv'))

# Read the first file into a dictionary.

inventory = {}
for row in f1:
    if row[0] != 'date':
        inventory[row[0]] = int(row[1])

# Now process each line in the detail list.

current = 0
rows = []
for row in f2:
    if row[0] == 'date':
        continue
    if row[0] in inventory:
        current = inventory[row[0]]
    rows.append( (row[0],str(current)) )
    current  = int(row[1])

df = pd.DataFrame( rows, columns=['date','level'])
print(df)

produces this output:

       date level
0  1/1/2001   100
1  1/2/2001   102
2  1/3/2001   105
3  1/4/2001   103
4  2/1/2001   101
5  2/2/2001   106
6  2/3/2001   109
7  3/1/2001   103
8  3/2/2001   103

CodePudding user response：

The exact logic is still unclear, but for the global logic of merging, you can use merge on the month period:

df2['a']  = df2.merge(df1,
          left_on=pd.to_datetime(df2['date'], dayfirst=False).dt.to_period('M'),
          right_on=pd.to_datetime(df1['date'], dayfirst=False).dt.to_period('M'),
          how='left', suffixes=('_', '')
          )['a']

output:

       date    a
0  1/1/2001  102
1  1/2/2001  103
2  1/3/2001   98
3  1/4/2001  100

output of the intermediate:

df2.merge(df1,
          left_on=pd.to_datetime(df2['date'], dayfirst=False).dt.to_period('M'),
          right_on=pd.to_datetime(df1['date'], dayfirst=False).dt.to_period('M'),
          how='left', suffixes=('_', '')
          )['a']

0    100
1    100
2    100
3    100
Name: a, dtype: int64