Home > front end >  Given a value on the first of a month, add values to it for other days of the month
Given a value on the first of a month, add values to it for other days of the month

Time:04-27

I have one dataframe df1 that looks like this. The date format is M/D/Y.

date a
1/1/2001 100
2/1/2001 101

And another dataframe df2 that represents outflows and looks like this:

date a
1/1/2001 2
1/2/2001 3
1/3/2001 -2
1/4/2001 0

Desired output df:

date a
1/1/2001 100
1/2/2001 98
1/3/2001 101
1/4/2001 99

I need to take the df1 values from the first of every month as the starting value. For each subsequent day of the month, I want to add the values from df2 from the starting value of df1 and place the results in df3.

CodePudding user response:

OK, this does what you ask. Your own data is inconsistent, however. Your text says you ADD the number to the inventory; on the first day, you SUBTRACT the number, but you add the rest.

So, given "x1.csv":

date,inventory
1/1/2001,100
2/1/2001,101
3/1/2001,103

and "x2.csv":

date,update
1/1/2001,2
1/2/2001,3
1/3/2001,-2
1/4/2001,0
2/1/2001,5
2/2/2001,3
2/3/2001,-10
3/1/2001,0
3/2/2001,0

This Python code:

import csv
import pandas as pd

f1 = csv.reader( open('x1.csv'))
f2 = csv.reader( open('x2.csv'))

# Read the first file into a dictionary.

inventory = {}
for row in f1:
    if row[0] != 'date':
        inventory[row[0]] = int(row[1])

# Now process each line in the detail list.

current = 0
rows = []
for row in f2:
    if row[0] == 'date':
        continue
    if row[0] in inventory:
        current = inventory[row[0]]
    rows.append( (row[0],str(current)) )
    current  = int(row[1])

df = pd.DataFrame( rows, columns=['date','level'])
print(df)

produces this output:

       date level
0  1/1/2001   100
1  1/2/2001   102
2  1/3/2001   105
3  1/4/2001   103
4  2/1/2001   101
5  2/2/2001   106
6  2/3/2001   109
7  3/1/2001   103
8  3/2/2001   103

CodePudding user response:

The exact logic is still unclear, but for the global logic of merging, you can use merge on the month period:

df2['a']  = df2.merge(df1,
          left_on=pd.to_datetime(df2['date'], dayfirst=False).dt.to_period('M'),
          right_on=pd.to_datetime(df1['date'], dayfirst=False).dt.to_period('M'),
          how='left', suffixes=('_', '')
          )['a']

output:

       date    a
0  1/1/2001  102
1  1/2/2001  103
2  1/3/2001   98
3  1/4/2001  100

output of the intermediate:

df2.merge(df1,
          left_on=pd.to_datetime(df2['date'], dayfirst=False).dt.to_period('M'),
          right_on=pd.to_datetime(df1['date'], dayfirst=False).dt.to_period('M'),
          how='left', suffixes=('_', '')
          )['a']

0    100
1    100
2    100
3    100
Name: a, dtype: int64
  • Related