Home > Software design >  Pandas multiplication of two dataframes with uneven date rows and label columns
Pandas multiplication of two dataframes with uneven date rows and label columns

Time:09-03

I have the following sample of historical Bitcoin rates in US dollars:

BTC_USD_rates = {
    "Open": {"01/01/2022": 46217.5, "02/01/2022": 47738.7, "03/01/2022": 47293.9, "04/01/2022": 46435.7, "05/01/2022": 45833.1, "06/01/2022": 43431.6, "07/01/2022": 43097.9, "08/01/2022": 41551.3},
    "Low": {"01/01/2022": 46217.5, "02/01/2022": 46718.2, "03/01/2022": 45704.0, "04/01/2022": 45602.1, "05/01/2022": 42535.1, "06/01/2022": 42481.1, "07/01/2022": 40810.0, "08/01/2022": 40574.3},
    "High": {"01/01/2022": 47917.6, "02/01/2022": 47944.9, "03/01/2022": 47556.0, "04/01/2022": 47505.4, "05/01/2022": 47019.4, "06/01/2022": 43772.3, "07/01/2022": 43127.7, "08/01/2022": 42304.4},
    "Close": {"01/01/2022": 47738.0, "02/01/2022": 47311.8, "03/01/2022": 46430.2, "04/01/2022": 45837.3, "05/01/2022": 43425.9, "06/01/2022": 43097.5, "07/01/2022": 41546.7, "08/01/2022": 41672.0},
    "Volume": {"01/01/2022": 31239, "02/01/2022": 27020, "03/01/2022": 41062, "04/01/2022": 55589, "05/01/2022": 83744, "06/01/2022": 63076, "07/01/2022": 88358, "08/01/2022": 52544},
}
df1 = pd.DataFrame.from_dict(BTC_USD_rates)
df1

            Open    Low     High    Close   Volume
01/01/2022  46217.5 46217.5 47917.6 47738.0 31239
02/01/2022  47738.7 46718.2 47944.9 47311.8 27020
03/01/2022  47293.9 45704.0 47556.0 46430.2 41062
04/01/2022  46435.7 45602.1 47505.4 45837.3 55589
05/01/2022  45833.1 42535.1 47019.4 43425.9 83744
06/01/2022  43431.6 42481.1 43772.3 43097.5 63076
07/01/2022  43097.9 40810.0 43127.7 41546.7 88358
08/01/2022  41551.3 40574.3 42304.4 41672.0 52544

And then for the same period I have the following historical New Zealand Dollars to $1 US Dollar rates:

USD_NZD_rates = {
    "Open": {"03/01/2022": 1.465, "04/01/2022": 1.4719, "06/01/2022": 1.4717, "07/01/2022": 1.4819},
    "Low": {"03/01/2022": 1.4583, "04/01/2022": 1.4651, "06/01/2022": 1.4708, "07/01/2022": 1.4733},
    "High": {"03/01/2022": 1.4763, "04/01/2022": 1.4784, "06/01/2022": 1.4854, "07/01/2022": 1.4849},
    "Close": {"03/01/2022": 1.4732, "04/01/2022": 1.4669, "06/01/2022": 1.4817, "07/01/2022": 1.4741},
}
df2 = pd.DataFrame.from_dict(USD_NZD_rates)
df2

            Open    Low     High    Close
03/01/2022  1.4650  1.4583  1.4763  1.4732
04/01/2022  1.4719  1.4651  1.4784  1.4669
06/01/2022  1.4717  1.4708  1.4854  1.4817
07/01/2022  1.4819  1.4733  1.4849  1.4741

What I need to accomplish is convert each date's Open, Low, High and Close BTC rates to NZD's using the USD_NZD Close rate for each respective date.

There are two caveats however, and those are preventing me to get there by just going plain vanilla like df1.multiply(df2["Close"], axis="index"):

  1. Ignore the Volume column in df1.
  2. df2 doesn't bring the USD_NZD Close rates I need for some dates (01/01/2022, 02/01/2022, 05/01/2022 and 08/01/2022), so for such cases I need the method to make sure each of the two situations are dealt with accordingly:
    • For those initial missing dates (01/01/2022 and 02/01/2022) the FIRST available date must be used (03/01/2022) as the USD_NZD Close rate that will be used to convert all 4 BTC_USD rates.
    • In case of any missing date down the dataframe (05/01/2022 and 08/01/2022), the PREVIOUS available date must be used (04/01/2022 and 07/01/2022 respectively) as the USD_NZD Close rate that will be used to convert all 4 BTC_USD rates.

How can I get there considering all those exceptions?!

CodePudding user response:

You can try filling in the rates with bfill and ffill like this:

new_rates = df2.reindex(df1.index.union(df2.index))

# open or close?
new_rates['Open'] =new_rates['Open'].bfill()

# fill missing data with previously available data
new_rates['Close'] = new_rates['Close'].ffill()

new_rates = new_rates.bfill(axis=1).ffill(axis=1)

df1.mul(new_rates, fill_value=1)

Output:

                  Close         High          Low         Open   Volume
01/01/2022  69936.17000  70199.28400  67708.63750  67708.63750  31239.0
02/01/2022  69311.78700  70239.27850  68442.16300  69937.19550  27020.0
03/01/2022  68400.97064  70206.92280  66650.14320  69285.56350  41062.0
04/01/2022  67238.73537  70231.98336  66811.63671  68348.70683  55589.0
05/01/2022  63701.45271  68972.75786  62394.73819  67452.57327  83744.0
06/01/2022  63857.56575  65019.37442  62481.20188  63918.28572  63076.0
07/01/2022  61243.99047  64040.32173  60125.37300  63866.77801  88358.0
08/01/2022  61428.69520  62360.91604  59810.57563  61250.77133  52544.0
  • Related