Error locating Pandas index using Timestamp

Time:12-14

I have a Pandas dataframe that looks something like this:

                       a           b         c  ...            x          y       z
date                                            ...                                
2043-10-01  10230.413086  846.184082  0.267180  ...  2771.997314  20.699804  4000.0
2043-11-01  10229.154297  841.288513  0.267003  ...  2770.365723  20.749172  4000.0
2043-12-01  10231.440430  836.821472  0.266981  ...  2769.230469  20.797396  4000.0
2044-01-01  10237.501953  832.406677  0.267381  ...  2768.310547  20.849573  4000.0
2044-02-01  10233.545898  827.571655  0.266966  ...  2766.528564  20.897126  4000.0
2044-03-01  10235.044922  823.357910  0.266938  ...  2765.628906  20.942534  4000.0
2044-04-01  10243.462891  819.170654  0.267569  ...  2765.451172  20.993223  4000.0
2044-05-01  10236.799805  814.516602  0.266984  ...  2763.450684  21.038358  4000.0
2044-06-01  10240.304688  810.241150  0.266869  ...  2762.673828  21.087164  4000.0
2044-07-01  10259.951172  806.501587  0.267803  ...  2764.588135  21.142576  4000.0

I want to extract the values at dates defined by a Pandas date_range, e.g.:

import pandas as pd
for xdat in pd.date_range(start="2040/01/01", end="2044/07/01", freq="MS"):
    x = df[xdat]['x']

However, I get the error KeyError: Timestamp('2040-01-01 00:00:00'). I have tried converting the Timestamp variable xdat using pd.to_datetime (and variations of this), but so far without success. I'm sure the answer is trivial but I can't see it, so I would appreciate any suggestions. Thanks in advance!

CodePudding user response:

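df[xdat] looks up a column labelled xdat, not a row, which is why it raises KeyError. Convert the "date" column with pd.to_datetime, set it as the index, and select rows by label using loc: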

import pandas as pd

# Sample data: the first three value columns from the question
df = pd.DataFrame(
    data=[
        ["2043-10-01", 10230.413086, 846.184082, 0.267180],
        ["2043-11-01", 10229.154297, 841.288513, 0.267003],
        ["2043-12-01", 10231.440430, 836.821472, 0.266981],
        ["2044-01-01", 10237.501953, 832.406677, 0.267381],
        ["2044-02-01", 10233.545898, 827.571655, 0.266966],
        ["2044-03-01", 10235.044922, 823.357910, 0.266938],
        ["2044-04-01", 10243.462891, 819.170654, 0.267569],
        ["2044-05-01", 10236.799805, 814.516602, 0.266984],
        ["2044-06-01", 10240.304688, 810.241150, 0.266869],
        ["2044-07-01", 10259.951172, 806.501587, 0.267803],
    ],
    columns=["date", "a", "b", "c"],
)
# Parse the date strings and use them as the row index
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# Look up one value per date: loc takes the row label first, then the column
for xdat in pd.date_range(start="2044/01/01", end="2044/07/01", freq="MS"):
    x = df.loc[xdat, "a"]

# Or select all the dates at once by passing the whole date_range to loc
df = df.loc[pd.date_range(start="2044/01/01", end="2044/07/01", freq="MS"), "a"]

                       a  
date
2044-01-01    10237.501953
2044-02-01    10233.545898
2044-03-01    10235.044922
2044-04-01    10243.462891
2044-05-01    10236.799805
2044-06-01    10240.304688
2044-07-01    10259.951172
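
One further point may matter for the original question: the requested range starts at 2040-01-01, while the rows printed in the question start at 2043-10-01. If the full frame really has no earlier rows (the question does not say), loc will still raise KeyError for the missing dates. The sketch below uses a small stand-in frame and an illustrative date range (both are assumptions, not the asker's data) to show the column-lookup failure and three ways to cope with dates that may be absent:

```python
import pandas as pd

# Stand-in frame: three values of column "a" copied from the question.
df = pd.DataFrame(
    {"a": [10231.440430, 10237.501953, 10233.545898]},
    index=pd.to_datetime(["2043-12-01", "2044-01-01", "2044-02-01"]),
)
df.index.name = "date"

# Illustrative range that deliberately starts before the first row.
wanted = pd.date_range(start="2043-10-01", end="2044-02-01", freq="MS")

# df[key] is a column lookup, so a Timestamp key raises KeyError.
try:
    df[wanted[0]]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# Option 1: keep only the requested dates that exist in the index.
present = df.index.intersection(wanted)
subset = df.loc[present, "a"]

# Option 2: keep every requested date; missing ones become NaN.
aligned = df["a"].reindex(wanted)

# Option 3: label-based slice between two dates (both endpoints included).
sliced = df.loc["2044-01-01":"2044-02-01", "a"]
```

Which option fits depends on whether missing dates should be dropped (intersection), kept as NaN (reindex), or simply never requested (slice).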