Pandas to_records() timestamp to date

I have the following DataFrame:

              total
scanned_date       
2021-11-01        0
2021-11-02        0
2021-11-03        0
2021-11-04        0
2021-11-05        0

Here scanned_date is an index of Timestamp objects. I want to convert the data to a list of tuples like

[
  (2021-11-01, 0),
  (2021-11-02, 0),
  (2021-11-03, 0),
  ...
]

But when using

list(df.to_records())

the output includes the time component, while I only want the date string:

[('2021-11-01T00:00:00.000000000', 0), ('2021-11-02T00:00:00.000000000', 0), ('2021-11-03T00:00:00.000000000', 0)]

How can I remove the time component T00:00:00.000000000 from the to_records() output?

CodePudding user response：

Try converting the index to strings with strftime:

df.index = df.index.strftime('%Y-%m-%d')
list(df.to_records())
Out[212]: 
[('2021-11-01', 0),
 ('2021-11-02', 0),
 ('2021-11-03', 0),
 ('2021-11-04', 0),
 ('2021-11-05', 0)]
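
If you would rather not overwrite the index, a minimal sketch is to format the dates on the fly and zip them with the column. The frame below is rebuilt by hand from the question's printed output, so its construction is an assumption. This variant also yields plain Python tuples rather than NumPy records.

```python
import pandas as pd

# Rebuild the frame from the question (values assumed from the printed output).
df = pd.DataFrame(
    {"total": [0, 0, 0, 0, 0]},
    index=pd.date_range("2021-11-01", periods=5, name="scanned_date"),
)

# Format the DatetimeIndex as strings without modifying df, then pair each
# date string with its total. tolist() converts NumPy ints to Python ints.
records = list(zip(df.index.strftime("%Y-%m-%d"), df["total"].tolist()))
print(records)
```

The original df keeps its DatetimeIndex, so date-based slicing and resampling still work afterwards.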

CodePudding user response：

I tried to do the date conversion in NumPy but chose to switch to pandas. In NumPy you're working with a 64-bit integer. I used map with a lambda to convert each DataFrame record into a (date, value) tuple:

import io
import pandas as pd

txt="""scanned_date,total
2021-11-01,0
2021-11-02,0
2021-11-03,0
2021-11-04,0
2021-11-05,0
"""

#https://www.py4u.net/discuss/17020

df = pd.read_csv(io.StringIO(txt),sep=',',parse_dates=['scanned_date'])
# each record is (index, scanned_date, total); keep the date and the total
print(list(map(lambda tuple_obj: 
               (
                   pd.to_datetime(tuple_obj[1])
                  #str(tuple_obj[1].astype("datetime64[M]").astype(int) % 12 + 1)
                  # + "-" + str(tuple_obj[1].astype(object).day)
                  # + "-" + str(tuple_obj[1].astype("datetime64[Y]"))
                 ,
                tuple_obj[2]),
               df.to_records())))

output:

[(Timestamp('2021-11-01 00:00:00'), 0), (Timestamp('2021-11-02 00:00:00'), 0), (Timestamp('2021-11-03 00:00:00'), 0), (Timestamp('2021-11-04 00:00:00'), 0), (Timestamp('2021-11-05 00:00:00'), 0)]
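
Note that the output above still holds full Timestamp objects rather than date strings. If you want to stay in NumPy, one possible sketch is to cast the datetime field of the record array to day precision, which drops the time part. The frame here is rebuilt by hand from the question's data, and the names days and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild a small frame like the one in the question (assumed values).
df = pd.DataFrame(
    {"total": [0, 0, 0]},
    index=pd.date_range("2021-11-01", periods=3, name="scanned_date"),
)

rec = df.to_records()  # fields: scanned_date (a datetime64 type), total

# datetime64 values are 64-bit integers counting units since the epoch;
# casting to day precision discards the time-of-day part.
days = rec["scanned_date"].astype("datetime64[D]")

# str() on a datetime64[D] scalar gives 'YYYY-MM-DD'.
result = [(str(d), int(t)) for d, t in zip(days, rec["total"])]
print(result)
```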