Date extraction without "T00:00:00" and formatted as %d/%m/%Y

Time:09-29

I have been working on a huge text file that I want to read and trim with pandas.

Here is a sample of the raw file:

Date;Time;GHI;DNI;DIF;flagR;SE;SA;TEMP;AP;RH;WS;WD;PWAT
01.01.1994;00:07;0;0;0;0;-41.92;-19.43;14.3;1004.4;93.4;0.3;189;17.7
01.01.1994;00:22;0;0;0;0;-40.65;-23.70;14.3;1004.4;93.6;0.1;186;17.8
01.01.1994;00:37;0;0;0;0;-39.14;-27.75;14.3;1004.3;93.7;0.0;10;18.0

The raw dates are in the format %d.%m.%Y, and I want them as %d/%m/%Y. I need to sort the data, so I checked the result in the VS Code Data Viewer, and there the dates appear as %Y-%m-%d followed by a time. This time part is always T00:00:00, and I do not need it because I already have a Time column. Why does this text appear in the VS Code Data Viewer? Is this time always generated? Is it ignored by Python? Why is the date format I wrote not applied?

import pandas as pd
import numpy as np
import datetime

# Read the file, splitting columns on semicolons
# and skipping the first 56 rows.
file = pd.read_csv('file.txt',
                   sep=';',
                   skiprows=56)

# Replace the "." in the "Date" column with "/" (regex=False,
# so "." is a literal dot and not a regex wildcard). Then parse
# each value as day/month/year and keep only the date part.

file["Date"] = file["Date"].str.replace('.', '/', regex=False).apply(lambda x: datetime.datetime.strptime(x, "%d/%m/%Y").date())

I used the code snippet above, but the Date column does not come out in the %d/%m/%Y format, even with .date().

This is the content of the file when I print it:

             Date   Time  GHI  DNI  DIF  flagR     SE     SA  TEMP      AP    RH   WS   WD  PWAT
    0  1994-01-01  00:07    0    0    0      0 -41.92 -19.43  14.3  1004.4  93.4  0.3  189  17.7
    1  1994-01-01  00:22    0    0    0      0 -40.65 -23.70  14.3  1004.4  93.6  0.1  186  17.8
    2  1994-01-01  00:37    0    0    0      0 -39.14 -27.75  14.3  1004.3  93.7  0.0   10  18.0

This is the content of the file when I look at it in the VS Code Data Viewer:

                      Date   Time  GHI  DNI  DIF  flagR     SE     SA  TEMP      AP    RH   WS   WD  PWAT
    0  1994-01-01T00:00:00  00:07    0    0    0      0 -41.92 -19.43  14.3  1004.4  93.4  0.3  189  17.7
    1  1994-01-01T00:00:00  00:22    0    0    0      0 -40.65 -23.70  14.3  1004.4  93.6  0.1  186  17.8
    2  1994-01-01T00:00:00  00:37    0    0    0      0 -39.14 -27.75  14.3  1004.3  93.7  0.0   10  18.0

Thank you

CodePudding user response:

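That is only how the VS Code Data Viewer displays dates; it does not mean the stored values contain a time. The .date() call returns a date object with no time part, so T00:00:00 is not in your data. The format you pass to strptime is used only to parse the text. The resulting date object has no format of its own and always prints as %Y-%m-%d, which is why %d/%m/%Y does not show up.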
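
As a minimal sketch of the difference between parsing and formatting, the snippet below uses one hardcoded sample value instead of the real file. The variable names are illustrative and not part of the question.

```python
import datetime

raw = "01.01.1994"  # sample value taken from the Date column

# The format string tells strptime how to READ the text; it is not stored.
parsed = datetime.datetime.strptime(raw, "%d.%m.%Y").date()

# A date object has no time part and no display format of its own;
# str() always renders it as ISO 8601 (%Y-%m-%d).
iso_text = str(parsed)

# To get day/month/year text, format explicitly; the result is a plain str.
dmy_text = parsed.strftime("%d/%m/%Y")

print(type(parsed).__name__, iso_text, dmy_text)
```

The formatted value is text, not a date, so it is best produced as the last step before display or export.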

So, you can get %d/%m/%Y by converting the Date column to formatted strings. Replace your conversion line with this:

# %m is the month (%M would be parsed as minutes)
file["Date"] = pd.to_datetime(file["Date"], format="%d.%m.%Y").dt.strftime("%d/%m/%Y")

# write dataframe to CSV file
file.to_csv("out.csv", index=False)

And this is the content of the CSV file:

Date,Time,GHI,DNI,DIF,flagR,SE,SA,TEMP,AP,RH,WS,WD,PWAT
01/01/1994,00:07,0,0,0,0,-41.92,-19.43,14.3,1004.4,93.4,0.3,189,17.7
01/01/1994,00:22,0,0,0,0,-40.65,-23.7,14.3,1004.4,93.6,0.1,186,17.8
01/01/1994,00:37,0,0,0,0,-39.14,-27.75,14.3,1004.3,93.7,0.0,10,18.0
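
If the data also needs to be sorted, one possible alternative is to keep Date as a real datetime column and apply %d/%m/%Y only when writing the file, using the date_format parameter of to_csv. This is a sketch under assumptions: the two-row sample, the in-memory buffer, and the names df and csv_text are made up for illustration.

```python
import io

import pandas as pd

# Tiny stand-in for the real file, deliberately out of order.
sample = io.StringIO(
    "Date;Time;GHI\n"
    "01.02.1994;00:07;0\n"
    "01.01.1994;00:22;0\n"
)
df = pd.read_csv(sample, sep=";")

# Parse into a datetime64 column so sorting is chronological.
df["Date"] = pd.to_datetime(df["Date"], format="%d.%m.%Y")
df = df.sort_values(["Date", "Time"])

# Apply the day/month/year format only in the written output.
csv_text = df.to_csv(index=False, date_format="%d/%m/%Y")
print(csv_text)
```

Strings in %d/%m/%Y form sort by day first, so sorting after strftime would not be chronological. Keeping the datetime dtype until export avoids that.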