I am trying to read data from an online database and convert the first row to a timestamp. Although using the right format string in to_datetime, but I got the error
time data '\r' does not match format '%Y%m%d:%H%M' (match)
Here is the code:
import requests
import pandas as pd
import csv
from datetime import datetime
lat= 39 #Latitude
lon=-4 #Longitude
mountingplace = "building"
angle=15
aspect = 0
url = "https://re.jrc.ec.europa.eu/api/seriescalc?lat={}&lon={}&outputformat=csv&browser=1&startyear=2015&endyear=2016&peakpower=1&mountingplace={}&angle={}&pvcalculation=1&aspect={}&loss=5".format(lat,lon,mountingplace,angle, aspect)
pv=pd.read_csv(url, delimiter=',', lineterminator='\n', skiprows=10)#, date_parser=custom_date_parser)
pv['time']=pd.to_datetime(pv.time, format="%Y%m%d:%H%M")
CodePudding user response:
pv=pd.read_csv(url, delimiter=',', lineterminator='\n', skiprows=10)#, date_parser=custom_date_parser)
Here you are assuming file ending is newline (\n
), I inspect file and it seem to be using carriagereturn-newline (\r\n
) endings. Just let read_csv
to act automatically there, i.e. do
pv=pd.read_csv(url, delimiter=',', skiprows=10)#, date_parser=custom_date_parser)
Also .time
have some data not compliant with %Y%m%d:%H%M
format namely
17546 H_sun: Sun height (degree)
17547 T2m: 2-m air temperature (degree Celsius)
17548 WS10m: 10-m total wind speed (m/s)
17549 Int: 1 means solar radiation values are recons...
17550 PVGIS (c) European Union
So you need to inform pandas.to_datetime
what do with it unless fatal malfunction is desired behavior, you might do e.g.
pv['time_parsed']=pd.to_datetime(pv.time, format="%Y%m%d:%H%M", errors="coerce")
then non-compliant values will be turned into NaT
s.