Help please. Can you tell me what's wrong with the format string I defined?
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
df = pd.read_csv('timeseries.csv')
df.head()
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%Y-%m-%d %H:%M:%S.%f%Z')
X = df['timestamp']
y = df['time_minutes']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
reg = GradientBoostingRegressor(random_state=0)
reg.fit(X_train, y_train)
GradientBoostingRegressor(random_state=0)
reg.predict(X_test[1:2])
reg.score(X_test, y_test)
ValueError: time data '2021-07-19 11:48:03.357 00' does not match format '%Y-%m-%d %H:%M:%S.%f%Z' (match)
CodePudding user response:
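The format fails because %Z expects a time zone name such as UTC, while your values end in ' 00'. pandas can parse the column when it loads the CSV file, so you don't need to convert it afterwards. Pass the column name to parse_dates, since parse_dates=True only tries to parse the index:
df = pd.read_csv('timeseries.csv', parse_dates=['timestamp'])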
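Note that even with the dates parsed, a datetime column can't go straight into reg.fit, because the estimator expects a numeric 2D input. Here is a minimal sketch of one way to handle that. The sample rows, the '+00:00' offset, and the elapsed_s column name are my assumptions for illustration, not your real data:

```python
import io
import pandas as pd

# Hypothetical sample standing in for timeseries.csv (layout and offsets are assumed).
csv_text = """timestamp,time_minutes
2021-07-19 11:48:03.357+00:00,12.5
2021-07-19 11:49:03.357+00:00,13.0
2021-07-19 11:51:03.357+00:00,14.5
"""

df = pd.read_csv(io.StringIO(csv_text))

# No explicit format string: pandas infers it, and utc=True normalizes offsets to UTC.
df['timestamp'] = pd.to_datetime(df['timestamp'], utc=True)

# Turn each timestamp into a number: seconds elapsed since the earliest row.
df['elapsed_s'] = (df['timestamp'] - df['timestamp'].min()).dt.total_seconds()

# Double brackets keep X two-dimensional (n_samples, 1), which estimators expect.
X = df[['elapsed_s']]
y = df['time_minutes']
print(X.shape, y.shape)
```

X and y built this way can then go through train_test_split and reg.fit as in your code. Whether elapsed seconds is a useful feature depends on your data; hour of day or day of week may work better.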