My code:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
df = pd.read_csv('orderlist.csv', skiprows=1, delimiter=';', encoding="utf8")
df.columns = ["date", "customer_number", "item_code", "quantity"]
df['customer_item'] = df.customer_number.astype(str) + ', ' + df.item_code.astype(str)
df['date'] = pd.to_datetime(df['date'])
df["quantity"] = df["quantity"].astype(int, errors='ignore')
df["week"] = df.date.dt.isocalendar().week  # .dt.week was removed in pandas 2.0
df_grup = df.groupby(by=['week',"customer_item"]).quantity.sum().reset_index()
df_dum = pd.get_dummies(df_grup)
X, y = df_dum, df_dum["quantity"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
dtree = DecisionTreeClassifier().fit(X_train, y_train)
predict = dtree.fit(X_train, y_train)
y_pred = dtree.predict(X_test)
pred_quantity = dtree.predict(df_dum)
print("predict quantity:")
print(pred_quantity)
Result:
predict quantity:
[100 5 450 ... 295 22 639]
I need to print each customer number next to its predicted quantity.
CodePudding user response:
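The nth item of pred_quantity corresponds to the nth row of df_dum, which keeps the row order of df_grup. It does not line up with df, because the groupby merges order lines, so df_grup generally has fewer rows than df.
So you can either add pred_quantity as a column to df_grup, recovering the customer number from customer_item:
df_grup['pred_quantity'] = pred_quantity
df_grup['customer_number'] = df_grup['customer_item'].str.split(', ').str[0]
print(df_grup[['customer_number', 'pred_quantity']])
or use zip to print them side by side:
for customer_item, quantity in zip(df_grup['customer_item'], pred_quantity):
    print(customer_item, quantity)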
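
To see the alignment end to end, here is a minimal sketch on toy data. The four order lines are made up and stand in for orderlist.csv, and `result` is just an illustrative name. `dtype=int` is passed to get_dummies only to keep the feature matrix purely numeric. As in the question, the tree is trained with quantity among the features, so this only demonstrates how rows line up, not a sound model.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up order lines standing in for orderlist.csv (assumption, not real data).
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-12", "2021-01-13"]),
    "customer_number": ["C1", "C1", "C2", "C1"],
    "item_code": ["A", "A", "B", "A"],
    "quantity": [2, 3, 7, 4],
})
df["customer_item"] = df["customer_number"] + ", " + df["item_code"]
df["week"] = df["date"].dt.isocalendar().week.astype(int)

# Grouping merges order lines: 4 rows in df become 3 rows in df_grup.
df_grup = df.groupby(["week", "customer_item"])["quantity"].sum().reset_index()
df_dum = pd.get_dummies(df_grup, dtype=int)

# Same setup as the question: quantity is both a feature and the target.
dtree = DecisionTreeClassifier(random_state=0).fit(df_dum, df_dum["quantity"])
pred_quantity = dtree.predict(df_dum)

# Predictions follow df_grup's row order, so attach them there, not to df.
result = df_grup[["week", "customer_item"]].copy()
result["customer_number"] = result["customer_item"].str.split(", ").str[0]
result["pred_quantity"] = pred_quantity
print(result[["customer_number", "week", "pred_quantity"]])
```

Because customer_number and week both come from the grouped frame, each printed row is one (customer, item, week) combination with its own prediction.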