My dataframe looks like this:
title | comments | date |
---|---|---|
post1 | 256 | 2021-07-19 11:48:39 |
post2 | 454 | 2021-07-18 22:14:41 |
post3 | 452 | 2019-05-14 19:38:11 |
post4 | 422 | 2018-06-14 16:38:12 |
post5 | 452 | 2017-03-04 17:18:11 |
I would like to make a line graph with the x axis showing the year and the y axis showing the number of posts made that year (2 in 2021, 1 in 2019, etc.). This is what I tried:
titles_values = df["title"].value_counts().sum()
fig = px.line(data_frame=df, x="time" , y=titles_values)
fig.show()
The error I get is: "Plotly Express cannot process wide-form data with columns of different type."
I am not sure how to go about making it work.
CodePudding user response:
This is a simple case of summarising your data frame: df.groupby(df["date"].dt.year, as_index=False).size() calculates the number of rows per year.
import io
import plotly.express as px
import pandas as pd
df = pd.read_csv(io.StringIO("""title,comments,date
post1,256,2021-07-19 11:48:39
post2,454,2021-07-18 22:14:41
post3,452,2019-05-14 19:38:11
post4,422,2018-06-14 16:38:12
post5,452,2017-03-04 17:18:11"""))
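df["date"] = pd.to_datetime(df["date"])
# Count the rows per year, then plot the counts. The categorical x axis shows
# only the years present; dtick=1 keeps the y axis ticks at whole numbers.
fig = px.line(
    df.groupby(df["date"].dt.year, as_index=False).size(), x="date", y="size"
).update_layout(xaxis={"type": "category"}, yaxis={"dtick": 1, "rangemode": "tozero"})
fig.show()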
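To see what the summarising step produces on its own, here is a minimal pandas-only sketch using the sample data from the question. The names posts_per_year, year, posts and total_titles are my own choices for illustration and do not appear in the original code. It also shows that value_counts().sum() from the question collapses to a single number, so it is not a per-year series that could be plotted.

```python
import io

import pandas as pd

csv_text = """title,comments,date
post1,256,2021-07-19 11:48:39
post2,454,2021-07-18 22:14:41
post3,452,2019-05-14 19:38:11
post4,422,2018-06-14 16:38:12
post5,452,2017-03-04 17:18:11"""

# Parse the date column as datetimes so that .dt.year is available.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Group by the year extracted from the date and count the rows in each group.
# "year" and "posts" are illustrative column names, not from the question.
year = df["date"].dt.year.rename("year")
posts_per_year = df.groupby(year).size().reset_index(name="posts")

# The expression from the question adds up all the counts into one scalar.
total_titles = df["title"].value_counts().sum()

print(posts_per_year)
print(total_titles)
```

The resulting frame has one row per year, so it can be passed to px.line with x="year" and y="posts" in the same way as the grouped frame above.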