Home > Back-end >  plotly express heatmap using pandas dataframe
plotly express heatmap using pandas dataframe

Time:03-21

I am doing some volunteer work for a South American NGO that maps criminality in the continent. It is my first time using pandas and I would be most grateful if you could help me with the following scenario.

I have the following DataFrame:

df = pd.DataFrame({'country': ['Brazil', 'Argentina', 'Uruguay', 'Brazil', 'Argentina', 'Uruguay', 'Brazil', 'Argentina', 'Uruguay'],
                   'year': [2015, 2015, 2015, 2016, 2016, 2016, 2017, 2017, 2016],
                   'occurrence': ['murder', 'murder', 'rape', 'rape', 'arson', 'arson', 'murder', 'murder', 'arson']})

I would like to sum the crime occurrences per country per year and display the results using a plotly express heatmap in which the years would be mapped to the x axis, the countries to the y axis and the colours would show the number of occurrences.

Do you think you could help me transform the data and feed it into the appropriate plotly express heatmap function?

Thank you!

CodePudding user response:

You should first group your data by country and then by year and count number of crimes:

new_df = df.groupby(["country","year"])["occurrence"].count().reset_index()

enter image description here

After that, you should change the structure of the data by pivoting the data according to your needs:

new_df = new_df.pivot(index='country', columns='year')['occurrence'].fillna(0)

enter image description here

Now, you can plot your heatmap:

import plotly.express as px
fig = px.imshow(new_df, x=new_df.columns, y=new_df.index)
fig.update_layout(width=500,height=500)
fig.show()

enter image description here

  • Related