I need help plotting a heat map that should look like this:
Sorry in advance for not bringing any code of my own. I know this is not how Stack Overflow works, but I have no idea how to begin.
The data for the x axis is an array of years from 1975 to 2018: [1975, ..., 2018].
The data for the y axis is an array of months (January to December).
The value at each x-y intersection, as shown in the image, is 1, 2 or 3.
In the image I added, cross signs represent data gaps and white spaces represent zero (0) values.
Thank you in advance.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df = pd.read_csv('Events_in_Month_and_Year.xlsx', encoding='unicode_escape', error_bad_lines=False)
pivoted = df.pivot_table(index='month', columns='year', aggfunc=len, fill_value=0)
pivoted = pivoted.loc[months] # change the order of the rows to be the same as months
for _ in range(20):
    # set some random locations to "not filled in"
    pivoted.iloc[np.random.randint(0, len(pivoted)), np.random.randint(0, len(pivoted.columns))] = np.nan
max_val = np.nanmax(pivoted.to_numpy())
ax = sns.heatmap(pivoted, cmap=plt.get_cmap('Greys', max_val + 1), vmin=-0.5, vmax=max_val + 0.5)
ax.patch.set_edgecolor('black') # will be used for hatching
spines = ax.collections[0].colorbar.ax.spines
for s in spines:
    spines[s].set_visible(True) # show border around colorbar
I have tried this code, but I get the following error:
Error tokenizing data. C error: Buffer overflow caught - possible malformed input file
My data is stored in an .xlsx file which looks like this:
CodePudding user response:
You can use seaborn's heatmap (sns.heatmap) for this.
PS: If you have your original data e.g. in Excel, you can save it as a csv file and load it with
df = pd.read_csv(filename)
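The tokenizing error in the question is consistent with passing an .xlsx file to pd.read_csv, which expects plain text. As a minimal sketch, the loading step could pick the reader based on the file extension. The helper name load_table is hypothetical, and the sketch assumes the years sit in the first column of the file; pd.read_excel additionally needs an Excel engine such as openpyxl to be installed.

```python
from pathlib import Path

import pandas as pd


def load_table(path):
    """Hypothetical helper: load a table whose first column holds the years."""
    suffix = Path(path).suffix.lower()
    if suffix in {".xlsx", ".xls"}:
        # Excel files are binary; read_csv cannot parse them
        return pd.read_excel(path, index_col=0)
    # Anything else is treated as a delimited text file
    return pd.read_csv(path, index_col=0)
```

Empty cells come back as NaN with either reader, which is what allows a data gap to be told apart from a 0.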
The code for a file similar to the one in the post could look like the following. To distinguish between 0
and a "data gap", missing data can be represented in the Excel file with an empty cell.
Empty rows for missing years can be added by assigning a new index.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
# read the dataframe from a .csv file
pivoted = pd.read_csv('test.csv', index_col=0) # maybe: delimiter=';'
# extend the index to include all intermediate years
pivoted = pd.DataFrame(pivoted, index=range(pivoted.index.min(), pivoted.index.max() + 1))
# exchange columns and rows
pivoted = pivoted.T
max_val = np.nanmax(pivoted.to_numpy())
ax = sns.heatmap(pivoted, cmap=plt.get_cmap('Greys', max_val + 1), vmin=-0.5, vmax=max_val + 0.5,
                 cbar_kws={'ticks': np.arange(max_val + 1)})
ax.patch.set_edgecolor('black') # will be used for hatching
ax.collections[0].colorbar.outline.set_linewidth(1) # make outline visible
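To try the approach without the original file, here is a self-contained sketch that runs on synthetic data. The random counts, the missing years 1980 and 1981 and the gap in 1990 are all made up for illustration. It uses reindex, which should be equivalent to building a new DataFrame with an extended index, and it adds an explicit ax.patch.set_hatch('xx') call. That call is an assumption about how the crosses are meant to be drawn: seaborn leaves NaN cells unpainted, so the hatched axes background shows through them.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
rng = np.random.default_rng(0)

# Synthetic stand-in for the file (assumption): one row per year, one column
# per month, counts 0..3; the years 1980 and 1981 are absent from the table
years = [y for y in range(1975, 2019) if y not in (1980, 1981)]
table = pd.DataFrame(rng.integers(0, 4, size=(len(years), len(months))).astype(float),
                     index=years, columns=months)
table.loc[1990, ['Mar', 'Apr']] = np.nan  # empty cells stand for data gaps

# Add all-NaN rows for the missing years, then put the months on the y axis
table = table.reindex(range(table.index.min(), table.index.max() + 1))
pivoted = table.T

max_val = int(np.nanmax(pivoted.to_numpy()))
fig, ax = plt.subplots(figsize=(12, 4))
sns.heatmap(pivoted, cmap=plt.get_cmap('Greys', max_val + 1),
            vmin=-0.5, vmax=max_val + 0.5,
            cbar_kws={'ticks': np.arange(max_val + 1)}, ax=ax)
ax.patch.set_edgecolor('black')  # hatch lines are drawn in the edge color
ax.patch.set_hatch('xx')         # crosses appear wherever a cell is NaN
ax.collections[0].colorbar.outline.set_linewidth(1)  # make outline visible
```

Calling fig.savefig with a file name writes the result to disk. For an interactive window, drop the matplotlib.use("Agg") line and call plt.show().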