I have a strange DataFrame that doesn't behave the way I expect. I should have a column heading that I can use.
The code below produces it; the result is supposed to be used for a histogram.
categories = pd.Series(df['category'])
category_freq = pd.Series(df[df['engine'] == 'u']['category'])
hist = pd.crosstab(category_freq, categories)
counts = pd.DataFrame(np.diag(hist), index=[hist.index])
But the output has a '0' at the very top, and I cannot get it to behave as I want. For example, the output looks like the following:
0
category
baby 65
beauty 73
christmas 168
If I access it via counts[0], I can remove this "top layer", but I can never find a way to access rows via, say, counts[0]['category']. I get a "key not found" error. How can I get the data into a format that works as a DataFrame?
CodePudding user response:
Make a Series out of it instead:
counts = pd.Series(np.diag(hist), index=[hist.index])
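Some background on why this helps: a DataFrame built from a 1-D array without a columns argument labels its only column 0, and 'category' in the printout is the name of the index, not a column, which is why counts[0]['category'] raises a KeyError. Here is a minimal sketch of the difference. The names idx, diag, as_frame and counts_df are made up for illustration, and idx and diag only stand in for hist.index and np.diag(hist), using the three values printed in the question.

```python
import numpy as np
import pandas as pd

# Stand-ins for hist.index and np.diag(hist); the values come from the
# output shown in the question, the variable names are illustrative only.
idx = pd.Index(["baby", "beauty", "christmas"], name="category")
diag = np.array([65, 73, 168])

# DataFrame from a 1-D array with no `columns=`: the single column is labelled 0.
as_frame = pd.DataFrame(diag, index=idx)

# Series: no column label at all; 'category' is just the name of the index.
counts = pd.Series(diag, index=idx, name="count")

# Rows are looked up by their index label, not by the index name.
print(counts["baby"])
print(counts.loc["beauty"])

# If 'category' is needed as a real column, move the index into the frame.
counts_df = counts.reset_index()
print(counts_df["category"].tolist())
```

Passing hist.index directly, rather than wrapped in a list as [hist.index], also keeps it a plain Index instead of a one-level MultiIndex. If only the per-category counts for engine == 'u' are needed, df.loc[df['engine'] == 'u', 'category'].value_counts() should give the same numbers without the crosstab, though in a different order.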