I have a question about interpreting data in Weka. The data set I am working with is as follows:
outlook   temperature  humidity  windy  play
----------------------------------------------
sunny     hot          high      FALSE  no
sunny     hot          high      TRUE   no
overcast  hot          high      FALSE  yes
rainy     mild         high      FALSE  yes
rainy     cool         normal    FALSE  yes
rainy     cool         normal    TRUE   no
overcast  cool         normal    TRUE   yes
sunny     mild         high      FALSE  no
sunny     cool         normal    FALSE  yes
rainy     mild         normal    FALSE  yes
sunny     mild         normal    TRUE   yes
overcast  mild         high      TRUE   yes
overcast  hot          normal    FALSE  yes
rainy     mild         high      TRUE   no
The histograms that Weka generated from the above data set are not clear to me.
I know that blue means one can play and red means one cannot play. To draw a histogram, we need to find the frequency of each value.
In the picture, the sunny count is 5 because the outlook attribute has 5 sunny values in the above data set. Likewise, the overcast count is 4 because the outlook attribute has 4 overcast values. Whenever the outlook is overcast, one can play; there is no overcast instance where one cannot play, so the overcast bar is pure blue. However, when the outlook is sunny, there are 2 instances where one can play and 3 instances where one cannot. Therefore, the sunny bar is a mixture of blue and red.
Now, just by looking at the bar, how can I tell how many instances are yes and how many are no when the outlook is sunny?
Also, how could I draw such a histogram in Excel?
Thank you.
CodePudding user response:
The bar-plot (nominal class) or histogram (numeric class) in Weka's Explorer is only there to give you an idea about the data. At the time of writing, there is no way to tell the various counts for the associated class labels from the graph apart from the total, which is displayed on top.
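If you need the actual per-class counts, one option is to tally them from the data yourself rather than reading them off the graph. The following is only a minimal sketch in plain Python, not anything Weka provides: the ROWS list is typed in by hand from the question's table (outlook and play columns only), and class_counts is a hypothetical helper name.

```python
from collections import Counter

# (outlook, play) pairs copied by hand from the 14 rows in the question.
ROWS = [
    ("sunny", "no"), ("sunny", "no"), ("overcast", "yes"), ("rainy", "yes"),
    ("rainy", "yes"), ("rainy", "no"), ("overcast", "yes"), ("sunny", "no"),
    ("sunny", "yes"), ("rainy", "yes"), ("sunny", "yes"), ("overcast", "yes"),
    ("overcast", "yes"), ("rainy", "no"),
]


def class_counts(rows):
    """Hypothetical helper: map each attribute value to a Counter of class labels."""
    counts = {}
    for value, label in rows:
        counts.setdefault(value, Counter())[label] += 1
    return counts


counts = class_counts(ROWS)
for value, per_class in counts.items():
    # A Counter returns 0 for labels it has never seen, e.g. overcast/no.
    print(value, "yes =", per_class["yes"], "no =", per_class["no"])
```

For this data set it prints sunny yes = 2 no = 3, overcast yes = 4 no = 0, and rainy yes = 3 no = 2. Each pair sums to the total that Weka shows on top of the corresponding bar.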
I don't use Excel, so I can't comment on that.
For reference: the class in Weka's source code responsible for drawing these is weka.gui.AttributeVisualizationPanel. The inner classes BarCalc and HistCalc perform the respective calculations.