I am new to Weka. I have a dataset. When I load it in the Preprocess phase, I get the following picture:
My dataset has a class attribute that says whether a cancer is malignant or benign. The blue portion is for malignant and the red portion is for benign.
I would like to know what this histogram means. Some bars are blue, some are red, and some are a mixture of red and blue.
Numbers such as 189, 104, and 128 are also shown with the histogram.
Can anyone please explain the graph to me?
Thank you.
CodePudding user response:
- The clump attribute in the breast cancer dataset is numeric (the class, as you stated, is binary).
- The values for this attribute have been divided into eight bins.
- The number above each bin represents the number of rows in your dataset that fall within that particular bin.
- The color proportions show how many of the rows in each bin belong to each class. As you can see, the smaller the clump value, the more benign rows fall into a bin, and the larger the value, the more malignant rows.
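To make the points above concrete, here is a rough sketch in plain Python of how such a class-colored histogram could be computed. It is only an illustration: the data is made up (it is not the real breast cancer dataset), the function name `class_histogram` is hypothetical, and the equal-width binning is an assumption, so Weka may pick its bins differently.

```python
from collections import Counter

def class_histogram(values, labels, n_bins):
    """Split values into n_bins equal-width bins and count class labels per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    bins = [Counter() for _ in range(n_bins)]
    for v, label in zip(values, labels):
        # The maximum value would fall just past the last bin, so clamp it.
        idx = min(int((v - lo) / width), n_bins - 1) if width > 0 else 0
        bins[idx][label] += 1
    return bins

# Made-up toy data standing in for the clump attribute and the class.
clump = [1, 1, 2, 3, 4, 5, 5, 6, 8, 9, 10, 10]
label = ["benign", "benign", "benign", "benign", "benign", "benign",
         "malignant", "benign", "malignant", "malignant", "malignant", "malignant"]

hist = class_histogram(clump, label, n_bins=3)
for i, counts in enumerate(hist):
    total = sum(counts.values())  # the number drawn above the bar
    print(f"bin {i}: total={total}, per class={dict(counts)}")
```

The total per bin corresponds to the number above each bar, and the per-class counts correspond to how much of the bar is drawn in each color.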