I have a question about selecting a data set. I'm working on a binary classification problem where about three thousand samples are labeled 1, but hundreds of thousands are labeled 0, so I'd like to ask how I should choose the training data.
The algorithms I've chosen are GBDT and XGBoost.
Do three thousand samples labeled 1 count as too little data?
If that is too few, how should I handle the label-1 data, and how many label-0 samples would be appropriate to take?
If it is not too few, how many label-0 samples should I take? Or should I use all of them?
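
To make the two options concrete, here is a minimal sketch of what I mean, not something I have settled on. It uses synthetic labels: 3,000 ones and an assumed 300,000 zeros standing in for "hundreds of thousands". The helper names `downsample_negatives` and `negative_to_positive_ratio` and the 10:1 ratio are hypothetical, chosen only for illustration. The first option keeps every label-1 row and a random subset of label-0 rows; the second keeps all rows and computes the negative-to-positive ratio, which is the kind of value that could be passed to XGBoost's `scale_pos_weight` parameter.

```python
import numpy as np


def downsample_negatives(y, neg_per_pos, seed=0):
    """Keep every label-1 row and at most `neg_per_pos` randomly chosen
    label-0 rows per label-1 row. Returns sorted row indices."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    # Never ask for more negatives than actually exist.
    n_keep = min(len(neg_idx), neg_per_pos * len(pos_idx))
    kept_neg = rng.choice(neg_idx, size=n_keep, replace=False)
    return np.sort(np.concatenate([pos_idx, kept_neg]))


def negative_to_positive_ratio(y):
    """Count of label-0 rows divided by count of label-1 rows."""
    y = np.asarray(y)
    return float(np.sum(y == 0)) / float(np.sum(y == 1))


# Synthetic labels (assumed sizes, not my real data).
y = np.array([1] * 3000 + [0] * 300000)

# Option 1: subsample label 0 down to an illustrative 10:1 ratio.
idx = downsample_negatives(y, neg_per_pos=10)

# Option 2: keep all rows and reweight the positive class instead.
ratio = negative_to_positive_ratio(y)
```

With the first option, `idx` would be used to select rows from the feature matrix before training. With the second, nothing is thrown away and only the class weight changes. Which of the two is more appropriate here is exactly what I am asking.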