I have an imputation method that applies the mean, median, or mode strategy, but it fails if the column's data type is not Double or Float.
My Java code:
Imputer imputer = new Imputer().setInputCol("amount").setOutputCol("amount");
imputer.setStrategy("mean");
ImputerModel model = imputer.fit(dataset);
model.transform(dataset);
Is there any way to handle this?
I am using Java.
CodePudding user response:
I can suggest one approach, though I am not sure it is the best one.
Step 1: Get the field details from the dataset's schema; this returns a StructField[].
Step 2: Iterate over that array and check the data type of each column you want to impute.
// Returns true only if every column named in columnArray is DoubleType or FloatType.
private boolean isValidColumnTypes(String[] columnArray, Dataset<?> dataset) {
    StructField[] fieldArray = dataset.schema().fields();
    for (int i = 0; i < columnArray.length; i++) {
        for (StructField data : fieldArray) {
            boolean doubleType = data.dataType().toString().equals("DoubleType");
            boolean floatType = data.dataType().toString().equals("FloatType");
            // Reject as soon as a requested column has an unsupported type.
            if (columnArray[i].equals(data.name()) && !(doubleType || floatType)) {
                return false;
            }
        }
    }
    return true;
}
The method above takes the column names as a String array (String[] columnArray) and returns false if any of those columns is not of type Double or Float.
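If rejecting the columns is not enough, one possible follow-up is to cast them to double before fitting the Imputer. The sketch below is only an illustration under some assumptions: `ImputerCastSketch` and `buildCastExpressions` are hypothetical names, not part of Spark, and the type names ("DoubleType", "IntegerType", and so on) are assumed to be the strings reported by `dataset.dtypes()`. It is plain Java, so it only builds the select expressions and does not touch Spark itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not a Spark class): builds SQL select expressions
// that cast the columns to impute to DOUBLE when they are not already
// Double or Float, and leaves every other column untouched.
final class ImputerCastSketch {

    // Assumption: these are the type names of columns that need no cast.
    private static final Set<String> ACCEPTED =
            new HashSet<>(Arrays.asList("DoubleType", "FloatType"));

    // columnTypes: column name -> type name, in schema order
    //              (pass a LinkedHashMap to preserve that order).
    // imputeColumns: names of the columns that will be given to the Imputer.
    static String[] buildCastExpressions(Map<String, String> columnTypes,
                                         Set<String> imputeColumns) {
        List<String> exprs = new ArrayList<>();
        for (Map.Entry<String, String> entry : columnTypes.entrySet()) {
            // Backticks keep column names with unusual characters valid in SQL.
            String quoted = "`" + entry.getKey() + "`";
            boolean needsCast = imputeColumns.contains(entry.getKey())
                    && !ACCEPTED.contains(entry.getValue());
            exprs.add(needsCast
                    ? "CAST(" + quoted + " AS DOUBLE) AS " + quoted
                    : quoted);
        }
        return exprs.toArray(new String[0]);
    }
}
```

With Spark on the classpath, the map could be filled from `dataset.dtypes()` and the resulting array passed to `dataset.selectExpr(exprs)` before calling `imputer.fit(...)`. This only makes sense for columns whose values are actually numeric; a column of arbitrary strings will not cast to meaningful doubles.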