I am working on a project in which I want to write Spark DataFrame data to a CSV file in a MinIO bucket. I have searched everywhere but could not find a proper solution. Please help me achieve this.
I have tried many solutions, but none of them worked.
CodePudding user response:
You can use the code below. It collects the DataFrame on the driver, builds the CSV content in memory, and uploads it to the bucket with the MinIO Java client:
import io.minio.MinioClient;
import io.minio.PutObjectArgs;
import org.apache.spark.sql.Row;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.List;

String bucketName = "my_bucket";
String filePath = "files/my_file.csv";
String delimiter = ",";

// Collect the DataFrame rows to the driver (only suitable for data that fits in driver memory).
List<Row> rowList = dataFrame.collectAsList();

// Build the CSV content in memory: header line first, then one line per row.
ByteArrayOutputStream baos = new ByteArrayOutputStream();
String[] fieldNames = rowList.get(0).schema().fieldNames();
String headers = String.join(delimiter, fieldNames) + "\n";
baos.write(headers.getBytes());

rowList.forEach(row -> {
    String[] arr = new String[row.length()];
    for (int i = 0; i < row.length(); i++) {
        Object value = row.getAs(i);
        arr[i] = value == null ? "" : value.toString(); // null-safe conversion of each field to text
    }
    String line = String.join(delimiter, arr) + "\n";
    try {
        baos.write(line.getBytes());
    } catch (IOException e) {
        e.printStackTrace();
    }
});

// Upload the CSV bytes to the MinIO bucket as a single object.
ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
MinioClient minioClient = // build your MinIO client here
minioClient.putObject(
        PutObjectArgs.builder()
                .bucket(bucketName)
                .object(filePath)
                .stream(bais, bais.available(), -1)
                .build());
bais.close();
baos.close();
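
For the client construction that is elided above, here is a minimal sketch using the MinIO Java SDK builder, assuming a local endpoint and placeholder credentials (replace the URL and keys with your own):

MinioClient minioClient = MinioClient.builder()
        .endpoint("http://localhost:9000")        // assumed local MinIO endpoint
        .credentials("ACCESS_KEY", "SECRET_KEY")  // placeholder credentials
        .build();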
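
Keep in mind that collectAsList() pulls the whole DataFrame into driver memory. For larger data sets, an alternative is to let Spark write directly to MinIO through the Hadoop S3A connector (this requires the hadoop-aws dependency on the classpath); a sketch, again assuming a local endpoint and placeholder keys:

import org.apache.spark.sql.SparkSession;

SparkSession spark = dataFrame.sparkSession();
spark.sparkContext().hadoopConfiguration().set("fs.s3a.endpoint", "http://localhost:9000");
spark.sparkContext().hadoopConfiguration().set("fs.s3a.access.key", "ACCESS_KEY");
spark.sparkContext().hadoopConfiguration().set("fs.s3a.secret.key", "SECRET_KEY");
spark.sparkContext().hadoopConfiguration().set("fs.s3a.path.style.access", "true");

// Spark writes a directory of part-*.csv files under this prefix, not a single my_file.csv.
dataFrame.write().option("header", "true").mode("overwrite").csv("s3a://my_bucket/files/");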