Input file one.txt:
[{"a":1,"b":2,"c":3}, {"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]
Expected output:
a      b      c
1,11,1 2,12,2 3,13,3
Could you please provide the solution using a Spark DataFrame in Scala?
val spark = SparkSession.builder().appName("JSON_Sample").master("local[1]").getOrCreate()
val data = """[{"a":1,"b":2,"c":3}, {"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]""" //one.txt
val df = spark.read.text("./src/main/scala/resources/text/one.txt").toDF()
CodePudding user response:
This is the Python version of working code with Spark. If you are able to convert it to Scala, that is fine; otherwise let me know and I will do it.
from pyspark.sql.functions import col, collect_list, concat_ws

df = spark.read.json(sc.parallelize([{"a":1,"b":2,"c":3},{"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]))
df.show()
+---+---+---+
|  a|  b|  c|
+---+---+---+
|  1|  2|  3|
| 11| 12| 13|
|  1|  2|  3|
+---+---+---+
df.agg(*[concat_ws(",",collect_list(col(i))).alias(i) for i in df.columns]).show()
+------+------+------+
|     a|     b|     c|
+------+------+------+
|1,11,1|2,12,2|3,13,3|
+------+------+------+
For Scala Spark:
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, collect_list, concat_ws}

val spark = SparkSession.builder().appName("JSON_Sample").master("local[1]").getOrCreate()
import spark.implicits._

val jsonStr = """[{"a":1,"b":2,"c":3}, {"a":11,"b":12,"c":13},{"a":1,"b":2,"c":3}]"""
val df = spark.read.json(spark.createDataset(jsonStr :: Nil))
val exprs = df.columns.map(c => concat_ws(",", collect_list(col(c))).alias(c))
df.agg(exprs.head, exprs.tail: _*).show()
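If the array should be read straight from one.txt (as in the question) rather than from an inline string, here is a minimal sketch, assuming the file path from the question and that the whole JSON array sits in that one file; the multiLine option tells Spark to parse the file as a single JSON document, and the fileDf/fileExprs names are just illustrative:

// Sketch: read the top-level JSON array directly from one.txt,
// then concatenate each column's values into a comma-separated string.
val fileDf = spark.read
  .option("multiLine", true)
  .json("./src/main/scala/resources/text/one.txt")
val fileExprs = fileDf.columns.map(c => concat_ws(",", collect_list(col(c))).alias(c))
fileDf.agg(fileExprs.head, fileExprs.tail: _*).show()

This should print the same aggregated table as above (1,11,1 / 2,12,2 / 3,13,3).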