Question: in Spark, how do I extract the last few columns of a CSV file and save them? - CodePudding

My code is:
scala> val r1 = sc.textFile("D:/item.csv").map(x => x.split(",")).map(x => List(
         x(0), x(1), x(2), x(3), x(4), x(5)).toArray).mapPartitions { x =>
           val stringWriter = new StringWriter()
           val csvWriter = new CSVWriter(stringWriter)
           csvWriter.writeAll(x.toList)
           Iterator(stringWriter.toString())
         }.saveAsTextFile("D:/result/r1")

CodePudding user response:

Additional details:
1. This is running on Windows.
2. Is there a way to save the result directly as a CSV file?

CodePudding user response:

Bumping my own thread.

CodePudding user response:

You should use Spark SQL: read the CSV with SQLContext (or SparkSession in 2.0+), select the columns you want with SQL, and then write the result out as CSV.
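A minimal sketch of that read, select, write flow, assuming Spark 2.0+ running in local mode and a CSV file without a header row. The object name `SelectCsvColumns`, the helper `pickColumns`, the view name `item`, and the output path `D:/result/r1_csv` are hypothetical names chosen for illustration; the input path is the one from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SelectCsvColumns {
  // Reads a header-less CSV and returns only the requested columns via SQL.
  // Without a header or schema, Spark names the columns _c0, _c1, _c2, ...
  def pickColumns(spark: SparkSession, input: String, columns: Seq[String]): DataFrame = {
    spark.read.csv(input).createOrReplaceTempView("item") // "item" is a hypothetical view name
    spark.sql(s"SELECT ${columns.mkString(", ")} FROM item")
  }

  def main(args: Array[String]): Unit = {
    // local[*] is an assumption for a single Windows machine
    val spark = SparkSession.builder()
      .appName("SelectCsvColumns")
      .master("local[*]")
      .getOrCreate()

    val picked = pickColumns(spark, "D:/item.csv", Seq("_c0", "_c1"))

    // Writes a directory of CSV part files, not a single file.
    // The output path is hypothetical.
    picked.write.csv("D:/result/r1_csv")

    spark.stop()
  }
}
```

Note that `write.csv` produces a directory containing one part file per partition, so the result is CSV-formatted but not a single `.csv` file.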

CodePudding user response:

import org.apache.spark.sql.types._

val sspbidSchema = StructType(Array(
  StructField("req_id", StringType, true),
  StructField("creative_id", StringType, true),
  StructField("group_id", StringType, true),
  StructField("user_ip", StringType, true)))
val df = spark.read.schema(sspbidSchema).csv("DDD").select("req_id").write.csv("DDD")

I am using Spark 2.0.

CodePudding user response:

I tried it, but it only extracts the leading, contiguous columns. Can it extract non-contiguous columns?
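For the RDD approach in the original post, one possible way to pick non-contiguous columns is to select fields by position after splitting each line. The sketch below is plain Scala with no Spark dependency; `PickColumns`, `pickColumns`, and the sample row are hypothetical and only for illustration. It splits naively on commas, so quoted fields that contain commas are not handled.

```scala
object PickColumns {
  // Keeps only the fields at the given zero-based positions, in the given order.
  // Naive split on ",": quoted fields that contain commas are not handled.
  def pickColumns(line: String, indices: Seq[Int]): String = {
    val fields = line.split(",", -1) // -1 keeps trailing empty fields
    indices.map(i => fields(i)).mkString(",")
  }

  def main(args: Array[String]): Unit = {
    val row = "r1,c9,g3,10.0.0.1,x,y" // made-up sample row
    println(pickColumns(row, Seq(0, 3, 5))) // prints "r1,10.0.0.1,y"
  }
}
```

Under the same assumptions, the function could be passed to `map` on the RDD returned by `sc.textFile("D:/item.csv")` before calling `saveAsTextFile`, so that each output line holds only the chosen columns.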