import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast._
import scala.collection.mutable.ArrayBuffer
import java.io._

object SimpleApp {
  var broadcast1: Broadcast[ArrayBuffer[String]] = _
  val ip_grp_start = ArrayBuffer[String]()

  def matchword(s: (String, Int)): List[(String, Int)] = {
    val fw = new FileWriter("/home/hadoop/a.txt", true)
    val out = new PrintWriter(fw)
    out.println("11111111111111111" + broadcast1.value.length)
    out.close()
    List(s)
  }

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val line = sc.textFile("wordcount.txt")
    ip_grp_start += "fsda"
    ip_grp_start += "DSFSF"
    broadcast1 = line.sparkContext.broadcast(ip_grp_start)
    val words = line.flatMap(line => line.split(" "))
    val wordpair = words.map(word => (word, 1))
    val word = wordpair.flatMap(x => matchword(x))
    val pair = word.reduceByKey(_ + _)
    pair.collect().foreach(println)
    sc.stop()
  }
}
CodePudding user response:
17/02/19 15:07:46 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.1.2:56023 (size: 2.8 KB, free: 366.3 MB)
17/02/19 15:07:47 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.2:56023 (size: 20.4 KB, free: 366.3 MB)
17/02/19 15:07:48 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.1.2, executor 0): java.lang.NullPointerException
    at SimpleApp$.matchword(SimpleApp.scala:20)
    at SimpleApp$$anonfun$4.apply(SimpleApp.scala:38)
    at SimpleApp$$anonfun$4.apply(SimpleApp.scala:38)
    at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
The NullPointerException quoted above means that inside the closure function the broadcast variable's value is null.
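Spark ships the closure to the executors, where the SimpleApp object is initialized fresh, so a var field declared with "= _" is still at its default null there until main() assigns it, and main() only runs on the driver. A minimal, Spark-free Scala sketch of that default-null behavior (NullDemo, table, and useTable are illustrative names, not from the thread):

```scala
object NullDemo {
  // Object-level var, declared like broadcast1 in the thread: starts as null.
  var table: List[String] = _

  // Dereferencing the field before it is assigned throws NullPointerException,
  // which is what happens on an executor that never ran main().
  def useTable(): Int = table.length
}

object NullDemoMain {
  def main(args: Array[String]): Unit = {
    val before =
      try { NullDemo.useTable(); "ok" }
      catch { case _: NullPointerException => "NPE" }
    NullDemo.table = List("fsda", "DSFSF") // what main() does on the driver
    println((before, NullDemo.useTable())) // (NPE,2)
  }
}
```

In a real Spark job the driver and executor are separate JVMs, so the executor-side object never sees the driver's assignment at all.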
CodePudding user response:
This is a test example I wrote for broadcast variables; I don't understand why the value arrives as null. If I remove the --master argument at run time, the job runs in local single-process mode and reports no error. Could an expert please advise?
CodePudding user response:
The reason is that your broadcast var is defined at the class/object level. When the object is initialized on the worker node, that field is still null there, not the value you assigned in the main method. Change the broadcast's scope to the main method; you can then pass its value to the method as a parameter.
See the detail example here:
https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/ChapterSixExample.scala#L75
val signPrefixes = sc.broadcast(loadCallSignTable())
val countryContactCounts = contactCounts.map { case (sign, count) =>
  val country = lookupInArray(sign, signPrefixes.value)
  (country, count)
}.reduceByKey((x, y) => x + y)
Note how the signPrefixes broadcast val is defined in the main method, and how its value is passed to the lookupInArray method.
CodePudding user response:
I ran into this problem too and haven't found the root cause. My guess is that the SimpleApp object is shipped to the executor before the line broadcast1 = line.sparkContext.broadcast(ip_grp_start) has executed, so broadcast1 is already null by the time it reaches the executor.
My temporary workaround is: do not define var broadcast1 at the object level; instead write val broadcast1 = line.sparkContext.broadcast(ip_grp_start) inside main, and then pass broadcast1 to the matchword function as a parameter:
def matchword(s: (String, Int), broadcast1: Broadcast[ArrayBuffer[String]]): List[(String, Int)]
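Putting that workaround together: below is a self-contained sketch of the fixed structure. A hypothetical Broadcast stub stands in for org.apache.spark.broadcast.Broadcast, and a plain List stands in for the RDD, so it runs without a cluster; FixedApp and run are illustrative names. In real Spark you would call sc.broadcast(ip) and wordpair.flatMap(x => matchword(x, b)).

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for org.apache.spark.broadcast.Broadcast[T];
// only the .value accessor is modeled here.
class Broadcast[T](val value: T) extends Serializable

object FixedApp {
  // matchword receives the broadcast explicitly instead of reading an
  // object-level var, so there is no null field to hit on the executor.
  def matchword(s: (String, Int), b: Broadcast[ArrayBuffer[String]]): List[(String, Int)] =
    if (b.value.nonEmpty) List(s) else Nil

  def run(): Map[String, Int] = {
    val ip = ArrayBuffer("fsda", "DSFSF")
    val b = new Broadcast(ip)                // in real Spark: sc.broadcast(ip)
    val wordpair = List("hello", "world", "hello").map(w => (w, 1))
    wordpair
      .flatMap(x => matchword(x, b))         // broadcast passed as a parameter
      .groupBy(_._1)                         // local stand-in for reduceByKey
      .map { case (w, xs) => (w, xs.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit =
    println(run()) // counts: hello -> 2, world -> 1
}
```

Defining the broadcast inside main and threading it through as a parameter keeps the closure's only captured object the serializable Broadcast handle itself, which is exactly what Spark knows how to ship.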