Spark Streaming reading data from Kafka

Time:09-23

val con = "10.20.30.91:2181"
val topics = "topic1"
val group = "group1"
val numThreads = "6"
val ssc = new StreamingContext(sc, Seconds(2))
val sqc = new SQLContext(sc)
val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
val lines = KafkaUtils.createStream(ssc, con, group, topicMap).map(_._2)
val showLines = lines.window(Minutes(60))
showLines.foreachRDD(rdd => {
  val t = sqc.jsonRDD(rdd)
  t.registerTempTable("kafka_test")
})
ssc.start()
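The topic-to-thread-count map built above is plain Scala and can be checked in isolation (the two-topic list below is illustrative, not from the original post):

```scala
// Build the topic -> thread-count map the same way the snippet above does.
val topics = "topic1,topic2"   // comma-separated topic list (second topic added for illustration)
val numThreads = "6"           // thread count kept as a string, then converted
val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
println(topicMap)              // Map(topic1 -> 6, topic2 -> 6)
```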


This is the Spark Streaming application I wrote to read data from Kafka, but when the data volume is large it gets shut down. Real-time reading already works; how can I make the consumption concurrent? Thank you.

CodePudding user response:

KafkaUtils.createDirectStream
With the ordinary createStream method, a single thread on one executor acts as the receiver and consumes data from Kafka single-threaded. With DirectStream, every thread on every executor actively fetches data from Kafka itself. However, with the former, Spark maintains the consumer offsets for you, while with the latter you have to maintain them yourself. There is plenty of DirectStream documentation online; go have a look.
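To illustrate what "maintaining the offsets yourself" means with DirectStream, here is a plain-Scala sketch of the bookkeeping (no Spark or Kafka dependency; all names here are illustrative, not real API):

```scala
// After processing each batch you record the last consumed offset per
// partition, and the next fetch starts from there. This models that logic.
case class OffsetRange(partition: Int, fromOffset: Long, untilOffset: Long)

// Merge the offset ranges of a processed batch into the saved offset map.
def commitBatch(saved: Map[Int, Long], batch: Seq[OffsetRange]): Map[Int, Long] =
  batch.foldLeft(saved) { (acc, r) => acc.updated(r.partition, r.untilOffset) }

val afterBatch1 = commitBatch(Map.empty,
  Seq(OffsetRange(0, 0L, 100L), OffsetRange(1, 0L, 80L)))
val afterBatch2 = commitBatch(afterBatch1,
  Seq(OffsetRange(0, 100L, 250L)))    // only partition 0 advanced this batch
println(afterBatch2)                  // Map(0 -> 250, 1 -> 80)
```

In a real job you would persist this map (e.g. to ZooKeeper or a database) inside foreachRDD, so a restarted job can resume from the saved offsets.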

CodePudding user response:

Received -1 when reading from channel, socket has likely been closed.

I tried that method and got the error above, and this method has no consumer group:
val ssc = new StreamingContext(sc, Seconds(1))
val topicsSet = Set("topic1")
val brokers = "10.20.30.91:2181"
val kafkaParams = Map[String, String]("metadata.broker.list" -> brokers, "serializer.class" -> "kafka.serializer.StringEncoder")
val kafkaStream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet)
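A likely cause of the "Received -1 ... socket has likely been closed" error: metadata.broker.list above points at port 2181, which is ZooKeeper's port. createDirectStream bypasses ZooKeeper and connects to the Kafka brokers directly, so it needs the broker address (9092 is Kafka's default port; the exact host:port for this cluster is an assumption here). A minimal sketch of corrected parameters:

```scala
// metadata.broker.list must be the Kafka broker list (default port 9092),
// not the ZooKeeper address on 2181. Host and port below are illustrative.
val brokers = "10.20.30.91:9092"
val kafkaParams = Map[String, String](
  "metadata.broker.list" -> brokers,
  "serializer.class"     -> "kafka.serializer.StringEncoder"
)
// sanity check: make sure the ZooKeeper port did not slip in
assert(!kafkaParams("metadata.broker.list").contains(":2181"))
println(kafkaParams("metadata.broker.list")) // 10.20.30.91:9092
```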

CodePudding user response:

I'm sorry, my own ability is limited, so I can't help you.



