Spark Streaming and Spark SQL integration problems

Time:11-22

I am preparing to synchronize data from MySQL (around 20 instances, 150 databases) to Kudu in real time. The planned pipeline is:
Canal -> Kafka -> Spark Streaming + Spark SQL -> Kudu
Currently each MySQL instance maps to one Kafka topic, so within a single batch interval the same topic contains JSON messages from different databases and tables. Each message has to be parsed and then saved through the Spark SQL Kudu sink.
Parsing the JSON strings seems to require nested RDDs, but Spark does not appear to support nested RDDs. Has anyone in the group run into this? How should it be handled?

CodePudding user response:

Hello,
When Spark parses JSON it can infer the schema of the data (though the inferred schema may not match what you expect),
so the JSON strings can be parsed without nested RDDs.
If this solves your problem, please accept the answer; if you still cannot solve it, please send me a direct message.
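
For illustration, here is a minimal PySpark sketch of one way to apply this. It is not taken from the thread, and it rests on assumptions: each Kafka message is assumed to follow Canal's flat-message JSON layout with top-level `database`, `table` and `data` fields, and `write_to_kudu` is a hypothetical callback that you supply. The helper names `table_key`, `row_json_lines` and `process_batch` are invented for this example. The idea is to collect the distinct (database, table) pairs on the driver, then filter the batch once per pair and let `spark.read.json` infer the schema, so no RDD is ever created inside another RDD's transformation.

```python
import json


def table_key(message: str) -> tuple:
    """Return (database, table) for one Canal JSON message.

    Assumes top-level "database" and "table" fields; adjust if your
    messages are shaped differently.
    """
    record = json.loads(message)
    return (record["database"], record["table"])


def row_json_lines(message: str) -> list:
    """Re-serialise each changed row in the "data" array as its own JSON string."""
    record = json.loads(message)
    return [json.dumps(row) for row in (record.get("data") or [])]


def process_batch(spark, rdd, write_to_kudu):
    """Body for DStream.foreachRDD; it runs on the driver, so no RDD is nested.

    spark: an existing SparkSession.
    rdd: RDD of JSON strings for one micro-batch.
    write_to_kudu: hypothetical callback (df, database, table) supplied by you.
    """
    if rdd.isEmpty():
        return
    rdd.cache()
    # Small driver-side list of the (database, table) pairs seen in this batch.
    keys = rdd.map(table_key).distinct().collect()
    for database, table in keys:
        # Bind the key through a default argument so each filter keeps its own pair.
        rows = (rdd.filter(lambda m, k=(database, table): table_key(m) == k)
                   .flatMap(row_json_lines))
        # Spark infers the schema from the JSON strings themselves.
        df = spark.read.json(rows)
        write_to_kudu(df, database, table)
    rdd.unpersist()
```

It could be wired in with something like `stream.foreachRDD(lambda rdd: process_batch(spark, rdd, write_to_kudu))`. Note that this runs one Spark job per table per batch, which may be slow with many tables, and that the inferred column types may not match the Kudu table, so explicit casts may be needed before writing.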