Due to a business requirement, traffic has to be computed at fixed points in time: one traffic figure every 5 minutes (for example 00:00-00:05, 00:05-00:10, and so on), which gives 288 points per day.
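To make the requirement concrete, this is the bucketing I mean (only a minimal sketch; millisecond event timestamps are my own assumption here):

```scala
// Floor an event timestamp (in milliseconds) to the start of its 5-minute bucket.
// 24 * 60 / 5 = 288 buckets per day: 00:00-00:05, 00:05-00:10, ...
val bucketMillis = 5 * 60 * 1000L
def bucketStart(eventTimeMillis: Long): Long =
  eventTimeMillis - (eventTimeMillis % bucketMillis)
```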
How can I implement this with Spark? I have a preliminary idea (below), but I don't know whether there is a better way.
Preliminary idea: in Structured Streaming, use a (mutable) variable holding the current window boundary to filter out the data belonging to the current window; write the remaining data to a database, and in the next window read that "historical" data back from the database and combine it with the real-time stream:
val realdata = ds1.filter($"event_time" < times)
val others   = ds1.filter($"event_time" >= times)
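Roughly, the whole flow I have in mind looks like this (only a sketch, not working code; `saveOthersToDb`, `loadHistoryFromDb` and `computeTraffic` are placeholders I would still have to implement, and `times` is the mutable boundary variable mentioned above):

```scala
import org.apache.spark.sql.{Dataset, Row, SparkSession}

// Placeholders, not real APIs: persistence and the actual traffic aggregation.
def saveOthersToDb(ds: Dataset[Row]): Unit = ???
def loadHistoryFromDb(upTo: String): Dataset[Row] = ???
def computeTraffic(ds: Dataset[Row]): Unit = ???

def buildQuery(spark: SparkSession, ds1: Dataset[Row], times: String) = {
  import spark.implicits._
  ds1.writeStream
    .foreachBatch { (batch: Dataset[Row], batchId: Long) =>
      val realdata = batch.filter($"event_time" < times)  // belongs to the current 5-minute window
      val others   = batch.filter($"event_time" >= times) // belongs to later windows, park it in the DB
      saveOthersToDb(others)
      // combine this window's live data with what earlier batches parked in the DB
      computeTraffic(realdata.union(loadHistoryFromDb(times)))
    }
    .start()
}
```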
But this runs into the problem that the boundary time has to change dynamically, so I put the code that advances the time into the Structured Streaming query listener:
override def onQueryProgress(queryProgress: QueryProgressEvent): Unit = {
  times = df.format(df.parse(times).getTime + 300000)
  // read the history from the database and add it to realdata
}
But this approach has a repetition problem: onQueryProgress is called many times within a single batch, so this code runs repeatedly.
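For context, this is roughly how the listener is wired up (a sketch; the SimpleDateFormat pattern and the initial value of `times` are assumptions, the rest mirrors the fragment above):

```scala
import java.text.SimpleDateFormat
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Listener that advances the 5-minute boundary; the date pattern and
// the initial value below are made up for the sketch.
class WindowAdvanceListener extends StreamingQueryListener {
  private val df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
  @volatile var times: String = "2024-01-01 00:05:00" // end of the current window

  override def onQueryStarted(event: QueryStartedEvent): Unit = {}
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {}

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    // advance the boundary by 5 minutes (300000 ms)
    times = df.format(df.parse(times).getTime + 300000)
    // read the history from the database and merge it into realdata
    // -> this is the part that ends up running more than once per batch
  }
}

// registered with: spark.streams.addListener(new WindowAdvanceListener)
```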