Log parsing

Time:09-28

Business scenario
There is a set of log data that records collected information. Based on the contents of the logs, the data needs to be parsed and processed transactionally. The data volume is very large.

Data format
JSON

Data source
Location: Kafka

Sample data

Example 1: {"Url": "item.jd.com/11381983.html", "EndDate": "2018-04-25T13:46:50.345631+08:00", "FieldValueDic": {"IsDeleted": "False", "AF1": "9787543699762", "GoodReviewRate": "0.929", "UPC": "NULL"}}

Example 2: {"Url": "item.jd.com/11381983.html", "EndDate": "2018-04-25T20:46:50.565631+08:00", "FieldValueDic": {"Title": "in accordance with the law governing the youth reader", "AF1": "66666", "Price": "1"}}



The two records differ as follows:
1. The first record captures four fields: IsDeleted, AF1, GoodReviewRate, and UPC.
2. The second record captures three fields: Title, AF1 (whose value has been updated), and Price.


Processing goal
For each URL, the content of every field should be kept up to date in real time.


Target result: {"Url": "item.jd.com/11381983.html", "IsDeleted": "False", "AF1": "66666", "GoodReviewRate": "0.929", "UPC": "NULL", "Title": "in accordance with the law governing the youth reader", "Price": "1"}
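
To make the goal concrete, here is a minimal Python sketch of the merge itself, under a few assumptions that are not stated in the post: each Kafka message is one JSON record like the examples above, the newest EndDate should win for each field, and a plain in-memory dict stands in for whatever store is eventually chosen. The names merge_record and snapshot are made up for illustration.

```python
import json
from datetime import datetime

def merge_record(state, raw):
    """Fold one JSON log record into the per-URL state.

    state maps url -> {field name: (timestamp, value)}.
    Assumption: for each field, the value with the newest EndDate wins,
    so records that arrive out of order cannot overwrite newer data.
    """
    record = json.loads(raw)
    url = record["Url"]
    ts = datetime.fromisoformat(record["EndDate"])
    fields = state.setdefault(url, {})
    for name, value in record["FieldValueDic"].items():
        current = fields.get(name)
        if current is None or ts >= current[0]:
            fields[name] = (ts, value)
    return state

def snapshot(state, url):
    """Return the flattened, up-to-date view for one URL."""
    merged = {"Url": url}
    merged.update({name: value for name, (_, value) in state[url].items()})
    return merged

# The two sample records from the post.
lines = [
    '{"Url": "item.jd.com/11381983.html", '
    '"EndDate": "2018-04-25T13:46:50.345631+08:00", '
    '"FieldValueDic": {"IsDeleted": "False", "AF1": "9787543699762", '
    '"GoodReviewRate": "0.929", "UPC": "NULL"}}',
    '{"Url": "item.jd.com/11381983.html", '
    '"EndDate": "2018-04-25T20:46:50.565631+08:00", '
    '"FieldValueDic": {"Title": "in accordance with the law governing the youth reader", '
    '"AF1": "66666", "Price": "1"}}',
]

state = {}
for line in lines:
    merge_record(state, line)

result = snapshot(state, "item.jd.com/11381983.html")
print(json.dumps(result, ensure_ascii=False))
```

Keeping a timestamp next to each field is one possible design choice. If records for a given URL are guaranteed to arrive in order, a simple dict update per record would be enough.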


1. How should the processing pipeline (data parsing, transaction processing) be implemented?
2. How should the data be stored, given the very large volume?
