The Substance of Google's Three Papers

Time:09-24

Reading and analyzing Google's three papers, GFS, MapReduce, and BigTable, gave me a new understanding of big data analysis. Big data analysis is very complex: traditional data analysis approaches alone cannot analyze big data comprehensively and reasonably, so we need more advanced data structures and algorithms.
The idea shared by the three papers is to split big data into pieces, store and process each piece separately, and then collect and summarize the results. A problem that a computer cannot handle in a single pass is turned into many simple problems, and computers are very efficient at simple, mechanically repetitive work. The structures in Google's three papers look simple, but they greatly improve the efficiency of data operations and make fuller use of what computers are good at.
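The split, process, then summarize pattern described above is what the MapReduce model expresses. Below is a minimal single-process sketch in Python, using word counting as the example. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own and do not come from Google's implementation; a real system would run the map and reduce steps on many machines.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    pairs = []
    for chunk in chunks:  # each chunk could be handled by a different machine
        pairs.extend(map_phase(chunk))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input: one large text already split into two chunks.
chunks = ["big data is big", "data is split into chunks"]
counts = word_count(chunks)
```

Each call to `map_phase` only looks at its own chunk, which is why the work can be spread out; only `reduce_phase` needs to see all the values for a given word.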
The GFS storage system consists of a master and chunkservers. The master manages the metadata that describes how a file is split into chunks and how those chunks fit back together, which makes moving data to and from the chunkservers much more efficient. The most complex part of the design is the management done by the master, so GFS keeps a backup that can take over when the master fails and keep that management running normally. The chunks themselves are stored as files in the Linux file system.
MapReduce processes the segmented data stored in GFS. It consists of Map and Reduce. Map plays a role similar to the master's mapping in GFS: it is applied to each piece of the data independently, and Reduce then brings the intermediate results back together into the final answer. This is where storing data in GFS pays off, because many CPUs can process many pieces of data at the same time. Beyond that, optimizing the hardware (such as the CPUs) and the data processing system lets the processing be improved from several angles.
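To make the master and chunkserver roles concrete, here is a toy, in-memory sketch in Python. It is not GFS code: the class and function names are hypothetical, the chunk size is 8 bytes instead of the 64 MB used in the paper, and replica placement is simple round-robin. The point it tries to show is that the master holds only metadata, while the data itself lives on the chunkservers.

```python
CHUNK_SIZE = 8  # toy value for illustration; the GFS paper uses 64 MB chunks
REPLICAS = 2    # toy value; how many chunkservers hold each chunk


class ChunkServer:
    """Stores chunk data, keyed by chunk handle."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}  # chunk handle -> bytes

    def store(self, handle, data):
        self.chunks[handle] = data

    def fetch(self, handle):
        return self.chunks[handle]


class Master:
    """Holds metadata only: which chunks form a file and where they live."""

    def __init__(self, servers):
        self.servers = servers
        self.files = {}      # filename -> ordered list of chunk handles
        self.locations = {}  # chunk handle -> list of ChunkServer replicas
        self.next_handle = 0

    def allocate_chunk(self, filename):
        handle = self.next_handle
        self.next_handle += 1
        # Round-robin placement (an assumption made for this sketch).
        replicas = [self.servers[(handle + i) % len(self.servers)]
                    for i in range(REPLICAS)]
        self.files.setdefault(filename, []).append(handle)
        self.locations[handle] = replicas
        return handle, replicas

    def lookup(self, filename):
        return [(h, self.locations[h]) for h in self.files[filename]]


def write_file(master, filename, data):
    """Client side: ask the master where to put each chunk, then send the data."""
    for offset in range(0, len(data), CHUNK_SIZE):
        handle, replicas = master.allocate_chunk(filename)
        for server in replicas:
            server.store(handle, data[offset:offset + CHUNK_SIZE])


def read_file(master, filename):
    """Client side: get chunk locations from the master, read from one replica."""
    return b"".join(replicas[0].fetch(handle)
                    for handle, replicas in master.lookup(filename))


servers = [ChunkServer(f"cs{i}") for i in range(3)]
master = Master(servers)
write_file(master, "/logs/a", b"hello distributed world")
restored = read_file(master, "/logs/a")
```

Because every chunk is held by more than one chunkserver in this sketch, losing a single server would not lose the file; the master's own metadata is the part that needs a backup.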
BigTable further decomposes and refines the ideas in GFS and MapReduce. It uses GFS as its storage layer and applies the same decomposition approach to break the data into even smaller units, which makes the data simpler to process and easier for computers to handle.
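As a rough illustration of the smaller units BigTable works with, the paper describes its data model as a sparse, sorted map indexed by row key, column key, and timestamp. The following Python sketch is an in-memory approximation under that description only; `ToyTable` and its methods are hypothetical names, not BigTable's API.

```python
import bisect


class ToyTable:
    """In-memory sketch of a map from (row, column, timestamp) to value."""

    def __init__(self):
        self.cells = {}     # (row, column) -> list of (timestamp, value), newest first
        self.row_keys = []  # row keys kept in sorted order

    def put(self, row, column, timestamp, value):
        i = bisect.bisect_left(self.row_keys, row)
        if i == len(self.row_keys) or self.row_keys[i] != row:
            self.row_keys.insert(i, row)
        versions = self.cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda v: v[0], reverse=True)  # newest version first

    def get(self, row, column):
        """Return the most recent value of a cell, or None if it is empty."""
        versions = self.cells.get((row, column))
        return versions[0][1] if versions else None

    def scan(self, start_row, end_row):
        """Return the row keys in the half-open range [start_row, end_row)."""
        lo = bisect.bisect_left(self.row_keys, start_row)
        hi = bisect.bisect_left(self.row_keys, end_row)
        return self.row_keys[lo:hi]


table = ToyTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.example", "contents:", 1, "<html>ex")
table.put("org.apache", "contents:", 1, "<html>ap")

latest = table.get("com.cnn.www", "contents:")
com_rows = table.scan("com.", "d")
```

Keeping the rows sorted is what makes range scans cheap, and it is also what allows a large table to be cut into contiguous row ranges that different servers can handle separately.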
Through these three designs, data is divided at several levels of scale. At each level it is split and recombined by the master and by Map, progressing and summarizing step by step. Many mechanical operations can therefore run at the same time, and because computers perform mechanical operations quickly and accurately, this makes the most of the computer's strengths and turns a complicated big data analysis into simple problems that a computer can process.
Google is also researching and developing other approaches to data processing, and computer programs continue to be refined. Following Google's ideas, other projects have emerged as well, such as Apache Drill and Apache Giraph. Google's big data work has had a great impact on the world in recent years, and I believe big data will be a field with a huge impact on the world in the future.