A non-computer-science undergraduate discusses Google's big three papers-CodePudding

A brief summary and analysis of Google's three papers:
First, some background. In the era of big data, data volumes have grown explosively, and we run into trouble if we keep designing computer systems around the older assumptions: 1. component failure is a rare accident; 2. data is managed, by traditional standards, as hundreds of millions of small KB-sized files; 3. most files are modified by overwriting existing data. To solve these problems, Google built its three main systems, which later paved the way for Hadoop: Google FS (GFS), MapReduce, and BigTable. Reading the three papers makes a few points clear. Google FS is the core of the whole system, and its design assumptions are:
1. The system is built from many cheap commodity components, and component failure is the norm. So the system needs a leader that continuously monitors the state of the components in order to quickly detect, tolerate, and recover from failures. If the leader itself fails, the machines elect another leader to keep monitoring the others.
2. The system stores a modest number of large files. We expect a few million files, each usually 100 MB or larger, so the system must manage files of that size efficiently. It must also support small files, but it does not need to optimize for them: in the era of big data nobody looks at every piece of data, so the system only needs to manage large volumes of data well.
3. The workload consists mainly of two kinds of reads: large streaming reads and small random reads. A large streaming read typically reads hundreds of KB in one operation, and more commonly 1 MB or more. Successive operations from the same client usually read through a contiguous region of the same file. A small random read typically reads a few KB at an arbitrary offset in the file. (To be honest, I don't study computer science, I didn't understand this part at all, and I didn't know what to make up, so I simply copied it from the Google paper.)
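The two read patterns in point 3 can be sketched in a few lines of Python. This is only an illustration on an in-memory buffer, not GFS code: the function names `streaming_read` and `random_read`, the 1 MB chunk size, and the 4 KB read length are assumptions chosen to match the sizes quoted above.

```python
import io
import random

KB = 1024
MB = 1024 * KB


def streaming_read(f, chunk_size=1 * MB):
    """Read a file front to back in large sequential chunks.

    Returns the total number of bytes read.
    """
    total = 0
    while True:
        chunk = f.read(chunk_size)
        if not chunk:  # empty bytes means end of file
            break
        total += len(chunk)
    return total


def random_read(f, file_size, length=4 * KB, rng=random):
    """Seek to an arbitrary offset and read a few KB from there.

    Returns (offset, data). Assumes file_size >= length.
    """
    offset = rng.randrange(0, file_size - length + 1)
    f.seek(offset)
    return offset, f.read(length)


if __name__ == "__main__":
    size = 8 * MB
    f = io.BytesIO(bytes(size))  # in-memory stand-in for one large file

    print("streamed bytes:", streaming_read(f))

    offset, data = random_read(f, size)
    print("random read at offset", offset, "returned", len(data), "bytes")
```

The streaming read touches every byte once in order, which is the access pattern a system built for large files can serve cheaply. The random read jumps to one spot and takes only a small slice.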
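Next is MapReduce, which introduced a new way of computing over big data: "distributed computing". I still can't follow it from a pure computer-theory standpoint, so I'll think about it from my own major, finance, and consider the changes blockchain brings to finance. The most widespread application of blockchain today is Bitcoin. Bitcoin is a virtual currency, but it differs greatly from other virtual currencies (such as Q coins): 1. the total supply of Bitcoin is fixed at 21 million; 2. Bitcoin uses the blockchain to solve distributed bookkeeping and verification (that is, the blockchain solves the trust problem in a decentralized way). This advantage of blockchain, decentralization, could therefore upend credit, one of the two cornerstones of finance (time and credit): with decentralization, the information asymmetry that many financial intermediaries depend on for survival would be removed. But the road is long, and this ideal state will surely never be reached. Even if decentralization solved the information asymmetry problem, large institutions could still use big data techniques to extract better data from massive datasets than we ordinary people can. Society is made of people, and people use their own initiative to keep widening the gap between themselves and others rather than sitting and waiting for new technology to arrive. (To be honest, as I kept writing I realized that the distributed computing I learned about elsewhere and the distributed computing in the Google papers don't seem to be the same thing. In Google's sense, distributed computing means splitting data up and backing it up in locations around the world, then retrieving it when a computer needs it. But I really don't know how to write about it, and the teacher said to study it in depth using knowledge of the financial industry, so I went off topic to pad the word count...)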
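For anyone who wants a concrete picture of MapReduce itself, here is a minimal single-process sketch of word counting, the example the MapReduce paper uses. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this illustration. A real MapReduce system runs the map and reduce steps as tasks on many machines, which this sketch does not attempt.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence in one process."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(word_count(["the cat sat", "the dog sat down"]))
```

Each document can be mapped independently and each word can be reduced independently, which is why the work can be spread across many machines.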
And finally BigTable, which I don't understand either... It is a distributed storage system for structured data, designed to handle huge amounts of data, typically petabytes spread across thousands of commodity servers. Bigtable has achieved the following goals: wide applicability, scalability, high performance, and high availability... (I don't understand it at all, so I just copied from the paper.)
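To make "structured data storage" a little more concrete: the Bigtable paper describes its data model as a sorted map indexed by row key, column key, and timestamp. Below is a toy in-memory sketch of that idea only. The class name `TinyTable` and its methods are hypothetical, and nothing here is distributed or persistent.

```python
class TinyTable:
    """Toy map from (row, column, timestamp) to value, kept in one process."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = {}

    def put(self, row, column, timestamp, value):
        """Store one version of a cell."""
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp`.

        With no timestamp, return the newest value overall.
        Returns None if no matching version exists.
        """
        versions = self._cells.get((row, column), {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def scan_rows(self, start, end):
        """Return the row keys in [start, end) in sorted order."""
        return sorted({row for row, _ in self._cells if start <= row < end})


if __name__ == "__main__":
    table = TinyTable()
    table.put("com.cnn.www", "contents:", 1, "<html>v1</html>")
    table.put("com.cnn.www", "contents:", 2, "<html>v2</html>")
    table.put("com.example", "contents:", 1, "<html>hello</html>")

    print(table.get("com.cnn.www", "contents:"))               # newest version
    print(table.get("com.cnn.www", "contents:", timestamp=1))  # older version
    print(table.scan_rows("com.a", "com.d"))                   # rows in a key range
```

Keeping several timestamped versions per cell and reading rows by key range are the two ideas the sketch is meant to show.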

CodePudding user response:

Learning from this; I hadn't come across it before.

CodePudding user response:

Thanks for sharing

CodePudding user response:

OP, I'm an undergraduate accounting major, but I'm switching to computer science for the postgraduate entrance exam.