Two questions about big data

Time:10-17

1. Is Hadoop the name of a distributed technology architecture? Or is it a brand name or a specific database product (such as Oracle or MySQL)?
2. What are the names of the databases currently on the market that are built on a big data technology architecture? Is Elasticsearch one of them?

I haven't worked with databases before and these two questions have confused me. If convenient, please help me sort them out. Thank you.

CodePudding user response:

In the narrow sense, Hadoop is a distributed system architecture made up of the HDFS, MapReduce, and YARN components.
In the broad sense, Hadoop is an ecosystem that covers most big data technologies.

HBase counts as a database built on a big data architecture, because it uses HDFS as its underlying storage. Elasticsearch is a distributed search engine. It can also be regarded as a server, but it does not use the big data architecture.
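
As a rough illustration of the MapReduce programming model that Hadoop provides alongside HDFS and YARN, here is a minimal single-process sketch in plain Python. It does not use Hadoop at all; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example, and a real job would run the same three steps spread across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

In a real cluster the input lines would come from files in HDFS and YARN would schedule the map and reduce tasks; the sketch only shows the shape of the computation.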

CodePudding user response:

Hadoop

Quoting the 1st-floor reply from haha xxy:

Thank you very much.
Besides HBase, are there others?

CodePudding user response:

Correction: Elasticsearch, mentioned above, can also be considered a database, not a server.
Besides HBase there are Kudu, Druid, and others. Which one to use depends on your needs. If the data must support batch operations as well as real-time queries, Kudu is the better fit. If you need to query a very large amount of data, use HBase. If you have many filter conditions, you can use HBase secondary indexes. There are many ways to implement secondary indexes: the native MR (MapReduce) approach, 360's, Huawei's hindex, and Alibaba's; I don't know which of these are open source. Common combinations on a Hadoop platform are HBase + ES or HBase + Solr. Druid offers better real-time support than HBase and is said to have reached $per second. Overall, there is no single best option; it depends on your requirements.
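
To show why a secondary index helps when you filter on something other than the row key, here is a small in-memory sketch in Python. It is not the HBase API and not any of the implementations listed above; the class `IndexedTable` and its methods are hypothetical names for illustration. The idea is simply a second lookup structure that maps a column value back to row keys, kept in sync on every write.

```python
from collections import defaultdict


class IndexedTable:
    """Toy key-value table with one secondary index (hypothetical, not HBase)."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                   # row key -> dict of column values
        self.index = defaultdict(set)    # indexed column value -> row keys

    def put(self, row_key, columns):
        """Write a row and keep the secondary index consistent with it."""
        old = self.rows.get(row_key)
        if old is not None and self.indexed_column in old:
            # Drop the stale index entry before overwriting the row.
            self.index[old[self.indexed_column]].discard(row_key)
        self.rows[row_key] = dict(columns)
        if self.indexed_column in columns:
            self.index[columns[self.indexed_column]].add(row_key)

    def find_by_scan(self, value):
        """Without an index: examine every row (a full table scan)."""
        return sorted(
            key for key, cols in self.rows.items()
            if cols.get(self.indexed_column) == value
        )

    def find_by_index(self, value):
        """With the index: jump straight to the matching row keys."""
        return sorted(self.index.get(value, ()))
```

Both lookups return the same row keys; the difference is that the scan touches every row while the index lookup does not. The price is the extra work in `put`, which is the same trade-off real secondary-index schemes have to manage.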
Second, if the goal is data analysis, you can use Hive, but note that Hive is a data warehouse, not a database. Make sure you understand the difference between OLTP and OLAP.
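
The OLTP versus OLAP distinction can be sketched with a small example. The code below uses Python's built-in `sqlite3` module purely as a stand-in so that it runs anywhere; it is not Hive or HBase, and the `orders` table and function names are invented for illustration. The OLTP-style function touches one row by key inside a short transaction, while the OLAP-style function aggregates over the whole table.

```python
import sqlite3


def build_orders():
    """Create a small in-memory table of hypothetical orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (id, region, amount) VALUES (?, ?, ?)",
        [(1, "north", 10.0), (2, "south", 20.0), (3, "north", 5.0)],
    )
    conn.commit()
    return conn


def oltp_update_amount(conn, order_id, new_amount):
    """OLTP style: change a single row by its key in a short transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE orders SET amount = ? WHERE id = ?", (new_amount, order_id)
        )
    row = conn.execute(
        "SELECT amount FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    return row[0]


def olap_total_by_region(conn):
    """OLAP style: read every row and aggregate it for analysis."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())
```

Roughly speaking, a database serving an application spends its time on the first kind of query, and a data warehouse such as Hive is aimed at the second kind over much larger data.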

CodePudding user response:

Quoting the 3rd-floor reply from haha xxy:

Thank you very much

CodePudding user response:

Overall, Hadoop is a big data ecosystem that mainly supports offline data analysis and storage. It includes components such as HDFS, ZooKeeper, Hive, YARN, and HBase. MapReduce, mentioned upstairs, is mainly a framework or computation engine. There are many Hadoop distributions: Apache Hadoop, the commercial CDH distribution, and others; search Baidu for the details. When I was first starting out I often built the native Apache version, and my company recently finished building the commercial version in about a week. I suggest building a standalone (single-machine) setup first, which makes it easier to understand Hadoop clearly. If you have other questions, feel free to leave a message or visit my blog.

CodePudding user response:

Quoting the 5th-floor reply from qingping の:

Thank you!

CodePudding user response:

Apache Hadoop

The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

CodePudding user response:

1. Is Hadoop the name of a distributed technology architecture? Or is it a brand name or a specific database product (such as Oracle or MySQL)?
Answer: Hadoop is a large ecosystem; you can think of it as a complete big data ecosystem. It contains all kinds of frameworks, such as ZooKeeper, HBase, and Hive, and each has its own features and functions. For example, ZooKeeper is responsible for coordination and serves as a registry. HBase is a kind of database responsible for storing large amounts of data (the Oracle and MySQL you mentioned are relational databases, which are also a kind of database). Hive is a tool, and so on.

2. What are the names of the databases currently on the market that are built on a big data technology architecture? Is Elasticsearch one of them?
There are many, each with its own characteristics; you can search for them online. Elasticsearch is better suited to search.
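
One reason Elasticsearch suits search is that it is built on Apache Lucene, whose core data structure is an inverted index: a map from each term to the documents containing it. The sketch below shows that idea in plain Python. It is a toy under stated assumptions, not Elasticsearch's API; the function names are made up, tokenization is a simple whitespace split, and real engines add text analysis, relevance scoring, and sharding on top.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)
```

A lookup only touches the entries for the query terms rather than reading every document, which is why this layout fits full-text search better than scanning rows by key.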

CodePudding user response:

Quoting the 8th-floor reply from pezynd:
Great answer.

CodePudding user response:

1. I think Hadoop is an ecosystem containing multiple frameworks, as said upstairs. There is the Apache version of Hadoop and there is the CDH version. CDH ensures that the versions of the frameworks match each other, so you avoid the errors that can come from choosing versions yourself.
2. Besides ES, there are also HBase, Redis, Kudu, and so on, which are databases as well. ES is currently used mostly in combination with Logstash and Kibana, which together form ELK. You can also read up on NewSQL; I recall hearing about something like Microsoft Azure's NewSQL, but I haven't looked into it in detail.