Spark from Scratch: Entry Page for the Introductory Online Video Training Series

Spark is the most popular open source in-memory computing framework for big data. It is implemented in Scala, was developed at UC Berkeley's AMPLab in 2009, was open-sourced in 2010, and became a top-level Apache Software Foundation project in 2014. Spark grew rapidly during 2014 and 2015. According to the Databricks 2015 Spark report [1], more than 600 people contributed to the Spark source code between September 2014 and September 2015, compared with only 315 in the preceding 12 months. Spark has overtaken Hadoop and is undisputedly one of the most active open source projects in the big data field. In addition, more than 200 companies have contributed to the Spark source code, which makes Spark the open source community with the most participating developers so far.

The goal of this series is to introduce Spark to IT staff who work in big data processing. The main topics are core Scala syntax, setting up a Spark cluster and development environment, the Spark programming model, Spark SQL, Spark Streaming, Spark MLlib, and Spark GraphX. After completing the series you should be able to develop Spark applications in Scala, understand Spark's basic operating principles and programming model, be familiar with using Spark SQL for data warehouse development, and grasp the principles of Spark stream processing, machine learning, and graph computation.

Trainer: Zhou Zhihu, database engineer/administrator at the Greentown Group data center.

About the trainer: holds a master's degree in computer software and theory from the University of Electronic Science and Technology, with research interests in computer vision and machine learning. After graduation, worked at Ningbo Bank and then at the Party School of the CPC Zhejiang Provincial Committee. Currently employed at Greentown Group as data center platform architect and data development director. An enthusiast of Scala, Hadoop, and Spark big data processing technology.

Course outline (completed sections are shown in purple; each entry will later link to its installment; expected release date: 12.14)

Scala part

1. Scala basic data types and program control structures: covers Scala's basic data types and its control structures, with a detailed look at the for loop.
2. Scala collection operations: operations on Array, List, Map, and other important data structures.
3. Scala functions: function literals, closures, higher-order functions, partial functions, and the common higher-order functions on collections.
4. Scala classes and objects: object-oriented programming in Scala, including class definitions, companion objects, companion classes, application objects, and class inheritance.
5. Scala pattern matching: what pattern matching is for, the kinds of pattern matching, pattern matching in for loops and with regular expressions, and case classes.
6. Scala type parameters: type variance in Scala, including covariance and contravariance.
7. Scala implicit conversions: implicit conversion functions, implicit classes, implicit objects, view bounds, and context bounds.
8. Scala advanced types: singleton types, abstract types, and so on.
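
To give a flavor of several topics from the Scala outline above, here is a minimal sketch that touches collections, higher-order functions, closures, case classes with pattern matching, and an implicit class. All names in it (Shape, Circle, Rect, OutlineSketch, and its methods) are hypothetical and chosen only for illustration; they are not taken from the course material.

```scala
// Hypothetical example types; not from the course material.
sealed trait Shape
case class Circle(radius: Double) extends Shape
case class Rect(width: Double, height: Double) extends Shape

object OutlineSketch {
  // Pattern matching on case classes (outline item 5).
  def area(shape: Shape): Double = shape match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }

  // Higher-order function `map` applied to a List (outline items 2 and 3).
  def totalArea(shapes: List[Shape]): Double = shapes.map(area).sum

  // A closure: the returned function captures `factor` (outline item 3).
  def scaleBy(factor: Double): Double => Double = x => x * factor

  // An implicit class adds a `squared` method to Int (outline item 7).
  implicit class IntOps(n: Int) {
    def squared: Int = n * n
  }
}
```

After `import OutlineSketch._`, a call such as `totalArea(List(Rect(1.0, 1.0), Rect(2.0, 2.0)))` sums the areas of both rectangles, and `3.squared` resolves through the implicit class.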

Spark part

1. Spark cluster deployment and development environment: setting up a Hadoop cluster, setting up a Spark cluster, building a Spark development environment in IntelliJ IDEA, using the Spark shell, and so on.
2. Spark operating principles: analysis of Spark's script files, Spark's different run modes, RDD principles, narrow and wide dependencies, Spark task scheduling, and so on.
3. Spark programming model: introduces the Spark programming model and the commonly used transformation and action operations.
4. Spark DataFrame and SQL: how Spark SQL and DataFrames work and how to use them, with use cases that demonstrate Spark SQL.
5. Spark stream processing: introduces DStream and the principles of Spark Streaming, illustrated with several cases.
6. Spark machine learning: introduces the Spark MLlib architecture and demonstrates its use with the K-Means and random forest algorithms.
7. Spark GraphX: Spark graph computation and the related data structures, demonstrated with the PageRank algorithm.

Advanced courses: TBD...

CodePudding user response:

I hope you keep updating this, and of course it would be best not to charge for it.

The Internet is about open source and sharing.

CodePudding user response:

Giving you a like.

CodePudding user response:

Is this dead? Why still no update?

CodePudding user response:

Sorry, this has not been updated for a while. You can go directly to the unified sign-up page: http://edu.csdn.net/huiyiCourse/detail/88

CodePudding user response:

Study hard ~ ~ ~ ~ ~

CodePudding user response:

66666666666666

CodePudding user response:

Nice, going to study this.

CodePudding user response:

Thank you for sharing. This is a great resource, and I want to study it hard. Thanks.