BIG DATA MASTERY CERTIFICATION GADET149175
Please contact us at contact@gadet.world-evergreen.online to receive the materials and the virtual machine needed to prepare for this certification.
Overall knowledge required to pass this certification:
+ Understand Apache Hadoop architecture and data management
+ Master data analysis with Hive and Impala
+ Ability to use Flume, Kafka and Sqoop
+ Ability to use HBase and Kudu
+ Understand Apache Spark architecture and data management
+ Use basic Apache Spark functionality with Python:
– Extract, Transform, Load (ETL) with PySpark
– Spark SQL
– Scalable Data Science
– Machine learning (basic notions) with MLlib and ML
Detailed preparation plan:
+ Hadoop Architecture and MapReduce
+ Running SQL Statements using Hive Shell, Beeline and Impala Shell
+ Using Beeline and Impala Shell in Non-Interactive Mode
+ Using Hive and Impala in Scripts and Applications
+ Browsing Tables in the Metastore
+ Browsing Files in HDFS
+ Creating Databases and Tables
+ Managing Existing Tables
+ Apache Hive and Apache Impala Interoperability
+ Data and File Types
+ Loading Files into HDFS
+ Using Sqoop to Load Data from Relational Databases
+ Using Hive and Impala to Load Data into Tables
- Machine Learning with RDDs (MLlib with PySpark)
- Machine Learning with DataFrames (ML with PySpark)
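
To give a feel for the "Hadoop Architecture and MapReduce" topic in the plan above, here is a minimal sketch of the map, shuffle and reduce steps of a word count, written in plain Python. It does not use Hadoop or Spark, and it is not taken from the certification materials. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, mimicking what the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence, in a single process (illustration only)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data with Hadoop", "big data with Spark"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run in parallel on different nodes and the shuffle moves data over the network. The sketch only shows how the data flows between the three steps.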