Big Data Hadoop


This Big Data Hadoop and Spark Developer Training is a comprehensive course, designed by industry experts around current job requirements, that provides in-depth coverage of big data and the Hadoop modules. Students learn how the components of the Hadoop ecosystem, such as Hadoop 2.7, YARN, MapReduce, HDFS, Pig, Impala, HBase, Flume, and Apache Spark, fit into the Big Data processing lifecycle.

The course is delivered as live, instructor-led online training.

System prerequisites:

8 GB RAM, Windows/Linux OS

Duration: 50 hours


Course Content:

1. Introduction to Hadoop
            What is Hadoop
            History of Hadoop
            Comparison of Traditional Large-Scale Systems and Need for Hadoop
            Understanding Hadoop Architecture
            Fundamentals of HDFS (Blocks, NameNode, DataNode, Secondary NameNode)
            Rack Awareness
            Read/Write from HDFS
            HDFS Federation and High Availability
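
As a rough illustration of the HDFS block and replication topics listed above, the sketch below estimates how many blocks a file occupies and how much raw cluster storage it consumes. The 128 MB block size and the replication factor of 3 are the usual Hadoop 2.x defaults and are assumptions here; a real cluster may configure both differently. The function names are hypothetical and are not part of any Hadoop API.

```python
import math

BLOCK_SIZE_MB = 128      # assumption: default dfs.blocksize in Hadoop 2.x
REPLICATION_FACTOR = 3   # assumption: default dfs.replication


def block_count(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Number of HDFS blocks needed to hold a file of the given size."""
    if file_size_mb <= 0:
        return 0
    return math.ceil(file_size_mb / block_size_mb)


def raw_storage_mb(file_size_mb, replication=REPLICATION_FACTOR):
    """Raw storage used once every block is replicated.

    The last block only occupies the bytes it actually holds, so the
    total is the file size times the replication factor.
    """
    return file_size_mb * replication


if __name__ == "__main__":
    size_mb = 1000
    print("blocks:", block_count(size_mb))
    print("raw storage (MB):", raw_storage_mb(size_mb))
```

Under these assumptions, a file slightly larger than one block still needs two blocks, which is one reason HDFS favors a small number of large files over many small ones.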

2. Starting Hadoop
            Setting Up a Cluster and Its Environment
            Understanding Hadoop configuration files
            Hadoop Components: HDFS, MapReduce
            Overview of Hadoop Processes
            Overview of Hadoop Distributed File System
            The building blocks of Hadoop
            Hands-On Exercise: Using HDFS commands

3. MapReduce 2 (YARN)
            YARN Architecture
            ApplicationMaster, NodeManager and ResourceManager
            Execution of a Standard MapReduce Job
            Job Submission and Job Initialization
            Task Assignment and Task Execution
            Progress and Monitoring of the Job
            Failure Handling in YARN
            Task Failure
            Hands-On Exercises
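
To give a feel for the map, shuffle and reduce stages covered in this module, here is a minimal in-memory word count. It is only a sketch of the programming model: it does not use the Hadoop API, it runs in a single process, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data hadoop", "hadoop spark", "big data"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the mappers and reducers run as separate tasks on different nodes, and YARN handles scheduling, monitoring and retrying failed tasks.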

4. Hive, Impala & Pig
            Introduction to Apache Hive
            Architecture of Hive
            Hive data types
            HiveQL (HQL)
            Types of Tables in Hive
            Partitions
            Buckets & Sampling
            Indexes
            Views
            Executing Hive queries from the Linux terminal
            Executing Hive queries from a file
            Creating UDFs in Hive
            Pig Concepts and Hands-On Exercises
            Hands-On Exercise

5. Sqoop
            Introduction to Sqoop and Its Architecture
            Importing Data from RDBMS to HDFS
            Importing Data from RDBMS to Hive
            Exporting Data from Hive to RDBMS
            Hands-On Exercise

6. HBase and NoSQL Databases
            Introduction to HBase
            Exploring the HBase Master & RegionServer
            Exploring ZooKeeper
            CRUD Operations in HBase with Examples
            Hive Integration with HBase
            Hands-On Exercise on HBase

7. Spark & Oozie
            Oozie framework
            Spark framework
            Writing Spark Applications using Scala
            DataFrames and Spark SQL

8. Hadoop Administration Basics

9. Projects:
    Three end-to-end (E2E) projects in different domains are covered as part of this course.
