Hadoop Administration in
collaboration with IBM

Apply Now
Industry Experts

Universally Recognized Certificates

From IBM and DataTrained

Capstone and Real Life Projects

Access to 15 real life projects and a capstone project

Analytics Jobs Placement Assistance

Access to analyticsjobs.in curated jobs

Access to in-demand Tools

IBM Watson labs and $1200 equivalent Cloud Credits

₹9,999 ($175)

for self-paced

₹14,999 ($250)

for self-paced and live sessions blended mode

About the Program

  • Introduction & Moving Data into Hadoop

  • Controlling Hadoop Jobs Using Oozie

  • Developing Distributed Applications Using ZooKeeper

  • Solr

  • Complete Hadoop Administration Learning path

Download Syllabus

Prerequisite

This Course is for those who are interested in becoming familiar with the concept of big data.

IBM is an American multinational information technology company headquartered in Armonk, New York, with operations in over 170 countries. IBM is one of the world's largest employers, with over 350,000 employees, known as "IBMers". At least 70% of IBMers are based outside the United States, and the country with the largest number of IBMers is India. IBM employees have been awarded five Nobel Prizes, six Turing Awards, ten National Medals of Technology (USA) and five National Medals of Science (USA).

This collaboration between IBM and DataTrained provides our students hands-on experience in predictive analytics and advanced computing.

Expectations for this program co-developed with IBM:

1. Industry-recognized certificate from IBM and DataTrained.
2. IBM Cloud Credits for 6 months equivalent to $1200.
3. IBM Cloud Platforms access like IBM Watson for hands-on practice.
This learning path is designed to keep you ahead of the game, providing the skills you need to interact with, manipulate, and troubleshoot big data systems, all the while keeping your cool.
This path introduces the Big Data concept and its possibilities, then looks at using familiar spreadsheet environments to start asking questions of your data, and then moves on to introducing tool sets, specifically Apache Spark, for effective and timely processing to get you the answers you need.

Programming Languages and Tools Covered

Instructors

Learn from India’s leading Software Engineering faculty and Industry leaders

Learning Path

Course 1: Moving Data into Hadoop

. Learn about the different options for importing or loading data into HDFS from common data sources such as relational databases, data warehouses, web server logs, etc. (Apache Sqoop and Flume are covered in greater detail.)

. Learn how to import/export data in and out of Hadoop from sources like databases.

. Learn how to move data with Data Click for IBM BigInsights.


Module 1 - Load Scenarios
· Understand how to load data at rest and data in motion
· Understand how to load data from common data sources e.g. RDBMS

Module 2 - Using Sqoop
· Import data from a relational database table into HDFS
· Use Sqoop import and export command
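To give a feel for the import and export commands covered here, a rough sketch follows; the JDBC connection string, table names, and HDFS paths are hypothetical placeholders, and the exact flags available depend on your Sqoop and database versions:

```shell
# Import a relational table into HDFS (placeholder host, database, and table names)
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username sqoop_user -P \
  --table orders \
  --target-dir /user/hadoop/orders \
  --num-mappers 4

# Export processed results from HDFS back into a database table
sqoop export \
  --connect jdbc:mysql://db.example.com/sales \
  --username sqoop_user -P \
  --table order_summary \
  --export-dir /user/hadoop/order_summary
```

Both commands assume a running Hadoop cluster and a reachable database; `-P` prompts for the password rather than placing it on the command line.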

Module 3 - Flume Overview
· Describe Flume and its uses
· Describe how Flume works

Module 4 - Using Flume
· List the Flume configuration components
· Describe how to start and configure a Flume agent
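A Flume agent is configured by naming and wiring together the components this module lists: a source, a channel, and a sink. A minimal illustrative configuration, with made-up agent name, port, and HDFS path:

```properties
# Agent named 'a1' with one source, one channel, and one sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for events on a netcat port (illustrative host/port)
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffer events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# Sink: write events to HDFS (illustrative path)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d

# Wire the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

The agent would then be started with something like `flume-ng agent --conf conf --conf-file a1.conf --name a1`.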

Module 5 - Using Data Click
· Describe Data Click for BigInsights
· List the major components of Data Click

Course 2: Controlling Hadoop Jobs Using Oozie

. See the components required to code a workflow as well as optional components such as case statements, forks, and joins.

. Learn how to use the Oozie coordinator to schedule a workflow. You will quickly notice that workflows are coded in XML, which tends to get verbose.

. Learn about a graphical workflow editor tool designed to simplify the work in generating a workflow.


Module 1 - Introduction to Oozie Workflows
. Explain the use for Oozie workflows
. Describe a workflow
. List some of the workflow elements
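As noted above, workflows are coded in XML. A minimal, hypothetical workflow with a single action node might look like the following; the workflow name, action type, and paths are placeholders:

```xml
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="first-action"/>
  <action name="first-action">
    <fs>
      <!-- Illustrative filesystem action: create an output directory -->
      <mkdir path="${nameNode}/user/demo/output"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Workflow failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Real workflows add case statements (`decision`), `fork`, and `join` elements between `start` and `end`.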

Module 2 - Oozie Coordinator
. Explain the use for the Oozie coordinator
. List some of the coordinator elements
. Describe how to submit a workflow job and a coordinator job
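A coordinator wraps a workflow with a schedule. A sketch of a daily coordinator, with placeholder names, dates, and paths:

```xml
<coordinator-app name="daily-coord" frequency="${coord:days(1)}"
                 start="2024-01-01T00:00Z" end="2024-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <!-- HDFS directory containing the workflow.xml to run each day -->
      <app-path>${nameNode}/user/demo/apps/demo-wf</app-path>
    </workflow>
  </action>
</coordinator-app>
```

The coordinator job is submitted to the Oozie server, which then materializes one workflow run per frequency interval.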

Module 3 - BigInsights Workflow Editor
. Explain how to publish an application
. Describe how to define a recurring schedule for an application
. Explain how to link multiple applications to form a new application

Course 3: Developing Distributed Applications Using ZooKeeper

Building distributed applications comes with challenges intrinsic to distributed systems themselves, including maintaining configuration information, group membership, naming, and synchronization.

Module 1 - Introduction to ZooKeeper

. Describe distributed systems and the purpose of ZooKeeper
. Describe the ZooKeeper consistency guarantees
. Describe the basics of ZooKeeper components
. Describe the application of ZooKeeper in the Hadoop ecosystem and its usage in other real-world scenarios

Module 2 - The ZooKeeper Data Model

. Understand ZooKeeper Components in detail
. Use ZooKeeper CLI to run commands and interact with ZooKeeper service
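A short zkCli session illustrates the hierarchical znode model; this assumes a ZooKeeper server is running locally, and the znode path and data are made up:

```shell
# Connect to a local ZooKeeper server (assumes one is running on the default port)
zkCli.sh -server localhost:2181

# Inside the CLI: create a znode, read it back, list the root, then delete it
create /app-config "v1"
get /app-config
ls /
delete /app-config
```

Each znode is addressed by a slash-separated path, much like a filesystem, and can hold a small amount of data.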

Module 3 - Programming and Advanced Topics

. Manage ZooKeeper’s ACLs and authentication to control permissions to the znodes
. Handle the various failure modes of ZooKeeper
. List the various ZooKeeper bindings and APIs
. Use the Java API to create a ZooKeeper application
. Use various ZooKeeper clients to work with ZooKeeper
. Understand how ZooKeeper works with the ZooKeeper Atomic Broadcast (ZAB) protocol
. Maintain your ZooKeeper environment with ZooKeeper administration
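A minimal sketch of the Java API mentioned above, assuming a ZooKeeper server at localhost:2181; the znode path and data are illustrative only, and a real application would handle connection events and exceptions more carefully:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Sketch: connect, create a persistent znode, read its data back.
public class ZkSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 3000,
                (WatchedEvent event) -> System.out.println("Event: " + event.getState()));

        // Create a persistent znode with open ACLs (not suitable for production)
        zk.create("/demo", "hello".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        byte[] data = zk.getData("/demo", false, null);
        System.out.println(new String(data));
        zk.close();
    }
}
```

The same create/get/delete operations are available through the other language bindings the module lists.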

Course 4: Solr

Learn the basics of Solr (pronounced "solar"), an open source enterprise search platform, written in Java, from the Apache Lucene project. Solr is a standalone full-text search server that uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable from most popular programming languages.

Module 1 - Search Engines

. Understand the importance of text search engines
. Understand the Solr search procedure
. Identify Solr components

Module 2 - Configure and Add Documents to Solr

. Identify the important files in a Solr installation
. Define the schema for documents in the index
. Understand the various ways to add documents to Solr
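Document fields are declared in the index schema. A fragment of a hypothetical schema for a product index, using standard Solr field types, might look like this:

```xml
<!-- Illustrative field definitions for a product index -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="price" type="pfloat" indexed="true" stored="true"/>
```

Documents matching the schema can then be added through Solr's HTTP update endpoints in XML or JSON form.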

Module 3 - Analyzers and Queries

. Use analyzers, tokenizers, and filters
. Construct queries
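Solr queries are expressed as URL parameters over its REST-like HTTP API. As a rough illustration of how a query is assembled from the common parameters (`q`, `fq`, `fl`, `rows`), the helper below only builds the request URL rather than calling a live server, and the collection and field names are hypothetical:

```python
from urllib.parse import urlencode

def build_solr_query(base_url, collection, q, filters=None, fields=None, rows=10):
    """Assemble a Solr select URL from common query parameters."""
    params = [("q", q), ("rows", rows), ("wt", "json")]
    for f in (filters or []):      # fq: filter queries, may repeat
        params.append(("fq", f))
    if fields:                     # fl: comma-separated field list
        params.append(("fl", ",".join(fields)))
    return f"{base_url}/{collection}/select?{urlencode(params)}"

url = build_solr_query(
    "http://localhost:8983/solr", "products",
    q="title:laptop", filters=["price:[100 TO 500]"], fields=["id", "title"])
print(url)
```

The resulting URL could be fetched with any HTTP client, which is what makes Solr usable from most programming languages.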

Module 4 - SolrJ and Customization

. Create SolrJ applications
. Understand the customization options available in Solr

Course 5: Complete Hadoop Administration Learning path


Become eligible for 3 world-class certifications, adding that extra edge to your resume.
  • Alumni Status
  • Learning paths and certification from IBM
  • Course completion certificate from DataTrained Education
  • Project completion certificate from DataTrained Education

Admission Process

There are 3 simple steps in the Admission Process, which are detailed below.
Step 1: Fill in a Query Form
Fill in the Query Form and one of our counselors will call you to understand your eligibility.
Step 2: Get Shortlisted & Receive a Call
Our Admissions Committee will review your profile. Upon qualifying, an email will be sent to you confirming your admission to the Program.
Step 3: Block Your Seat & Begin the Prep Course
Block your seat with a payment of INR 10,000 to enroll in the program. Begin with your prep course and start your learning journey!

Program Fee

No-Cost EMI options are also available.*

What's Included in the Price

Features/Benefits
  • Industry recognized certificate from IBM
  • Access to 15 real life projects and a capstone project
  • IBM Watson labs and $1200 equivalent Cloud Credits

I’m interested in this program

By clicking Start Application, you agree to our terms and conditions and our privacy policy.

Career Impact

Over 500 Careers Transformed

Frequently Asked Questions

Will I get a certificate after completing the course?
Yes, you will get a course completion certificate from DataTrained as well as a project completion certificate from DataTrained.
What types of projects will I work on?
There are two types of projects:

a. Practice projects: Your mentor will first do 2-3 projects with you, and then you will do the next 3-4 projects, getting help from your mentor and through support tickets.

b. Evaluation projects: Once you’re done with the practice projects, you get access to the evaluation projects.
Do I need prior technical or programming experience?
Data Science doesn’t require any previous technical or programming experience. We will teach you math, statistics, and programming at a beginner level.
Do I need to leave my job to pursue this program?
No, the program is designed in such a way that you can continue with your job alongside this program. It will be a mix of pre-recorded videos, live classes, and printed study material. Every topic is project-based and taught as per live market scenarios. The course modules are covered under the guidance of industry experts.
What are the training modes?
There are two training modes:

a. Self-paced: You will get access to the joint DataTrained and IBM LMS, where you will be assigned courses and projects. You will need to go through these courses and complete the projects at your own pace. Mentor support will be provided.

b. Blended: You will get access to the joint DataTrained and IBM LMS, where you will be assigned courses and projects. You will need to go through these courses and complete the projects at your own pace. In addition to these courses, live online classes are conducted for you on Saturdays and Sundays. Mentor support will be provided.
What if I miss a live class?
In case you miss a class, you need not worry. All live class recordings will be available on your LMS, so you can watch and practice the concepts in your own time.
Do you provide placement assistance?
We have partnered with analyticsjobs.in to provide placement assistance to learners who successfully complete our programs. Analytics Jobs is a leading media and job portal company specifically aimed at jobs in Data Science, Analytics, Automation, RPA, Cloud, Blockchain, and computer science.