Scala Programming for Data Science in collaboration with IBM

  • 4.5 (535 ratings)
  • 1200+ Learners
  • English

  1. 19999
Features
  • 100+ hours of learning
  • Practice Test Included
  • Certificate of completion
  • Skill level

What you'll learn

You will learn to use Scala to access databases and to clean, analyze, and visualize data. Through guided lectures and access to labs, you will get hands-on experience tackling data problems. This is an action-packed learning path for data science enthusiasts who wish to work on real-life problems with Scala.

Universally Recognized Certificates

From IBM and DataTrained

Capstone and Real Life Projects

Access to 15 real life projects and a capstone project

Analytics Jobs Placement Assistance

Access to analyticsjobs.in curated jobs

Access to in-demand Tools

IBM Watson labs and $1200 equivalent Cloud Credits

Programming Languages and Tools Covered

Learning Path

Learn the foundations of the language for developers and data scientists interested in using Scala for data analysis. Tackle data analysis problems involving Big Data, Scala and Spark. Get a solid understanding of the fundamentals of the language, the tooling, and the development process. Develop a good appreciation of more advanced features.

Course Content
Module 1 - Introduction
  • Introduction to Scala
  • Creating a Scala Doc
  • Creating a Scala Project
  • The Scala REPL
  • Scala Documentation
Module 2 - Basic Object Oriented Programming
  • Classes
  • Immutable and Mutable Fields
  • Methods
  • Default and Named Arguments
  • Objects
Module 3 - Case Objects and Classes
  • Companion Objects
  • Case Classes and Case Objects
  • Apply and Unapply
  • Synthetic Methods
  • Immutability and Thread Safety
Module 4 - Collections
  • Collections overview
  • Sequences and Sets
  • Options
  • Tuples and Maps
  • Higher Order Functions
Module 5 - Idiomatic Scala
  • For expressions
  • Pattern Matching
  • Handling Options
  • Handling Failures
  • Handling Futures
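
As a rough illustration of how several of the topics listed above fit together (case classes, Options, pattern matching, failure handling, and higher-order functions), here is a minimal sketch in plain Scala. It is not taken from the course material; the `Reading` class and the `parse`, `describe`, and `mean` helpers are hypothetical names invented for this example.

```scala
import scala.util.Try

// Illustrative sketch only: Reading, parse, describe and mean are hypothetical names.
object SyllabusSketch {
  // Case class: immutable fields plus synthetic apply, unapply, equals and copy.
  final case class Reading(sensor: String, value: Option[Double])

  // Handling failures: Try captures the NumberFormatException, toOption turns it into None.
  def parse(sensor: String, raw: String): Reading =
    Reading(sensor, Try(raw.trim.toDouble).toOption)

  // Pattern matching on the case class and on the Option inside it.
  def describe(r: Reading): String = r match {
    case Reading(name, Some(v)) if v < 0 => s"$name: negative ($v)"
    case Reading(name, Some(v))          => s"$name: $v"
    case Reading(name, None)             => s"$name: missing"
  }

  // Higher-order functions: flatMap drops the missing values before averaging.
  def mean(rs: Seq[Reading]): Option[Double] = {
    val vs = rs.flatMap(_.value)
    if (vs.isEmpty) None else Some(vs.sum / vs.size)
  }

  def main(args: Array[String]): Unit = {
    val readings = Seq(parse("a", "1.5"), parse("b", "n/a"), parse("c", "2.5"))
    readings.map(describe).foreach(println)
    println(mean(readings))
  }
}
```

Returning `Option[Double]` from `mean` instead of throwing on an empty input is one common design choice in idiomatic Scala; it forces the caller to decide what a missing average means.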

Learn the history of Apache Spark™, how to build applications with Spark, how to establish an understanding of RDDs and Data Frames, and other advanced Spark topics.

  • Be prepared to leverage the core RDD and DataFrame APIs to perform analytics on datasets with Scala.
  • Get an overview of Spark and its associated ecosystem.
  • Gain enough skills to leverage the Map-Reduce framework with the Scala language.
Course Content
Module 1 - What is Spark?
Module 2 - Introduction to RDDs
Module 3 - Introduction to Data Frames
Module 4 - Advanced Spark Topics
Module 5 - Introduction to Spark MLlib
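
To give a feel for the map-reduce style of processing mentioned above, the following sketch counts words using only the Scala standard collections. It deliberately does not use the Spark API, so it runs without a cluster or any extra dependency; Spark's RDD API offers similarly named operations such as `flatMap` and `map`. The `WordCountSketch` object and its method names are assumptions made for this example.

```scala
// Illustrative sketch only: plain Scala collections, not Spark RDDs.
object WordCountSketch {
  // Map phase: split each line into lowercase words and emit (word, 1) pairs.
  def mapPhase(lines: Seq[String]): Seq[(String, Int)] =
    lines.flatMap(_.toLowerCase.split("\\W+")).filter(_.nonEmpty).map(w => (w, 1))

  // Reduce phase: group the pairs by word and sum the counts for each key.
  def reducePhase(pairs: Seq[(String, Int)]): Map[String, Int] =
    pairs.groupBy(_._1).map { case (word, ps) => word -> ps.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    val lines = Seq("Spark and Scala", "Scala and data")
    println(reducePhase(mapPhase(lines)))
  }
}
```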

In this course you will learn about basic statistics and data types, preparing data, feature engineering, fitting a model, and pipelines and grid search. Apache Spark™ is a fast and general engine for large-scale data processing, with built-in modules for streaming, machine learning and graph processing. This course shows you how to use Spark's machine learning pipelines to fit models and search for optimal hyperparameters using a Spark cluster.

Course Content
Module 1 - Basic Statistics and Data Types
  • Vectors and Labelled Points
  • Local and Distributed Matrices
  • Summary Statistics, Correlations, and Random Data
  • Sampling
  • Hypothesis Testing
Module 2 - Preparing Data
  • Statistics, Random data and Sampling on Data Frames
  • Handling Missing Data and Imputing Values
  • Transformers and Estimators
  • Data Normalization
  • Identifying Outliers
Module 3 - Feature Engineering
  • Feature Vectors
  • Categorical Features
  • Using Explode, User Defined Functions, and Pivot
  • Principal Component Analysis (PCA) in Feature Engineering
  • Formulas
Module 4 - Fitting a Model
  • Decision Trees
  • Random Forests
  • Gradient-Boosting Trees
  • Linear Methods
  • Evaluation
Module 5 - Pipeline and Grid Search
  • Predicting Grant Applications: Introduction
  • Predicting Grant Applications: Creating Features
  • Predicting Grant Applications: Building a Pipeline
  • Predicting Grant Applications: Cross Validation and Model Tuning
  • Predicting Grant Applications: Wrapping up

Comprehensive Curriculum

The curriculum has been designed by faculty from IITs, IBM and Expert Industry Professionals.

  • 100+ Hours of Content
  • 80+ Live Sessions
  • 15 Tools and Software

Become eligible for 3 world-class certifications that add an extra edge to your resume.

  • Alumni Status
  • Learning paths and certification from IBM
  • Course completion certificate from DataTrained Education
  • Project completion certificate from DataTrained Education

Instructors

Learn from India’s leading software engineering faculty and industry leaders

Saeed Aghabozorgi - Data Scientist, IBM

Saeed Aghabozorgi, PhD, is a Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase clients’ ability to turn data into actionable knowledge.

Rav Ahuja - Senior Manager, IBM

Rav Ahuja is a Senior Manager with IBM Canada Lab specializing in AI, Data Science and Big Data analytics. He is part of the Emerging Technologies team and is involved in incubating solutions for Data Scientists and Analytics Professionals.

Raul F. Chong - Senior Program Manager, IBM

Raul F. Chong is a Senior Program Manager based at the IBM Toronto Laboratory. Raul joined IBM in 1997 and has held numerous positions in the company. Raul has taught many DB2 workshops, has published numerous articles, and has contributed to the DB2 Certification exam tutorials.

Grant Hutchison - Senior Engineer, IBM

Grant Hutchison worked at IBM Canada for 18 years (1991-2009) as a Senior Engineer and Manager. He held various roles, including software development, support, quality assurance, marketing, sales, training, and product management.