Cloudera Data Scientist Certification Training Course Overview

Enroll in our 4-day Cloudera Data Scientist training from Koenig Solutions, accredited by Cloudera. In this course you will learn enterprise data science and machine learning using Apache Spark in Cloudera Data Science Workbench (CDSW).

Through a blend of hands-on labs and interactive lectures, you will learn to use Spark SQL to load, explore, cleanse, join, and analyze data, and Spark MLlib to specify, train, evaluate, tune, and deploy machine learning pipelines. You will dive into the foundations of the Spark architecture and execution model needed to configure, monitor, and tune your Spark applications effectively. You will also learn how Spark integrates with key components of the Cloudera platform, such as HDFS, YARN, Hive, Impala, and Hue, as well as with your favorite Python or R packages.

Target Audience:

This course is intended for data scientists, data engineers, data analysts, developers, and solution architects.

Learning Objectives:

After completing this course, you will be able to:

  • Use Apache Spark to run data science and machine learning workflows at scale
  • Use Spark SQL and DataFrames to work with structured data
  • Use MLlib, Spark’s machine learning library
  • Use PySpark, Spark’s Python API
  • Use sparklyr, a dplyr-compatible R interface to Spark
  • Use Cloudera Data Science Workbench (CDSW)
  • Use other Cloudera platform components, including HDFS, Hive, Impala, and Hue

 

Cloudera Data Scientist (32 Hours)

Live Online Training

Group Training: 1950

  • 20 - 23 Dec (GTR), 09:00 AM - 05:00 PM CST (8 Hours/Day)
  • 17 - 20 Jan, 09:00 AM - 05:00 PM CST (8 Hours/Day)

1-on-1 Training (GTR): 2250

  • Session length: 4 Hours or 8 Hours
  • Schedule: Week Days or Weekend
  • Start Time: At any time

GTR = Guaranteed to Run

Classroom Training (Available: London, Dubai, India, Sydney, Vancouver)

  • Duration: On Request
  • Fee: On Request

Course Prerequisites

Participants should have a basic understanding of Python or R and some experience exploring and analyzing data and developing statistical or machine learning models. Knowledge of Hadoop or Spark is not required.