
Optimizing Apache Spark™ on Databricks 

  • Offered by Databricks

Overview

This course equips students with the skills to manage and allocate resources effectively within the Databricks environment, so that Spark jobs run efficiently without wasting computational power.

  • Duration: 12 hours
  • Total fee: 1.26 Lakh
  • Mode of learning: Online
  • Credential: Certificate


Highlights

  • Earn a certificate from Databricks
  • Learn from industry experts

Course details

More about this course

In this course, you will explore the five key problems that account for the vast majority of performance issues in an Apache Spark application: skew, spill, shuffle, storage, and serialization.
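Of the five problems, skew is the easiest to picture without a cluster. The sketch below is plain Python, not Spark code: it hashes keys into a fixed number of partitions and shows how one over-represented key overloads a single partition, and how appending a rotating "salt" suffix spreads that key out. The key names, row counts, partition count, and salt count are all made-up values for illustration, and salting is shown only as one commonly used mitigation, not necessarily the one the course teaches.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count
NUM_SALTS = 8       # hypothetical number of salt values

def partition_of(key: str) -> int:
    """Deterministic stand-in for a hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS

def rows_per_partition(keys):
    """Count how many rows land in each partition."""
    counts = Counter(partition_of(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]

# Made-up data: one hot key with 9,000 rows plus 1,000 keys with one row each.
keys = ["hot"] * 9000 + [f"user_{i}" for i in range(1000)]

# Salting: a rotating suffix turns each key into up to NUM_SALTS distinct keys,
# so the rows of the hot key no longer hash to a single partition.
salted = [f"{k}#{i % NUM_SALTS}" for i, k in enumerate(keys)]

before = rows_per_partition(keys)
after = rows_per_partition(salted)
print("largest partition before salting:", max(before))
print("largest partition after salting: ", max(after))
```

In a real join or aggregation the salt also has to be accounted for on the other side of the operation (for example, by aggregating in two stages), which this sketch leaves out.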

With examples based on 100 GB to 1+ TB datasets, you will investigate and diagnose sources of bottlenecks with the Spark UI and learn effective mitigation strategies.

You will also discover new features introduced in Spark 3 that can automatically address common performance problems.
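Adaptive Query Execution (AQE), which appears in the Day 2 curriculum, is one such feature. As a rough sketch, the settings below are standard Spark 3 configuration keys related to AQE; the helper function and the `RecordingConf` class are hypothetical names introduced here so the example runs without a Spark installation. Whether these keys need to be set explicitly depends on the Spark or Databricks runtime version, since newer versions enable AQE by default.

```python
# AQE-related Spark SQL settings (standard Spark 3 configuration keys).
AQE_SETTINGS = {
    "spark.sql.adaptive.enabled": "true",                     # turn AQE on
    "spark.sql.adaptive.coalescePartitions.enabled": "true",  # merge small shuffle partitions
    "spark.sql.adaptive.skewJoin.enabled": "true",            # split skewed join partitions
}

def apply_settings(conf, settings):
    """Apply each setting through conf.set(key, value)."""
    for key, value in settings.items():
        conf.set(key, value)

class RecordingConf:
    """Hypothetical stand-in for spark.conf so the sketch runs without Spark."""

    def __init__(self):
        self.values = {}

    def set(self, key, value):
        self.values[key] = value

conf = RecordingConf()
apply_settings(conf, AQE_SETTINGS)
for key, value in sorted(conf.values.items()):
    print(f"{key} = {value}")
```

With a live session, the same helper could be called as `apply_settings(spark.conf, AQE_SETTINGS)`, assuming `spark` is an existing `SparkSession`.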

Lastly, you will learn how to design and configure clusters for optimal performance based on specific team needs and concerns.

Curriculum

Day 1

  • Review of Spark architecture and Spark UI
  • Skew
  • Spill
  • Shuffle
  • Storage
  • Serialization

Day 2

  • Ingestion basics
  • Predicate pushdowns
  • Disk partitioning
  • Z-ordering
  • Bucketing
  • Optimization with Adaptive Query Execution (AQE)
  • Designing and configuring clusters for high performance
