Apache Spark Essential Training: Big Data Engineering (2021)

Kumaran Ponnambalam

1:04:33

01 - Introduction

01 - Driving big data engineering with Apache Spark.mp4

00:48

02 - Course prerequisites.mp4

01:18

03 - Setting up the exercise files.mp4

06:02

02 - 1. Data Engineering Concepts

01 - What is data engineering.mp4

01:34

02 - Data engineering vs. data analytics vs. data science.mp4

01:19

03 - Data engineering functions.mp4

03:09

04 - Batch vs. real-time processing.mp4

02:15

05 - Data engineering with Spark.mp4

01:07

03 - 2. Spark Capabilities for ETL

01 - Spark architecture review.mp4

02:10

02 - Parallel processing with Spark.mp4

03:09

03 - Spark execution plan.mp4

01:11

04 - Stateful stream processing.mp4

02:18

05 - Spark analytics and ML.mp4

01:58

04 - 3. Batch Processing Pipelines

01 - Batch processing use case Problem statement.mp4

01:43

02 - Batch processing use case Design.mp4

01:44

03 - Setting up the local DB.mp4

01:53

04 - Uploading stock to a central store.mp4

03:51

05 - Aggregating stock across warehouses.mp4

02:46

05 - 4. Real-Time Processing Pipelines

01 - Real-time use case Problem.mp4

01:50

02 - Real-time use case Design.mp4

01:41

03 - Generating a visits data stream.mp4

01:48

04 - Building a website analytics job.mp4

02:50

05 - Executing the real-time pipeline.mp4

02:18

06 - 5. Data Engineering with Spark Best Practices

01 - Batch vs. real-time options.mp4

02:18

02 - Scaling extraction and loading operations.mp4

02:18

03 - Scaling processing operations.mp4

01:01

04 - Building resiliency.mp4

01:19

07 - 6. End-to-End Exercise Project

01 - Project exercise requirements.mp4

01:56

02 - Solution design.mp4

00:56

03 - Extracting long last actions.mp4

01:40

04 - Building a scorecard.mp4

01:40

08 - Conclusion

01 - More about Apache Spark.mp4

00:43

Description

Data engineering is the foundation for building analytics and data science applications in the new Big Data world. Data engineering requires combining multiple big data technologies to construct data pipelines and networks to stream, process, and store data. This course focuses on building full-fledged solutions that combine Apache Spark with other Big Data tools to create end-to-end data pipelines. Instructor Kumaran Ponnambalam begins by defining data engineering, its functions, and its concepts. Next, Kumaran goes over how Spark capabilities such as parallel processing, execution plans, state management options, and machine learning work with extract, transform, load (ETL). He introduces you to batch processing use cases and processes, as well as real-time processing pipelines. After walking you through several useful best practices, Kumaran concludes with an end-to-end exercise project.

Apache Spark

Kumaran Ponnambalam

A seasoned veteran in everything data, with a reputation for delivering high-performance database and SaaS applications. He currently specializes in leading Big Data Science and Engineering efforts.

LinkedIn Learning

LinkedIn Learning is an American online learning provider and a subsidiary of LinkedIn. It offers video courses taught by industry experts in software, creative, and business skills. Its courses fall into four categories: Business, Creative, Technology, and Certifications. The company was founded in 1995 by Lynda Weinman as Lynda.com and was acquired by LinkedIn in 2015. Microsoft acquired LinkedIn in December 2016.