
Prophecy Data Transformation Copilot for Data Engineering

  • 001 Welcome to Prophecy for Data Engineering on Databricks and Spark.mp4
    04:59
  • 001 Whats the future of data transformation.mp4
    00:56
  • 002 The evolution of data transformation.mp4
    02:05
  • 003 Ideal data transformation solution for the cloud.mp4
    05:50
  • 004 Prophecy and the future of data transformation.mp4
    02:50
  • 005 How to build the ideal data transformation in the cloud.mp4
    01:07
  • 001 What is a data lake and the difference between a data lake and data warehouses.mp4
    04:00
  • 002 Introducing data lakehouse and why its the perfect solution.mp4
    02:01
  • 001 Meet your instructor and module overview.mp4
    03:54
  • 002 Apache Spark architecture and concepts.mp4
    09:22
  • 003 Spark language and tooling.mp4
    07:44
  • 004 From Apache Spark to Databricks - why are they different.mp4
    13:06
  • 005 Data lakehouse, unity catalog, optimization and security.mp4
    08:44
  • 006 Working with Spark best practices.mp4
    05:49
  • 007 Spark and Databricks tips and tricks.mp4
    05:18
  • 001 Prophecy Overview - lets learn together!.mp4
    01:44
  • 002 Setting up a Databricks Fabric to execute our Pipelines.mp4
    03:53
  • 003 Create a Prophecy Project to manage our Spark code.mp4
    03:17
  • 004 Getting started with the Pipeline canvas.mp4
    05:40
  • 005 Explore code view and perform simple aggregations.mp4
    01:58
  • 006 Join accounts and opportunities data and write results to a delta table.mp4
    02:33
  • 007 Create a Pipeline and read from Data Sources to start building our Pipeline.mp4
    05:40
  • 008 Deploying Pipelines to production to run our scheduled Pipelines.mp4
    03:46
  • 009 Introduction to Prophecy Users and Teams.mp4
    02:05
  • 001 Data Sources and Targets overview.mp4
    00:38
  • 002 Parse and read raw data from object store with best practices.mp4
    02:32
  • 003 Prophecy built-in Data Sources and Data Sets.mp4
    02:11
  • 004 Explore Data Source default options.mp4
    02:13
  • 005 Read and parse source parquet data and merge schema.mp4
    04:20
  • 006 Handle corrupt and malformed records when reading from object stores.mp4
    02:51
  • 007 Additional options to handle corrupt and malformed records.mp4
    02:41
  • 008 Work with source data schema and delimiters.mp4
    02:38
  • 009 Read from delta tables as sources.mp4
    01:05
  • 010 Write data to a delta table using a target Gem.mp4
    01:55
  • 011 Partition data when writing to a delta table for optimal performance.mp4
    02:09
  • 012 What weve learned in this module.mp4
    01:51
  • 001 Data lakehouse and the medallion architecture module overview.mp4
    02:09
  • 002 Medallion architecture - bronze, silver, and gold layer characteristics.mp4
    03:05
  • 003 Read and write data by partition - daily load from object storage.mp4
    02:34
  • 004 Additional data load by partition - daily load from object storage.mp4
    01:17
  • 005 Introduction to data models in a data lakehouse.mp4
    04:08
  • 006 Write the bronze layer data to delta tables.mp4
    02:00
  • 007 Introduction to Slowly Changing Dimensions (SCD).mp4
    01:43
  • 008 Implement simple SCD2 for bronze layer table.mp4
    06:50
  • 009 Bulk load read and write options.mp4
    00:57
  • 010 Bulk load historical data with SCD2.mp4
    05:41
  • 011 Delta table data versioning.mp4
    05:28
  • 012 Work with incompatible schemas.mp4
    04:23
  • 013 Recover data from a previous version.mp4
    02:07
  • 014 A summary of what weve learned in this module.mp4
    00:33
  • 001 Building the Silver and Gold layers - Overview.mp4
    03:25
  • 002 Data integration and cleaning in the Silver layer.mp4
    02:05
  • 003 Build a data model and integrate data in the Silver layer.mp4
    03:17
  • 004 Implement SCD2 in the silver layer.mp4
    04:16
  • 005 Generating unique IDs and write data to delta tables.mp4
    02:52
  • 006 Business requirements for the Gold layer.mp4
    01:27
  • 007 Perform analytics in the Gold layer to build business reports.mp4
    03:09
  • 008 Using subgraphs for reusability to simplify Pipelines.mp4
    01:54
  • 009 A summary of what weve learned in this module.mp4
    00:48
  • 001 Pipeline deployment overview.mp4
    00:49
  • 002 Ways to orchestrate workflows to automate jobs.mp4
    01:50
  • 003 Configure incremental Pipeline to prepare for scheduled runs.mp4
    02:03
  • 004 Create a Prophecy Job to schedule the Pipelines to run daily.mp4
    04:04
  • 005 What is CICD and how to deploy Pipelines to production.mp4
    02:42
  • 006 Advanced use cases integrate with external CICD process using PBT.mp4
    04:01
  • 007 A summary of what weve learned in this module.mp4
    00:26
  • external-links.txt
  • 001 Version management and change control overview.mp4
    00:40
  • 002 Prophecy Projects and the git process.mp4
    02:20
  • 003 Collaborating on a Pipeline - catching the dev branch up to the main branch.mp4
    01:34
  • 004 Reverting changes when developing a Pipeline before committing.mp4
    01:11
  • 005 Reverting back to a prior commit after committing by using rollback.mp4
    00:50
  • 006 Merging changes and switching between branches.mp4
    01:48
  • 007 Resolving code conflicts when multiple team members are making commits.mp4
    02:18
  • 008 Cloning an existing Prophecy Project to a new repository.mp4
    02:11
  • 009 Reusing an existing Prophecy Project by importing the Project.mp4
    01:19
  • 010 Creating pull requests and handling commit conflicts.mp4
    03:49
  • 011 A summary of what weve learned in this module.mp4
    00:32
  • 001 Reusability and extensibility overview.mp4
    01:34
  • 002 The importance of setting data engineering standards - reuse and extend.mp4
    03:16
  • 003 Convert a script to a customized Gem to share and reuse.mp4
    02:03
  • 004 Create a new Gem for multi-dimensional cube using the specified express.mp4
    02:57
  • 005 Create a UI for the cube Gem for users to define the cube.mp4
    02:01
  • 006 Adding additional features to make the customized Gem UI intuitive.mp4
    01:27
  • 007 Error handling with adding validations and customized error messages.mp4
    01:59
  • 008 Testing customized cube Gem and publishing the Gem to share with others.mp4
    01:57
  • 009 Assigning proper access to share the newly built cube Gem.mp4
    01:41
  • 010 Use the newly created cube Gem by adding it as a dependency.mp4
    05:50
  • 011 A summary of what weve learned in this module.mp4
    00:30
  • external-links.txt
  • 001 Data quality and unit testing overview.mp4
    01:25
  • 002 Medallion architecture and data quality.mp4
    02:32
  • 003 Data quality Pipeline walkthrough - how to populate data quality log.mp4
    03:40
  • 004 Silver layer data quality checks, define errors, and write to delta table.mp4
    03:52
  • 005 Data integration quality checks with joins - check if customer IDs are missing.mp4
    01:18
  • 006 Performing data reconciliation checks - identify mismatching column values.mp4
    04:20
  • 007 Identifying and tracking data quality issues by drilling down to a specific ID.mp4
    01:07
  • 008 Executing data quality checks in phases - stop the pipeline if error exists.mp4
    02:35
  • 009 Unit testing options - testing expressions using output equality.mp4
    03:09
  • 010 Explore code view of the unit test.mp4
    01:12
  • 011 Running the unit tests.mp4
    01:32
  • 012 Unit testing expressions using output predicates.mp4
    02:51
  • 013 A summary of what weve learned in this module.mp4
    00:38


    Learn Databricks and Spark data engineering to deliver self-service data transformation and speed up pipeline development

    What You'll Learn?


    • Learn and design the data lakehouse paradigm for an e-commerce company
    • A hands-on lab environment is provided with this course
    • Implement and deploy a medallion architecture using Prophecy running on Databricks
    • Understand Apache Spark and its best practices with real-life use cases
    • Share and extend Pipeline components with data practitioners and analysts
    • Deploy Pipelines to production with CI/CD best practices
    • Utilize version control and change management in data engineering
    • Deploy data quality checks and unit tests

    Who is this for?


  • data engineers, data scientists, data analysts, data architects, data leads, data engineering leads

    What You Need to Know?


  • No programming experience needed. You will use a low-code UI to build a real-life data implementation


    Description

    This course is designed to help data engineers and analysts build and deploy a cloud data lakehouse architecture using Prophecy's Data Transformation Copilot. It is created with the intention of helping you embark on your data engineering journey with Spark and Prophecy.

    We will start by staging the ingested data from application platforms like Salesforce, operational databases with CDC transactional data, and machine-generated data like logs and metrics. We're going to clean and normalize the ingested tables to prepare a complete, clean, and efficient data model. From that data model, we're going to build four projects creating consumption applications for different real-world use cases. With each of the projects, you're going to learn something new:
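
    For a rough sense of what such a staging step looks like in Spark, here is a minimal, hand-written PySpark sketch. It is not code from the course or generated by Prophecy; the paths, column names, and table name are hypothetical, and it assumes a Databricks-style environment with Delta Lake available.

        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.getOrCreate()

        # Read raw CSV exports from object storage (hypothetical path and layout).
        raw_accounts = (
            spark.read
            .option("header", "true")
            .csv("dbfs:/raw/salesforce/accounts/")
        )

        # Light cleaning and normalization: trim strings, standardize dates, drop exact duplicates.
        bronze_accounts = (
            raw_accounts
            .withColumn("account_name", F.trim(F.col("account_name")))
            .withColumn("created_date", F.to_date(F.col("created_date"), "yyyy-MM-dd"))
            .dropDuplicates()
        )

        # Land the staged data as a Delta table for the downstream (silver/gold) layers.
        # Assumes a database named "bronze" already exists.
        bronze_accounts.write.format("delta").mode("overwrite").saveAsTable("bronze.accounts")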


    1. We will build a spreadsheet export for your finance department, where we will explore data modeling and transformation concepts. Since the finance department really cares about data quality, we're also going to learn how to set up unit and integration tests to maintain high quality (a minimal test sketch follows this list).

    2. We will create an alerting system for your operational support team to ensure customer success, where we’re going to learn about orchestration best practices.

    3. A sales data upload that can be ingested back into Salesforce, where we will explore advanced extensibility concepts that allow us to create and follow standardized practices.

    4. A dashboard directly on Databricks for your product team to monitor live usage. Here we learn a lot about observability and data quality.
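
    To make the testing idea in project 1 concrete, here is a minimal, hypothetical pytest-style unit test for a Spark transformation. It is not the course's or Prophecy's test framework; the function, columns, and values are made up purely for illustration.

        from pyspark.sql import SparkSession, functions as F

        def add_total_amount(df):
            # Hypothetical transformation under test: total = quantity * unit_price.
            return df.withColumn("total_amount", F.col("quantity") * F.col("unit_price"))

        def test_add_total_amount():
            spark = SparkSession.builder.master("local[1]").getOrCreate()
            input_df = spark.createDataFrame(
                [(2, 10.0), (3, 5.0)], ["quantity", "unit_price"]
            )
            result = add_total_amount(input_df).select("total_amount").collect()
            assert sorted(row.total_amount for row in result) == [15.0, 20.0]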

    The best part? All of the code that we will be building is completely open source and accessible. You will be able to apply everything you learn here in your real projects.

    Our team of best-in-class data engineers and architects, with experience at companies like Salesforce, Databricks, and Instagram, will walk you through building out these use cases step by step.


    • Language: English
    • Training sessions: 101
    • Duration: 4:51:26
    • English subtitles: yes
    • Release date: 2025/01/13