
Prophecy Data Transformation Copilot for Data Engineering

  • 001 Welcome to Prophecy for Data Engineering on Databricks and Spark.mp4
    04:59
  • 001 Whats the future of data transformation.mp4
    00:56
  • 002 The evolution of data transformation.mp4
    02:05
  • 003 Ideal data transformation solution for the cloud.mp4
    05:50
  • 004 Prophecy and the future of data transformation.mp4
    02:50
  • 005 How to build the ideal data transformation in the cloud.mp4
    01:07
  • 001 What is a data lake and the difference between a data lake and data warehouses.mp4
    04:00
  • 002 Introducing data lakehouse and why its the perfect solution.mp4
    02:01
  • 001 Meet your instructor and module overview.mp4
    03:54
  • 002 Apache Spark architecture and concepts.mp4
    09:22
  • 003 Spark language and tooling.mp4
    07:44
  • 004 From Apache Spark to Databricks - why are they different.mp4
    13:06
  • 005 Data lakehouse, unity catalog, optimization and security.mp4
    08:44
  • 006 Working with Spark best practices.mp4
    05:49
  • 007 Spark and Databricks tips and tricks.mp4
    05:18
  • 001 Prophecy Overview - lets learn together!.mp4
    01:44
  • 002 Setting up a Databricks Fabric to execute our Pipelines.mp4
    03:53
  • 003 Create a Prophecy Project to manage our Spark code.mp4
    03:17
  • 004 Getting started with the Pipeline canvas.mp4
    05:40
  • 005 Explore code view and perform simple aggregations.mp4
    01:58
  • 006 Join accounts and opportunities data and write results to a delta table.mp4
    02:33
  • 007 Create a Pipeline and read from Data Sources to start building our Pipeline.mp4
    05:40
  • 008 Deploying Pipelines to production to run our scheduled Pipelines.mp4
    03:46
  • 009 Introduction to Prophecy Users and Teams.mp4
    02:05
  • 001 Data Sources and Targets overview.mp4
    00:38
  • 002 Parse and read raw data from object store with best practices.mp4
    02:32
  • 003 Prophecy built-in Data Sources and Data Sets.mp4
    02:11
  • 004 Explore Data Source default options.mp4
    02:13
  • 005 Read and parse source parquet data and merge schema.mp4
    04:20
  • 006 Handle corrupt and malformed records when reading from object stores.mp4
    02:51
  • 007 Additional options to handle corrupt and malformed records.mp4
    02:41
  • 008 Work with source data schema and delimiters.mp4
    02:38
  • 009 Read from delta tables as sources.mp4
    01:05
  • 010 Write data to a delta table using a target Gem.mp4
    01:55
  • 011 Partition data when writing to a delta table for optimal performance.mp4
    02:09
  • 012 What weve learned in this module.mp4
    01:51
  • 001 Data lakehouse and the medallion architecture module overview.mp4
    02:09
  • 002 Medallion architecture - bronze, silver, and gold layer characteristics.mp4
    03:05
  • 003 Read and write data by partition - daily load from object storage.mp4
    02:34
  • 004 Additional data load by partition - daily load from object storage.mp4
    01:17
  • 005 Introduction to data models in a data lakehouse.mp4
    04:08
  • 006 Write the bronze layer data to delta tables.mp4
    02:00
  • 007 Introduction to Slowly Changing Dimensions (SCD).mp4
    01:43
  • 008 Implement simple SCD2 for bronze layer table.mp4
    06:50
  • 009 Bulk load read and write options.mp4
    00:57
  • 010 Bulk load historical data with SCD2.mp4
    05:41
  • 011 Delta table data versioning.mp4
    05:28
  • 012 Work with incompatible schemas.mp4
    04:23
  • 013 Recover data from a previous version.mp4
    02:07
  • 014 A summary of what weve learned in this module.mp4
    00:33
  • 001 Building the Silver and Gold layers - Overview.mp4
    03:25
  • 002 Data integration and cleaning in the Silver layer.mp4
    02:05
  • 003 Build a data model and integrate data in the Silver layer.mp4
    03:17
  • 004 Implement SCD2 in the silver layer.mp4
    04:16
  • 005 Generating unique IDs and write data to delta tables.mp4
    02:52
  • 006 Business requirements for the Gold layer.mp4
    01:27
  • 007 Perform analytics in the Gold layer to build business reports.mp4
    03:09
  • 008 Using subgraphs for reusability to simplify Pipelines.mp4
    01:54
  • 009 A summary of what weve learned in this module.mp4
    00:48
  • 001 Pipeline deployment overview.mp4
    00:49
  • 002 Ways to orchestrate workflows to automate jobs.mp4
    01:50
  • 003 Configure incremental Pipeline to prepare for scheduled runs.mp4
    02:03
  • 004 Create a Prophecy Job to schedule the Pipelines to run daily.mp4
    04:04
  • 005 What is CICD and how to deploy Pipelines to production.mp4
    02:42
  • 006 Advanced use cases integrate with external CICD process using PBT.mp4
    04:01
  • 007 A summary of what weve learned in this module.mp4
    00:26
  • external-links.txt
  • 001 Version management and change control overview.mp4
    00:40
  • 002 Prophecy Projects and the git process.mp4
    02:20
  • 003 Collaborating on a Pipeline - catching the dev branch up to the main branch.mp4
    01:34
  • 004 Reverting changes when developing a Pipeline before committing.mp4
    01:11
  • 005 Reverting back to a prior commit after committing by using rollback.mp4
    00:50
  • 006 Merging changes and switching between branches.mp4
    01:48
  • 007 Resolving code conflicts when multiple team members are making commits.mp4
    02:18
  • 008 Cloning an existing Prophecy Project to a new repository.mp4
    02:11
  • 009 Reusing an existing Prophecy Project by importing the Project.mp4
    01:19
  • 010 Creating pull requests and handling commit conflicts.mp4
    03:49
  • 011 A summary of what weve learned in this module.mp4
    00:32
  • 001 Reusability and extensibility overview.mp4
    01:34
  • 002 The importance of setting data engineering standards - reuse and extend.mp4
    03:16
  • 003 Convert a script to a customized Gem to share and reuse.mp4
    02:03
  • 004 Create a new Gem for multi-dimensional cube using the specified express.mp4
    02:57
  • 005 Create a UI for the cube Gem for users to define the cube.mp4
    02:01
  • 006 Adding additional features to make the customized Gem UI intuitive.mp4
    01:27
  • 007 Error handling with adding validations and customized error messages.mp4
    01:59
  • 008 Testing customized cube Gem and publishing the Gem to share with others.mp4
    01:57
  • 009 Assigning proper access to share the newly built cube Gem.mp4
    01:41
  • 010 Use the newly created cube Gem by adding it as a dependency.mp4
    05:50
  • 011 A summary of what weve learned in this module.mp4
    00:30
  • external-links.txt
  • 001 Data quality and unit testing overview.mp4
    01:25
  • 002 Medallion architecture and data quality.mp4
    02:32
  • 003 Data quality Pipeline walkthrough - how to populate data quality log.mp4
    03:40
  • 004 Silver layer data quality checks, define errors, and write to delta table.mp4
    03:52
  • 005 Data integration quality checks with joins - check if customer IDs are missing.mp4
    01:18
  • 006 Performing data reconciliation checks - identify mismatching column values.mp4
    04:20
  • 007 Identifying and tracking data quality issues by drilling down to a specific ID.mp4
    01:07
  • 008 Executing data quality checks in phases - stop the pipeline if error exists.mp4
    02:35
  • 009 Unit testing options - testing expressions using output equality.mp4
    03:09
  • 010 Explore code view of the unit test.mp4
    01:12
  • 011 Running the unit tests.mp4
    01:32
  • 012 Unit testing expressions using output predicates.mp4
    02:51
  • 013 A summary of what weve learned in this module.mp4
    00:38


    Learn Databricks and Spark data engineering to deliver self-service data transformation and speed up pipeline development

    What You'll Learn?


    • Learn and design the data lakehouse paradigm for an e-commerce company
    • A hands-on lab environment is provided with this course
    • Implement and deploy a medallion architecture using Prophecy running on Databricks
    • Understand Apache Spark and its best practices with real-life use cases
    • Share and extend Pipeline components with data practitioners and analysts
    • Deploy Pipelines to production with CI/CD best practices
    • Utilize version control and change management in data engineering
    • Deploy data quality checks and unit tests

    Who is this for?


  • data engineers, data scientists, data analysts, data architects, data leads, data engineering leads

    What You Need to Know?


  • No programming experience needed. You will use a low-code UI to build a real-life data implementation


    Description

    This course is designed to help data engineers and analysts build and deploy a cloud data lakehouse architecture using Prophecy's Data Transformation Copilot. It is created with the intention of helping you embark on your data engineering journey with Spark and Prophecy.

    We will start by staging the ingested data from application platforms like Salesforce, operational databases with CDC transactional data, and machine-generated data like logs and metrics. We're going to clean and normalize the ingested tables to prepare a complete, clean, and efficient data model. From that data model, we're going to build four projects creating consumption applications for different real-world use cases. With each of the projects, you're going to learn something new:
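
    For a rough sense of what such a staging step looks like in Spark, here is a minimal, hand-written PySpark sketch. It is not code from the course or generated by Prophecy; the paths, column names, and table name are hypothetical, and it assumes a Databricks-style environment with Delta Lake available.

        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.getOrCreate()

        # Read raw CSV exports from object storage (hypothetical path and layout).
        raw_accounts = (
            spark.read
            .option("header", "true")
            .csv("dbfs:/raw/salesforce/accounts/")
        )

        # Light cleaning and normalization: trim strings, standardize dates, drop exact duplicates.
        bronze_accounts = (
            raw_accounts
            .withColumn("account_name", F.trim(F.col("account_name")))
            .withColumn("created_date", F.to_date(F.col("created_date"), "yyyy-MM-dd"))
            .dropDuplicates()
        )

        # Land the staged data as a Delta table for the downstream (silver/gold) layers.
        # Assumes a database named "bronze" already exists.
        bronze_accounts.write.format("delta").mode("overwrite").saveAsTable("bronze.accounts")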


    1. We will build a spreadsheet export for your finance department, where we will explore data modeling and transformation concepts. Since the finance department really cares about data quality, we're also going to learn how to set up unit and integration tests to maintain high quality (a minimal test sketch follows this list).

    2. We will create an alerting system for your operational support team to ensure customer success, where we’re going to learn about orchestration best practices.

    3. A sales data upload that can be ingested back into Salesforce, where we will explore advanced extensibility concepts that allow us to create and follow standardized practices.

    4. A dashboard directly on Databricks for your product team to monitor live usage. Here we learn a lot about observability and data quality.
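
    To make the testing idea in project 1 concrete, here is a minimal, hypothetical pytest-style unit test for a Spark transformation. It is not the course's or Prophecy's test framework; the function, columns, and values are made up purely for illustration.

        from pyspark.sql import SparkSession, functions as F

        def add_total_amount(df):
            # Hypothetical transformation under test: total = quantity * unit_price.
            return df.withColumn("total_amount", F.col("quantity") * F.col("unit_price"))

        def test_add_total_amount():
            spark = SparkSession.builder.master("local[1]").getOrCreate()
            input_df = spark.createDataFrame(
                [(2, 10.0), (3, 5.0)], ["quantity", "unit_price"]
            )
            result = add_total_amount(input_df).select("total_amount").collect()
            assert sorted(row.total_amount for row in result) == [15.0, 20.0]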

    The best part? All of the code that we will be building is completely open source and accessible. You will be able to apply everything you learn here in your real projects.

    Our team of best-in-class data engineers and architects, with experience at companies like Salesforce, Databricks, and Instagram, will walk you through building out these use cases step by step.


    • Language: English
    • Training sessions: 101
    • Duration: 4:51:26
    • English subtitles: yes
    • Release date: 2025/01/13