
Data Engineering with Spark Databricks Delta Lake Lakehouse


FutureX Skills

2:07:54

  • 1. Introduction.mp4
    02:25
  • 2. Data Engineering with Spark.mp4
    04:46
  • 3. What is Databricks.mp4
    02:19
  • 4. Creating a Databricks Community Edition account.mp4
    04:22
  • 5. Building a basic data pipeline.mp4
    00:32
  • 6.1 fx-learn-spark.zip
  • 6. Reading data from DBFS and Delta Tables.mp4
    10:59
  • 7.1 spark write save.zip
  • 7. Writing data to DBFS and Delta tables.mp4
    05:21
  • 8. Exporting and importing Notebooks.mp4
    01:25
  • 9. Revisiting the basic data pipeline.mp4
    00:51
  • 1.1 spark transformations python.zip
  • 1. More Transformations and Actions using PySpark.mp4
    07:57
  • 2.1 scala transformations.zip
  • 2. Doing the Transformations in Scala.mp4
    05:04
  • 3.1 python scala crash course.zip
  • 3. Python Scala crash course.mp4
    07:52
  • 4.1 fx udf.zip
  • 4. Spark User Defined Functions (UDF).mp4
    11:01
  • 5.1 spark joins 1.zip
  • 5.2 store customers transactions 1.zip
  • 5. Joining Datasets using DataFrame APIs and Spark SQL.mp4
    12:29
  • 6.1 spark joins.zip
  • 6.2 store customers transactions.zip
  • 6. More join operations using Spark.mp4
    04:01
  • 7. Section summary.mp4
    01:41
  • 1. Understanding Data Warehouse, Data Lake and Data Lakehouse.mp4
    07:31
  • 2. Databricks Lakehouse Architecture and Delta Lake.mp4
    04:22
  • 3. Delta Tables.mp4
    02:11
  • 4.1 spark transformations python sql.zip
  • 4. Storing data in a Delta table, Databricks SQL and time travel.mp4
    12:35
  • 5.1 databricks sql.zip
  • 5. Databricks SQL vs Spark SQL.mp4
    06:16
  • 6.1 delta table caching.zip
  • 6. Delta Table caching.mp4
    11:24
  • 7. Where to go from here.mp4
    00:30
  • Description


    Apache Spark, Databricks Lakehouse, Delta Lake, Delta Tables, Delta Caching, Scala, Python: Data Engineering for beginners

    What You'll Learn?


    • Acquiring the necessary skills to qualify for an entry-level Data Engineering position
    • Developing a practical comprehension of Data Lakehouse concepts through hands-on experience
    • Learning to operate a Delta table by accessing its version history, recovering data, and utilizing time travel functionality
    • Optimizing a Delta table with techniques such as caching, partitioning, and Z-ordering for faster analytics
    • Obtaining practical knowledge in constructing a data pipeline through the usage of Apache Spark on the Databricks platform

    Who is this for?


  • Data Engineering beginners


    Description

    Data Engineering is a vital component of modern data-driven businesses. The ability to process, manage, and analyze large-scale data sets is a core requirement for organizations that want to stay competitive. In this course, you will learn how to build a data pipeline using Apache Spark on Databricks' Lakehouse architecture. This will give you practical experience in working with Spark and Lakehouse concepts, as well as the skills needed to excel as a Data Engineer in a real-world environment.


    Throughout the course, you will learn how to conduct analytics using Python and Scala with Spark, apply Spark SQL and Databricks SQL, develop a data pipeline with Apache Spark, and quickly become proficient in Databricks' community edition. You will also manage a Delta table by accessing its version history, restoring data, and using time travel; optimize query performance using Delta Cache; and work with Delta Tables and the Databricks File System, drawing on real-world scenarios from our experienced instructor.


    At the beginning of the course, you will start by becoming familiar with Databricks' community edition and creating a basic pipeline using Spark. This will assist you in setting up your environment and getting comfortable with the platform before progressing to more complex topics.


    Once you are familiar with the basics, you will learn how to conduct analytics with Spark using Python and Scala. This will include topics such as Spark transformations, actions, joins, Spark SQL, and the DataFrame APIs.


    In the final section of the course, you will acquire the knowledge and skills to operate a Delta table. This will involve accessing its version history, restoring data, and utilizing time travel functionality using Spark and Databricks SQL. Additionally, you will learn how to use Delta Cache to optimize query performance.
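The idea behind version history, time travel, and restore can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `ToyVersionedTable` and its methods are hypothetical names invented here, not the Delta Lake or Databricks API. On Databricks, the corresponding operations are exposed through SQL such as `DESCRIBE HISTORY`, `SELECT ... VERSION AS OF`, and `RESTORE TABLE`.

```python
class ToyVersionedTable:
    """Toy model of a versioned table (hypothetical; NOT the Delta Lake API).

    Every write commits a new immutable snapshot, so older versions stay
    readable and can be restored later.
    """

    def __init__(self):
        # Version 0 is the empty table created at initialization.
        self._snapshots = [[]]

    def append(self, rows):
        """Commit a new version containing the latest rows plus `rows`."""
        self._snapshots.append(self._snapshots[-1] + list(rows))
        return len(self._snapshots) - 1

    def overwrite(self, rows):
        """Commit a new version that contains only `rows`."""
        self._snapshots.append(list(rows))
        return len(self._snapshots) - 1

    def history(self):
        """Return the list of committed version numbers."""
        return list(range(len(self._snapshots)))

    def read(self, version=None):
        """Read the latest snapshot, or an older one ("time travel")."""
        index = -1 if version is None else version
        return list(self._snapshots[index])

    def restore(self, version):
        """Restore an old version by committing it as a NEW version.

        History is kept, so the mistaken version remains inspectable.
        """
        return self.overwrite(self._snapshots[version])


table = ToyVersionedTable()
v1 = table.append([("c1", 10)])
v2 = table.append([("c2", 20)])
v3 = table.overwrite([])                 # simulate an accidental wipe
before_mistake = table.read(version=v2)  # time travel to the good version
v4 = table.restore(v2)                   # recover the data as a new version
```

In this sketch, restoring does not rewrite history: the wiped version is still listed by `history()`, and the recovered data simply becomes the newest version.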


    This course is designed for Data Engineering beginners; no prior knowledge of Python or Scala is required. However, some familiarity with databases and SQL is necessary to succeed in this course. Upon completion, you will have the skills and knowledge required to succeed in a real-world Data Engineer role.


    Throughout the course, you will work with hands-on examples and real-world scenarios to apply the concepts you learn. By the end of the course, you will have the practical experience and skills required to understand Spark and Lakehouse concepts, and to build a scalable and reliable data pipeline using Apache Spark on Databricks' Lakehouse architecture.


    FutureX Skills
    We are a group of Solution Architects and Developers with expertise in Java, Python, Scala, Big Data, Machine Learning, and Cloud. We have years of experience in building Data and Analytics solutions for global clients. Our primary goal is to simplify learning for our students. We take a very practical, use-case-based approach in all our courses.
    Students take courses primarily to improve job-related skills. Some courses generate credit toward technical certification. Udemy has made a special effort to attract corporate trainers seeking to create coursework for employees of their company.
    • Language: English
    • Training sessions: 23
    • Duration: 2:07:54
    • English subtitles: available
    • Release date: 2023/03/29
