
Getting Started with Apache Spark on Databricks


Janani Ravi

1:52:09

  • 01. Course Overview.mp4
    02:09
  • 02. Prerequisites and Course Outline.mp4
    02:26
  • 03. Introducing Apache Spark.mp4
    05:21
  • 04. Spark Architecture.mp4
    04:46
  • 05. Introducing Databricks.mp4
    02:59
  • 06. Databricks Science and Engineering Concepts.mp4
    06:35
  • 07. Azure Databricks Architectural Overview.mp4
    04:15
  • 08. Demo-Creating an Azure Databricks Workspace.mp4
    02:35
  • 09. Demo-Provisioning an All Purpose Cluster.mp4
    05:16
  • 10. RDDs and Data Frames.mp4
    06:45
  • 11. Spark APIs.mp4
    02:07
  • 12. Demo-dbutils.mp4
    03:28
  • 13. Demo-Transformations and Actions on RDDs.mp4
    04:47
  • 14. Demo-Transformations and Actions on Data Frames.mp4
    02:52
  • 15. Demo-Uploading a Dataset to DBFS Using Notebooks.mp4
    03:42
  • 16. Demo-Basic Selection and Filtering Operations.mp4
    03:51
  • 17. Demo-Writing CSV Files out to DBFS.mp4
    03:40
  • 18. Demo-Creating a Table Using the Databricks UI.mp4
    02:19
  • 19. Demo-Visualizing Data Using the Display Command.mp4
    03:10
  • 20. Demo-Exploring Databricks Visualizations.mp4
    05:01
  • 21. Demo-Reading and Parsing JSON Data.mp4
    05:38
  • 22. Demo-Accessing Nested Fields and List Elements.mp4
    05:24
  • 23. Demo-Setting up an Azure Storage Account.mp4
    02:50
  • 24. Demo-Storing Secrets in the Azure Key Vault.mp4
    01:52
  • 25. Demo-Reading from Azure Data Storage.mp4
    05:40
  • 26. Demo-Basic SQL Transformations.mp4
    05:18
  • 27. Demo-Built-in Functions.mp4
    06:02
  • 28. Summary and Next Steps.mp4
    01:21
  • Description


    This course will introduce you to analytical queries and big data processing using Apache Spark on Azure Databricks. You will learn how to work with Spark transformations, actions, visualizations, and functions using the Databricks Runtime.

    What You'll Learn?


      Azure Databricks allows you to work with big data processing and queries using the Apache Spark unified analytics engine. With Azure Databricks you can set up your Apache Spark environment in minutes, autoscale your processing, and collaborate and share projects in an interactive workspace.

      In this course, Getting Started with Apache Spark on Databricks, you will learn the components of the Apache Spark analytics engine, which lets you process both batch and streaming data through a unified API. First, you will learn how the Spark architecture is configured for big data processing. You will then learn how the Databricks Runtime on Azure simplifies working with Apache Spark on the Azure cloud platform, and you will explore the basic concepts and terminology of the technologies used in Azure Databricks.

      Next, you will learn the workings and nuances of Resilient Distributed Datasets (RDDs), the core data structure used for big data processing in Apache Spark. You will see that RDDs are the data structures on top of which Spark DataFrames are built. You will study the two types of operations that can be performed on DataFrames, namely transformations and actions, and understand the difference between them. You’ll also learn how Databricks lets you explore and visualize your data using the display() function, which leverages native Python libraries for visualizations.
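
      In Spark, transformations are lazy (they only record what should be done), while actions trigger the actual computation. The following is a minimal plain-Python sketch of that distinction, not Spark's API: the `LazyDataset` class and its methods are hypothetical stand-ins invented for illustration, and the real RDD and DataFrame implementations are distributed and far more involved.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not a Spark class).

    Transformations are recorded; nothing runs until an action is called.
    """

    def __init__(self, source: Iterable, ops: Tuple = ()):
        self._source = source
        self._ops = ops  # recorded (kind, function) pairs, a simple "lineage"

    # --- Transformations: return a new dataset, compute nothing ---
    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        return LazyDataset(self._source, self._ops + (("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        return LazyDataset(self._source, self._ops + (("filter", pred),))

    def _run(self) -> Iterator:
        # Replay every recorded operation, in order, for each source element.
        for item in self._source:
            keep = True
            for kind, fn in self._ops:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):  # a filter predicate rejected the item
                    keep = False
                    break
            if keep:
                yield item

    # --- Actions: trigger evaluation and return a concrete result ---
    def collect(self) -> List:
        return list(self._run())

    def count(self) -> int:
        return sum(1 for _ in self._run())


calls = []  # records every time square() actually executes


def square(x: int) -> int:
    calls.append(x)
    return x * x


numbers = LazyDataset(range(1, 6))
evens_squared = numbers.map(square).filter(lambda v: v % 2 == 0)

calls_before_action = len(calls)  # transformations alone ran nothing
result = evens_squared.collect()  # the action runs the recorded pipeline
calls_after_action = len(calls)   # square() ran once per source element
```

      Here `map` and `filter` only extend the recorded pipeline, and `square` is not called until `collect()` runs. Each action replays the pipeline from the source, which is loosely why caching matters when real Spark data is reused.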

      Finally, you will get hands-on experience with big data processing operations such as projection, filtering, and aggregation. Along the way, you will learn how to read data from an external source such as Azure Storage and how to use built-in functions in Apache Spark to transform your data.

      When you are finished with this course, you will have the skills to work with basic transformations, visualizations, and aggregations using Apache Spark on Azure Databricks.



    Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework. After spending years working in tech in the Bay Area, New York, and Singapore at companies such as Microsoft, Google, and Flipkart, Janani decided to combine her love for technology with her passion for teaching. She is now the co-founder of Loonycorn, a content studio focused on providing high-quality content for technical skill development. Loonycorn is working on developing an engine (patent filed) to automate animations for presentations and educational content.
    Pluralsight, LLC is an American privately held online education company that offers a variety of video training courses for software developers, IT administrators, and creative professionals through its website. Founded in 2004 by Aaron Skonnard, Keith Brown, Fritz Onion, and Bill Williams, the company has its headquarters in Farmington, Utah. As of July 2018, it uses more than 1,400 subject-matter experts as authors, and offers more than 7,000 courses in its catalog. Since first moving its courses online in 2007, the company has expanded, developing a full enterprise platform, and adding skills assessment modules.
    • language English
    • Training sessions 28
    • duration 1:52:09
    • level preliminary
    • Release Date 2023/12/15