
Introduction to Spark SQL and DataFrames

Dan Sullivan

1:53:25

  • 01 - Apache Spark SQL and data analysis.mp4
    00:58
  • 02 - What you should know.mp4
    00:28
  • 01 - Introduction to DataFrames.mp4
    02:14
  • 02 - SQL for DataFrames.mp4
    02:03
  • 01 - Install Spark.mp4
    03:33
  • 02 - Install PySpark.mp4
    00:26
  • 03 - Using Jupyter notebooks with PySpark.mp4
    03:02
  • 01 - Set up a Jupyter notebook.mp4
    02:01
  • 02 - Load data into DataFrames CSV Files.mp4
    07:26
  • 03 - Load data into DataFrames JSON Files.mp4
    03:16
  • 04 - Basic DataFrame operations.mp4
    03:26
  • 05 - Filter data with DataFrame API.mp4
    02:13
  • 06 - Aggregate data with DataFrame API.mp4
    03:47
  • 07 - Sample data from DataFrames.mp4
    05:25
  • 08 - Save data from DataFrames.mp4
    03:27
  • 01 - Querying DataFrames with SQL.mp4
    04:25
  • 02 - Filtering DataFrames with SQL.mp4
    05:55
  • 03 - Aggregating Data with SQL.mp4
    05:19
  • 04 - Joining DataFrames with SQL.mp4
    05:40
  • 05 - Eliminating duplicates in DataFrames.mp4
    05:35
  • 06 - Working with NA values in DataFrames.mp4
    05:44
  • 01 - Exploratory data analysis with DataFrames.mp4
    07:13
  • 02 - Exploratory data analysis with Spark SQL.mp4
    05:07
  • 03 - Timeseries analysis with DataFrames.mp4
    10:48
  • 04 - Basic machine learning with DataFrames, part 1.mp4
    07:23
  • 05 - Basic machine learning with DataFrames, part 2.mp4
    05:49
  • 01 - Next steps.mp4
    00:42
  • Description


    Explore DataFrames, a widely used data structure in Apache Spark. DataFrames allow Spark developers to perform common data operations, such as filtering and aggregation, as well as advanced data analysis on large collections of distributed data. With the addition of Spark SQL, developers can also query DataFrames with SQL, a query language that is even more popular and powerful than the built-in DataFrames API. In this course, instructor Dan Sullivan shows how to perform basic operations (loading, filtering, and aggregating data in DataFrames) with both the API and SQL, as well as more advanced techniques that are easily performed in SQL. Dan also explains how to join data, eliminate duplicates, and deal with null or NA values. The lessons conclude with three in-depth examples of using DataFrames for data science: exploratory data analysis, time series analysis, and machine learning.

    Dan Sullivan
    Cloud and data architect with extensive experience in data architecture, data science, machine learning, stream processing, and cloud architecture. Capable of starting with vague initiatives and formulating precise objectives, strategies, and implementation plans. Regularly works with C-level and VP executives while also mentoring and coaching software engineers. Adapts well to unforeseen challenges. He is the author of the official Google Cloud study guides for the Professional Architect, Professional Data Engineer, and Associate Cloud Engineer certifications.
    LinkedIn Learning is an American online learning provider and a subsidiary of LinkedIn. It provides video courses taught by industry experts in software, creative, and business skills. All courses on LinkedIn Learning fall into four categories: Business, Creative, Technology, and Certifications. It was founded in 1995 by Lynda Weinman as Lynda.com and was acquired by LinkedIn in 2015. Microsoft acquired LinkedIn in December 2016.
    • Language: English
    • Training sessions: 27
    • Duration: 1:53:25
    • English subtitles: yes
    • Release date: 2023/11/18