
Big Data Analytics with Hadoop and Apache Spark


Kumaran Ponnambalam

51:55

  • 01 - The combined power of Spark and Hadoop Distributed File System (HDFS).mp4
    00:42
  • 01 - Apache Hadoop overview.mp4
    01:50
  • 02 - Apache Spark overview.mp4
    00:45
  • 03 - Integrating Spark and Hadoop.mp4
    01:19
  • 04 - Using exercise files.mp4
    03:36
  • 01 - Storage formats.mp4
    02:20
  • 02 - Compression.mp4
    02:05
  • 03 - Partitioning.mp4
    02:02
  • 04 - Bucketing.mp4
    01:17
  • 05 - Best practices for data storage.mp4
    01:19
  • 01 - Reading external files into Spark.mp4
    01:46
  • 02 - Writing to HDFS.mp4
    01:26
  • 03 - Parallel writes with partitioning.mp4
    01:12
  • 04 - Parallel writes with bucketing.mp4
    01:17
  • 05 - Best practices for ingestion.mp4
    00:55
  • 01 - How Spark works.mp4
    02:59
  • 02 - Reading HDFS files with schema.mp4
    01:18
  • 03 - Reading partitioned data.mp4
    01:25
  • 04 - Reading bucketed data.mp4
    00:55
  • 05 - Best practices for data extraction.mp4
    01:08
  • 01 - Pushing down projections.mp4
    01:45
  • 02 - Pushing down filters.mp4
    01:52
  • 03 - Managing partitions.mp4
    02:32
  • 04 - Improving joins.mp4
    01:59
  • 05 - Storing intermediate results.mp4
    02:00
  • 06 - Best practices for data processing.mp4
    02:39
  • 01 - Problem definition.mp4
    01:57
  • 02 - Data loading.mp4
    01:38
  • 03 - Total score analytics.mp4
    01:02
  • 04 - Average score analytics.mp4
    00:59
  • 05 - Top student analytics.mp4
    01:12
  • 01 - Continuing on with big data analytics.mp4
    00:44
Description


    Apache Hadoop was a pioneer in the world of big data technologies, and it continues to lead in enterprise big data storage. Apache Spark is the top big data processing engine and provides an impressive array of features and capabilities. When used together, the Hadoop Distributed File System (HDFS) and Spark can provide a truly scalable setup for big data analytics. In this course, data analytics expert Kumaran Ponnambalam shows you how to leverage these two technologies to build scalable and optimized data analytics pipelines. Explore ways to optimize data modeling and storage on HDFS; discuss scalable data ingestion and extraction using Spark; and review actionable tips for optimizing data processing in Spark. Plus, complete a use case project that allows you to practice your new techniques.



    Kumaran Ponnambalam
    A seasoned veteran in everything data, with a reputation for delivering high-performance database and SaaS applications. He currently specializes in leading Big Data Science and Engineering efforts.
    LinkedIn Learning is an American online learning provider. It provides video courses taught by industry experts in software, creative, and business skills. It is a subsidiary of LinkedIn. All the courses on LinkedIn fall into four categories: Business, Creative, Technology and Certifications. It was founded in 1995 by Lynda Weinman as Lynda.com before being acquired by LinkedIn in 2015. Microsoft acquired LinkedIn in December 2016.
    • Language: English
    • Training sessions: 32
    • Duration: 51:55
    • English subtitles: available
    • Release date: 2024/12/06