
Applied ML: Intro to Analytics with Pandas and PySpark


Sanjana Sahayaraj

55:30

  • 1. Introduction to Instructor and Course.mp4
    01:27
  • 2.1 AppliedML-Development-Environment.pdf
  • 2. Scope of the Course and Development Environment.mp4
    06:41
  • 1. Pandas and Pyspark libraries.mp4
    03:54
  • 2.1 Using_pandas_and_pyspark.pdf
  • 2. Load and inspect data.mp4
    08:13
  • 1. Filter out null and duplicate values.mp4
    09:09
  • 2.1 Cleaning_Data.pdf
  • 2. Filter out malformed entries.mp4
    09:14
  • 1. Importance of human and domain knowledge.mp4
    02:22
  • 2.1 Analyzing_Data.pdf
  • 2. Analyze domain specific themes.mp4
    14:30
    Description


    Hands-on training to analyze and prepare data for Machine Learning using Pandas, PySpark and SQL

    What You'll Learn?


    • Get hands-on experience with data preprocessing
    • Understand the practical differences between tools such as Pandas and PySpark
    • Understand when to use Pandas vs PySpark
    • Understand the exploration steps required for Data Science and Machine Learning

    Who is this for?


    • Developers and Analysts curious about the various data manipulation, transformation and analytics tools available in Python
    What You Need to Know?


    • A laptop or other system on which to develop and run code
    • An introductory knowledge of programming and Python


    Description

    Exploring and preparing data is a major step in the Machine Learning and Data Science lifecycle, as I mention in my other course, "Applied ML: The Big Picture". Because it is such a crucial foundational step, it is important to learn the tools at your disposal and to get a practical understanding of when to choose which tool.


    This course teaches hands-on techniques for several stages of data processing, exploration and transformation, alongside visualization. It also exposes the learner to various scenarios, helping them differentiate and choose between the tools in real-world projects.


    Within each tool, we will cover a variety of techniques and their specific purposes in data analysis and manipulation on real datasets. Those who wish to learn by practice will need a system with a Python development environment to get hands-on training.


    For someone who has already had the practice, this course can serve as a refresher on the various tools and techniques, to make sure you are using the right combination for the problem at hand. It can likewise be used for interview preparation, to refresh your memory on best data practices for ML and Data Science in Python.



    Sanjana Sahayaraj
    Hey everyone, I started my career as an NLP Research Engineer at an MNC, later pivoted to more product-based roles as a Senior Data Scientist, and also got into the startup world, where I got to own and apply data and ML at various stages. This gave me a holistic view of how research and theory in the ML space can translate into real business results when applied the right way. This is the practical knowledge I share on Udemy. I also run a podcast named "Data Epoch" on Spotify, Apple Podcasts and YouTube for the Data and ML community to come together and learn from each other.
    Students take courses primarily to improve job-related skills. Some courses generate credit toward technical certification. Udemy has made a special effort to attract corporate trainers seeking to create coursework for employees of their company.
    • Language: English
    • Training sessions: 8
    • Duration: 55:30
    • Release date: 2024/02/09