Predictive Analytics Using Apache Spark MLlib on Databricks

Janani Ravi

1:57:08

  • 01. Course Overview.mp4
    02:02
  • 02. Prerequisites and Course Outline.mp4
    02:11
  • 03. Machine Learning on Apache Spark.mp4
    05:01
  • 04. Demo-Configuring the Workspace and Setting up a Notebook.mp4
    02:39
  • 05. Demo-Exploring the Diabetes Dataset.mp4
    04:04
  • 06. Demo-Standardization and Scaling.mp4
    04:48
  • 07. Demo-Normalization.mp4
    02:53
  • 08. Demo-Converting Continuous Values to Categorical Values.mp4
    02:00
  • 09. Demo-Tokenizing Text Data.mp4
    02:32
  • 10. Demo-Label Encoding and One-hot Encoding.mp4
    04:43
  • 11. Demo-Feature Selection.mp4
    05:42
  • 12. Quick Overview of Linear Regression.mp4
    04:30
  • 13. Lasso Ridge and Elastic Net Regression.mp4
    04:06
  • 14. Demo-Exploring the Life Expectancy Dataset.mp4
    04:03
  • 15. Demo-Building and Evaluating a Linear Regression Model.mp4
    06:12
  • 16. Demo-Hyperparameter Tuning.mp4
    04:19
  • 17. Quick Overview of Ensemble Learning.mp4
    02:49
  • 18. Averaging and Boosting.mp4
    01:49
  • 19. Machine Learning Pipelines.mp4
    02:51
  • 20. Demo-Exploring the CO2 Emissions Dataset.mp4
    03:44
  • 21. Demo-Random Forest Regression.mp4
    04:47
  • 22. Demo-Gradient Boosted Tree Regression.mp4
    04:38
  • 23. Quick Overview of Logistic Regression.mp4
    06:14
  • 24. Demo-Exploring the Loan Dataset.mp4
    03:22
  • 25. Demo-Logistic Regression.mp4
    04:14
  • 26. Demo-Performing Predictions on Streaming Data.mp4
    04:32
  • 27. Quick Overview of Decision Trees.mp4
    02:37
  • 28. Demo-Exploring the Bank Marketing Campaign Dataset.mp4
    02:40
  • 29. Demo-Decision Tree Classifier.mp4
    06:52
  • 30. Demo-Hyperparameter Tuning with Cross Validation.mp4
    02:44
  • 31. Summary and Further Study.mp4
    01:30
  • Description


    This course teaches you to implement core predictive analytics techniques, such as regression and classification, using the Apache Spark MLlib APIs on Databricks.

    What You'll Learn


      The Spark unified analytics engine is one of the most popular frameworks for big data analytics and processing. Spark offers comprehensive, easy-to-use APIs for machine learning, which you can use to pre-process data and to build predictive models for regression and classification.

      In this course, Predictive Analytics Using Apache Spark MLlib on Databricks, you will learn to implement machine learning models using Spark ML APIs. First, you will understand the different Spark libraries available for machine learning: the older RDD-based library and the newer DataFrame-based library. You will then explore the range of transformers available in Spark for pre-processing data for machine learning, such as scaling and standardization transformers for numeric data, and label encoding and one-hot encoding transformers for categorical data.
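To make the pre-processing steps named above concrete, here is a minimal sketch in plain Python of what standardization, min-max scaling, label encoding, and one-hot encoding compute. It is not the Spark API and is not taken from the course; the function names are hypothetical, and Spark's own transformers have their own defaults (for example, category ordering), so their output may differ.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def label_encode(categories):
    """Map each category to an integer index (assumption: alphabetical order)."""
    index = {cat: i for i, cat in enumerate(sorted(set(categories)))}
    return [index[cat] for cat in categories]


def one_hot(indices, num_categories):
    """Turn each integer index into a 0/1 vector with a single 1."""
    return [[1 if i == idx else 0 for i in range(num_categories)] for idx in indices]


if __name__ == "__main__":
    print(standardize([1.0, 2.0, 3.0]))
    print(min_max_scale([10.0, 20.0, 30.0]))
    encoded = label_encode(["b", "a", "b"])
    print(encoded, one_hot(encoded, 2))
```

Label encoding alone implies an ordering between categories; one-hot encoding removes that implied ordering, which is why the two are often applied in sequence.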

      Next, you will use linear regression and ensemble models such as random forest and gradient boosted trees to build regression models. You will use these models for prediction on batch data. You will also see how to use Spark ML Pipelines to chain together transformers and estimators into a complete machine learning workflow.
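The idea of chaining transformers and estimators can be sketched in plain Python. This is an illustration of the pattern only, not Spark's implementation or the course's code: all class names below are hypothetical, and the data is a plain list rather than a DataFrame. A transformer only has `transform()`; an estimator has `fit()`, which learns something from the data and returns a fitted transformer.

```python
class Scale:
    """Transformer: stateless, multiplies every value by a fixed factor."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, values):
        return [v * self.factor for v in values]


class CenterModel:
    """Fitted transformer produced by Center.fit()."""

    def __init__(self, mu):
        self.mu = mu

    def transform(self, values):
        return [v - self.mu for v in values]


class Center:
    """Estimator: fit() learns the mean and returns a CenterModel."""

    def fit(self, values):
        return CenterModel(sum(values) / len(values))


class PipelineModel:
    """A fitted pipeline: every stage is a transformer, applied in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, values):
        for stage in self.stages:
            values = stage.transform(values)
        return values


class Pipeline:
    """Runs stages in order, fitting estimators on the data seen so far."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, values):
        fitted, current = [], values
        for stage in self.stages:
            # Estimators are fitted first; plain transformers are used as-is.
            model = stage.fit(current) if hasattr(stage, "fit") else stage
            current = model.transform(current)
            fitted.append(model)
        return PipelineModel(fitted)


if __name__ == "__main__":
    model = Pipeline([Scale(2.0), Center()]).fit([1.0, 2.0, 3.0])
    print(model.transform([1.0, 2.0, 3.0]))
```

The benefit of the pattern is that whatever was learned during fitting (here, the mean) is reused unchanged when the fitted pipeline is applied to new data.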

      Finally, you will implement classification models using logistic regression as well as decision trees. You will train the ML model using batch data but perform predictions on streaming data. You will also use hyperparameter tuning and cross-validation to find the best model for your data.
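Hyperparameter tuning with cross-validation can be sketched as a small grid search in plain Python. This is an illustrative sketch under stated assumptions, not the course's code: the model is a hypothetical one-feature ridge regression without an intercept, the folds are contiguous rather than shuffled, and all function names are made up for the example. In Spark ML, the corresponding building blocks are `ParamGridBuilder` and `CrossValidator`.

```python
def k_fold_splits(n, k):
    """Split range(n) into k contiguous folds; return (train, test) index lists."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits


def fit_ridge(xs, ys, reg):
    """Closed-form slope for y ~ w * x with L2 penalty reg (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + reg)


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x against y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(xs, ys, reg_grid, k=3):
    """Return the reg value with the lowest mean held-out MSE, plus all scores."""
    splits = k_fold_splits(len(xs), k)
    scores = {}
    for reg in reg_grid:
        errors = []
        for train, test in splits:
            w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], reg)
            errors.append(mse(w, [xs[i] for i in test], [ys[i] for i in test]))
        scores[reg] = sum(errors) / len(errors)
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    xs = [float(i) for i in range(1, 10)]
    ys = [2.0 * x for x in xs]  # noise-free data, so no regularization is needed
    best, scores = cross_validate(xs, ys, [0.0, 1.0, 10.0])
    print(best, scores)
```

Each candidate value is scored only on data the model did not see during fitting, which is what makes the comparison between candidates fair.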

      When you’re finished with this course, you’ll have the skills and knowledge needed to build ML models with Spark MLlib and use them for predictive analytics.

    Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework. After spending years working in tech in the Bay Area, New York, and Singapore at companies such as Microsoft, Google, and Flipkart, Janani decided to combine her love for technology with her passion for teaching. She is now the co-founder of Loonycorn, a content studio focused on providing high-quality content for technical skill development. Loonycorn is working on developing an engine (patent filed) to automate animations for presentations and educational content.
    Pluralsight, LLC is an American privately held online education company that offers a variety of video training courses for software developers, IT administrators, and creative professionals through its website. Founded in 2004 by Aaron Skonnard, Keith Brown, Fritz Onion, and Bill Williams, the company has its headquarters in Farmington, Utah. As of July 2018, it uses more than 1,400 subject-matter experts as authors, and offers more than 7,000 courses in its catalog. Since first moving its courses online in 2007, the company has expanded, developing a full enterprise platform, and adding skills assessment modules.
    • Language: English
    • Training sessions: 31
    • Duration: 1:57:08
    • Level: Advanced
    • Release date: 2023/12/15