
Master Data Engineering using GCP Data Analytics


Durga Viswanatha Raju Gadiraju,Asasri Manthena

10:58:22

  • 1. Introduction to Data Engineering using GCP Data Analytics.mp4
    04:17
  • 2. Pre-requisites for Data Engineering using GCP Data Analytics.mp4
    01:56
  • 3. Highlights of the Data Engineering using GCP Data Analytics Course.mp4
    02:47
  • 4. Overview of Udemy Platform to take course effectively.mp4
    07:39
  • 5. Refund Policy and Request for Rating and Feedback.mp4
    01:34
  • 1. Review Data Engineering on GCP Folder.mp4
    03:04
  • 2. Setup VS Code Workspace for Data Engineering on GCP.mp4
    03:16
  • 3. Setup and Integrate Python 3.9 venv with VS Code Workspace.mp4
    04:31
  • 1. Introduction to Getting Started with GCP.mp4
    00:59
  • 2. Pre-requisite Skills to Sign up for course on GCP Data Analytics.mp4
    02:03
  • 3. Overview of Cloud Platforms.mp4
    04:00
  • 4. Overview of Google Cloud Platform or GCP.mp4
    03:20
  • 5. Overview of Signing up for GCP Account.mp4
    01:42
  • 6. Create New Google Account using Non Gmail Id.mp4
    02:17
  • 7. Sign up for GCP using Google Account.mp4
    03:16
  • 8. Overview of GCP Credits.mp4
    03:36
  • 9. Overview of GCP Project and Billing.mp4
    02:11
  • 10. Overview of Google Cloud Shell.mp4
    03:28
  • 11. Install Google Cloud SDK on Windows.mp4
    04:38
  • 12. Initialize gcloud CLI using GCP Project.mp4
    03:25
  • 13. Reinitialize Google Cloud Shell with Project id.mp4
    03:04
  • 14. Overview of Analytics Services on GCP.mp4
    02:20
  • 15. Conclusion to Get Started with GCP for Data Engineering.mp4
    01:05
  • 1. Getting Started with Google Cloud Storage or GCS.mp4
    02:53
  • 2. Overview of Google Cloud Storage or GCS Web UI.mp4
    05:25
  • 3. Upload Folders and Files into GCS Bucket using GCP Web UI.mp4
    02:28
  • 4. Review GCS Buckets and Objects using gsutil commands.mp4
    04:19
  • 5. Delete GCS Bucket using Web UI.mp4
    00:59
  • 6. Setup Data Repository in Google Cloud Shell.mp4
    02:17
  • 7. Overview of Data Sets.mp4
    02:58
  • 8. Managing Buckets in GCS using gsutil.mp4
    04:26
  • 9. Copy Data Sets into GCS using gsutil.mp4
    04:02
  • 10. Cleanup Buckets in GCS using gsutil.mp4
    05:19
  • 11. Exercise to Manage Buckets and Files in GCS using gsutil.mp4
    00:45
  • 12. Overview of Setting up Data Lake using GCS.mp4
    00:57
  • 13. Setup Google Cloud Libraries in Python Virtual Environment.mp4
    03:38
  • 14. Setup Bucket and Files in GCS using gsutil.mp4
    04:30
  • 15. Getting Started to manage files in GCS using Python.mp4
    03:28
  • 16. Setup Credentials for Python and GCS Integration.mp4
    02:02
  • 17. Review Methods in Google Cloud Storage Python library.mp4
    02:45
  • 18. Get GCS Bucket Details using Python.mp4
    06:01
  • 19. Manage Blobs or Files in GCS using Python.mp4
    10:11
  • 20. Project Problem Statement to Manage Files in GCS using Python.mp4
    01:31
  • 21. Design to Upload multiple files into GCS using Python.mp4
    02:46
  • 22. Get File Names to upload into GCS using Python glob and os.mp4
    02:32
  • 23. Upload all Files to GCS as blobs using Python.mp4
    04:55
  • 24. Validate Files or Blobs in GCS using Python.mp4
    04:49
  • 25. Overview of Processing Data in GCS using Pandas.mp4
    07:07
  • 26. Convert Data to Parquet and Write to GCS using Pandas.mp4
    04:55
  • 27. Design to Upload multiple files into GCS using Pandas.mp4
    02:23
  • 28. Get File Names to upload into GCS using Python glob and os.mp4
    03:12
  • 29. Overview of Parquet File Format and Schemas JSON File.mp4
    04:49
  • 30. Get Column Names for Dataset using Schemas JSON File.mp4
    07:22
  • 31. Upload all Files to GCS as Parquet using Pandas.mp4
    06:10
  • 32. Perform Validation of Files Copied using Pandas.mp4
    05:19
  • 1. Overview of GCP Cloud SQL.mp4
    04:35
  • 2. Setup Postgres Database Server using GCP Cloud SQL.mp4
    04:12
  • 3. Configure Network for Cloud SQL Postgres Database.mp4
    05:38
  • 4. Validate Client Tools for Postgres on Mac or PC.mp4
    02:12
  • 5. Setup Database in GCP Cloud SQL Postgres Database Server.mp4
    06:06
  • 6. Setup Tables in GCP Cloud SQL Postgres Database.mp4
    04:08
  • 7. Validate Data in GCP Cloud SQL Postgres Database Tables.mp4
    02:49
  • 8. Integration of GCP Cloud SQL Postgres with Python.mp4
    07:30
  • 9. Overview of Integration of GCP Cloud SQL Postgres with Pandas.mp4
    04:19
  • 10. Read Data From Files to Pandas Data Frame.mp4
    07:24
  • 11. Process Data using Pandas Dataframe APIs.mp4
    05:21
  • 12. Write Pandas Dataframe into Postgres Database Table.mp4
    06:58
  • 13. Validate Data in Postgres Database Tables using Pandas.mp4
    05:24
  • 14. Getting Started with Secrets using GCP Secret Manager.mp4
    03:22
  • 15. Configure Access to GCP Secret Manager via IAM Roles.mp4
    04:01
  • 16. Install Google Cloud Secret Manager Python Library.mp4
    01:13
  • 17. Get Secret Details from GCP Secret Manager using Python.mp4
    06:52
  • 18. Connect to Database using Credentials from Secret Manager.mp4
    05:13
  • 19. Stop GCP Cloud SQL Postgres Database Server.mp4
    03:47
  • 1. Getting Started with GCP Dataproc.mp4
    03:57
  • 2. Setup Single Node Dataproc Cluster for Development.mp4
    05:14
  • 3. Validate SSH Connectivity to Master Node of Dataproc Cluster.mp4
    04:58
  • 4. Allocate Static IP to the Master Node VM of Dataproc Cluster.mp4
    04:45
  • 5. Setup VS Code Remote Window for Dataproc VM.mp4
    04:31
  • 6. Setup Workspace using VS Code on Dataproc.mp4
    02:08
  • 7. Getting Started with HDFS Commands on Dataproc.mp4
    04:06
  • 8. Recap of gsutil to manage files and folders in GCS.mp4
    03:37
  • 9. Review Data Sets setup on Dataproc Master Node VM.mp4
    01:47
  • 10. Copy Local Files into HDFS on Dataproc.mp4
    04:21
  • 11. Copy GCS Files into HDFS on Dataproc.mp4
    04:02
  • 12. Validate Pyspark CLI in Dataproc Cluster.mp4
    04:35
  • 13. Validate Spark Scala CLI in Dataproc Cluster.mp4
    03:56
  • 14. Validate Spark SQL CLI in Dataproc Cluster.mp4
    03:38
  • 1. Overview of GCP Dataproc Jobs and Workflow.mp4
    04:52
  • 2. Setup JSON Dataset in GCS for Dataproc Jobs.mp4
    02:55
  • 3. Review Spark SQL Commands used for Dataproc Jobs.mp4
    07:13
  • 4. Run Dataproc Job using Spark SQL.mp4
    04:08
  • 5. Overview of Modularizing Spark SQL Applications for Dataproc.mp4
    03:26
  • 6. Review Spark SQL Scripts for Dataproc Jobs and Workflows.mp4
    04:32
  • 7. Validate Spark SQL Script for File Format Conversion.mp4
    07:00
  • 8. Exercise to convert file format using Spark SQL Script.mp4
    02:20
  • 9. Validate Spark SQL Script for Daily Product Revenue.mp4
    05:22
  • 10. Develop Spark SQL Script to Cleanup Databases.mp4
    03:56
  • 11. Copy Spark SQL Scripts to GCS.mp4
    01:52
  • 12. Run and Validate Spark SQL Scripts in GCS.mp4
    09:51
  • 13. Limitations of Running Spark SQL Scripts using Dataproc Jobs.mp4
    04:45
  • 14. Manage Dataproc Clusters using gcloud Commands.mp4
    05:17
  • 15. Run Dataproc Jobs using Spark SQL Command or Query.mp4
    05:57
  • 16. Run Dataproc Jobs using Spark SQL Scripts.mp4
    09:24
  • 17. Exercises to Run Spark SQL Scripts as Dataproc Jobs using gcloud.mp4
    02:06
  • 18. Delete Dataproc Jobs using gcloud commands.mp4
    03:00
  • 19. Importance of using gcloud commands to manage dataproc jobs.mp4
    01:57
  • 20. Getting Started with Dataproc Workflow Templates using Web UI.mp4
    08:41
  • 21. Review Steps and Design to create Dataproc Workflow Template.mp4
    07:40
  • 22. Create Dataproc Workflow Template and Add Cluster using gcloud Commands.mp4
    07:44
  • 23. Review gcloud Commands to Add Jobs to Dataproc Workflow Templates.mp4
    07:40
  • 24. Add Jobs to Dataproc Workflow Template using Commands.mp4
    05:21
  • 25. Instantiate Dataproc Workflow Template to run the Data Pipeline.mp4
    05:10
  • 26. Overview of Dataproc Operations and Deleting Workflow Runs.mp4
    05:12
  • 27. Run and Validate ELT Data Pipeline using Dataproc.mp4
    06:10
  • 28. Stop Dataproc Cluster.mp4
    02:09
  • 1. Signing up for Databricks on GCP.mp4
    04:21
  • 2. Create Databricks Workspace on GCP.mp4
    04:46
  • 3. Getting Started with Databricks Clusters on GCP.mp4
    03:17
  • 4. Getting Started with Databricks Notebook.mp4
    03:51
  • 5. Overview of Databricks on GCP.mp4
    07:11
  • 6. Overview of Databricks CLI Commands.mp4
    03:12
  • 7. Limitations of Managing DBFS using Databricks CLI.mp4
    03:48
  • 8. Overview of Copying Data Sets into DBFS on GCS.mp4
    02:50
  • 9. Create Folder in GCS using DBFS Commands.mp4
    04:13
  • 10. Upload Data Set into DBFS using GCS Web UI.mp4
    04:08
  • 11. Copy Data Set into DBFS using gsutil.mp4
    02:41
  • 12. Process Data in DBFS using Databricks Spark SQL.mp4
    05:03
  • 13. Getting Started with Spark SQL Example using Databricks.mp4
    04:44
  • 14. Create Temporary Views using Spark SQL.mp4
    06:34
  • 15. Exercise to create temporary views using Spark SQL.mp4
    01:27
  • 16. Spark SQL Query to compute Daily Product Revenue.mp4
    06:10
  • 17. Save Query Result to DBFS using Spark SQL.mp4
    04:25
  • 18. Overview of Pyspark Examples on Databricks.mp4
    01:04
  • 19. Process Schema Details in JSON using Pyspark.mp4
    07:32
  • 20. Create Dataframe with Schema from JSON File using Pyspark.mp4
    06:03
  • 21. Transform Data using Spark APIs.mp4
    04:13
  • 22. Get Schema Details for all Data Sets using Pyspark.mp4
    04:08
  • 23. Convert CSV to Parquet with Schema using Pyspark.mp4
    05:01
  • 1. Overview of Databricks Workflows.mp4
    03:10
  • 2. Pass Arguments to Databricks Python Notebooks.mp4
    03:16
  • 3. Pass Arguments to Databricks SQL Notebooks.mp4
    03:16
  • 4. Create and Run First Databricks Job.mp4
    07:31
  • 5. Run Databricks Jobs and Tasks with Parameters.mp4
    05:40
  • 6. Create and Run Orchestrated Pipeline using Databricks Job.mp4
    06:53
  • 7. Import ELT Data Pipeline Applications into Databricks Environment.mp4
    02:56
  • 8. Spark SQL Application to Cleanup Database and Datasets.mp4
    03:52
  • 9. Review File Format Converter Pyspark Code.mp4
    05:11
  • 10. Review Databricks SQL Notebooks for Tables and Final Results.mp4
    03:57
  • 11. Validate Applications for ELT Pipeline using Databricks.mp4
    07:36
  • 12. Build ELT Pipeline using Databricks Job in Workflows.mp4
    09:22
  • 13. Run and Review Execution details of ELT Data Pipeline using Databricks Job.mp4
    05:00
    Description


    Learn GCS for Data Lake, BigQuery for Data Warehouse, GCP Dataproc and Databricks for Big Data Pipelines

    What You'll Learn?


    • Data Engineering leveraging Services under GCP Data Analytics
    • Setup Development Environment using Visual Studio Code on Windows
    • Building Data Lake using GCS
    • Process Data in the Data Lake using Python and Pandas
    • Build Data Warehouse using Google BigQuery
    • Loading Data into Google BigQuery tables using Python and Pandas
    • Setup Development Environment using Visual Studio Code on Google Dataproc with Remote Connection
    • Big Data Processing or Data Engineering using Google Dataproc
    • Run Spark SQL based applications as Dataproc Jobs using Commands
    • Build Spark SQL based ELT Data Pipelines using Google Dataproc Workflow Templates
    • Run or Instantiate ELT Data Pipelines or Dataproc Workflow Template using gcloud dataproc commands
    • Big Data Processing or Data Engineering using Databricks on GCP
    • Integration of GCS and Databricks on GCP
    • Build and Run Spark based ELT Data Pipelines using Databricks Workflows on GCP
    • Integration of Spark on Dataproc with Google BigQuery
    • Build and Run Spark based ELT Pipeline using Google Dataproc Workflow Template with BigQuery Integration

    Who is this for?


  • Beginner or Intermediate Data Engineers who want to learn GCP Analytics Services for Data Engineering
  • Intermediate Application Engineers who want to explore Data Engineering using GCP Analytics Services
  • Data and Analytics Engineers who want to learn Data Engineering using GCP Analytics Services
  • Testers who want to learn key skills to test Data Engineering applications built using GCP Analytics Services


    Description

    Data Engineering is all about building Data Pipelines to get data from multiple sources into Data Lakes or Data Warehouses, and from there into downstream systems. As part of this course, I will walk you through how to build Data Engineering Pipelines using the GCP Data Analytics Stack. It includes services such as Google Cloud Storage, Google BigQuery, GCP Dataproc, Databricks on GCP, and many more.

    • As part of this course, you will first set up the environment using VS Code on Windows or Mac.

    • Once the environment is ready, you need to sign up for a Google Cloud account. We will provide all the instructions to sign up, including reviewing billing as well as claiming the USD 300 credit.

    • We typically use Cloud Object Storage as a Data Lake. As part of this course, you will learn how to use Google Cloud Storage as a Data Lake and how to manage files in Google Cloud Storage both with commands and with Python. It also covers integration of Pandas with files in Google Cloud Storage.
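    Managing files in GCS with Python, as covered above, is typically done with the google-cloud-storage library. A minimal sketch, assuming credentials are configured via `GOOGLE_APPLICATION_CREDENTIALS`; the bucket and folder names, and the `to_blob_name`/`upload_files` helpers, are illustrative, not the course's exact code:

```python
import glob
import os

def to_blob_name(base_dir: str, file_path: str) -> str:
    """Derive a GCS blob name from a local file path, relative to base_dir."""
    return os.path.relpath(file_path, base_dir).replace(os.sep, "/")

def upload_files(bucket_name: str, base_dir: str) -> None:
    """Upload every file under base_dir into the bucket, preserving folders."""
    from google.cloud import storage  # pip install google-cloud-storage
    client = storage.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS
    bucket = client.bucket(bucket_name)
    for path in glob.glob(os.path.join(base_dir, "**", "*"), recursive=True):
        if os.path.isfile(path):
            bucket.blob(to_blob_name(base_dir, path)).upload_from_filename(path)

# Blob name derivation is pure string logic (no GCP call needed to try it):
print(to_blob_name("data", os.path.join("data", "retail_db", "orders", "part-00000")))
```

To validate the copy afterwards, `client.list_blobs(bucket_name, prefix=...)` can be compared against the local file list, which mirrors the validation steps in the lectures.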

    • GCP provides RDBMS as a service via Cloud SQL. You will learn how to set up a PostgreSQL Database Server using Cloud SQL. Once the Database Server is set up, you will also take care of setting up the required application database and user. You will also understand how to develop Python-based applications that integrate with GCP Secret Manager to retrieve credentials.
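    The Secret Manager integration described above might look roughly like the sketch below, assuming the secret holds the Postgres credentials as a JSON payload. The project id, secret id, and JSON key names are hypothetical placeholders:

```python
import json

def secret_version_name(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fully qualified resource name used by the Secret Manager API."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"

def get_db_credentials(project_id: str, secret_id: str) -> dict:
    """Fetch a JSON secret payload holding Postgres credentials."""
    from google.cloud import secretmanager  # pip install google-cloud-secret-manager
    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(name=secret_version_name(project_id, secret_id))
    return json.loads(response.payload.data.decode("utf-8"))

def connect(project_id: str, secret_id: str):
    """Open a psycopg2 connection using credentials from Secret Manager."""
    import psycopg2  # pip install psycopg2-binary
    creds = get_db_credentials(project_id, secret_id)
    return psycopg2.connect(
        host=creds["host"], port=creds.get("port", 5432),
        dbname=creds["database"], user=creds["user"], password=creds["password"],
    )

# The resource name is pure string logic:
print(secret_version_name("my-project", "retail-db"))
```

Keeping credentials in Secret Manager (rather than hard-coding them) is the design point of this section: the application only needs IAM access to the secret.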

    • One of the key uses of data is building reports and dashboards, which are typically built using reporting tools pointing to a Data Warehouse. Within the Google Data Analytics Services, BigQuery serves as the Data Warehouse. You will learn the features of BigQuery as a Data Warehouse along with key integrations using Python and Pandas.
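    As an illustration of the Pandas-to-BigQuery integration, the sketch below computes a daily product revenue aggregation locally with Pandas and shows how such a frame could be loaded into BigQuery with `load_table_from_dataframe`. The column names follow a typical retail orders schema and are assumptions, not necessarily the course's exact dataset:

```python
import pandas as pd

def daily_product_revenue(order_items: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
    """Revenue per date and product, considering only COMPLETE/CLOSED orders."""
    df = order_items.merge(orders, left_on="order_item_order_id", right_on="order_id")
    df = df[df["order_status"].isin(["COMPLETE", "CLOSED"])]
    return (df.groupby(["order_date", "order_item_product_id"], as_index=False)
              ["order_item_subtotal"].sum()
              .rename(columns={"order_item_subtotal": "revenue"}))

def load_to_bigquery(df: pd.DataFrame, table_id: str) -> None:
    """Load a dataframe into a BigQuery table (table_id: project.dataset.table)."""
    from google.cloud import bigquery  # pip install google-cloud-bigquery pyarrow
    client = bigquery.Client()
    client.load_table_from_dataframe(df, table_id).result()  # wait for the load job

# Tiny sample: order 2 is CANCELED, so only order 1's items count.
orders = pd.DataFrame({"order_id": [1, 2],
                       "order_date": ["2024-01-01", "2024-01-01"],
                       "order_status": ["COMPLETE", "CANCELED"]})
items = pd.DataFrame({"order_item_order_id": [1, 1, 2],
                      "order_item_product_id": [10, 10, 20],
                      "order_item_subtotal": [100.0, 50.0, 30.0]})
print(daily_product_revenue(items, orders))
```

The aggregation is testable entirely locally; only `load_to_bigquery` touches GCP.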

    • At times, we need to process large volumes of data, which is known as Big Data Processing. GCP Dataproc is a fully managed Big Data Service with Hadoop, Spark, Kafka, etc. You will learn not only how to set up a GCP Dataproc cluster, but also how to use a single node Dataproc cluster for development. You will set up a development environment using VS Code with a remote connection to the Dataproc cluster.

    • Once you understand how to get started with Big Data Processing using Dataproc, you will build end-to-end ELT Data Pipelines using Dataproc Workflow Templates. You will learn all the key commands to submit Dataproc Jobs as well as Workflows, and you will end up building ELT Pipelines using Spark SQL.
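    Spark SQL scripts stored in GCS are submitted with `gcloud dataproc jobs submit spark-sql`. A small helper that assembles such a command; the cluster, region, and bucket names are placeholders, and the exact flags should be checked against the gcloud reference:

```python
def spark_sql_job_cmd(cluster, region, script_uri, params=None):
    """Assemble a gcloud command that submits a Spark SQL script as a Dataproc job."""
    cmd = ["gcloud", "dataproc", "jobs", "submit", "spark-sql",
           f"--cluster={cluster}", f"--region={region}", f"--file={script_uri}"]
    if params:  # substituted into the script as ${name}
        cmd.append("--params=" + ",".join(f"{k}={v}" for k, v in params.items()))
    return cmd

cmd = spark_sql_job_cmd("dev-cluster", "us-central1",
                        "gs://my-bucket/scripts/daily_product_revenue.sql",
                        {"bucket_name": "my-bucket"})
print(" ".join(cmd))  # pass the list to subprocess.run(cmd, check=True) to execute
```

Building the command as a list keeps it safe to hand to `subprocess.run` without shell quoting issues; the same pattern extends to `gcloud dataproc workflow-templates instantiate` for running the whole pipeline.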

    • While Dataproc is the GCP-native Big Data Service, Databricks is another prominent Big Data Service available on GCP. You will also understand how to get started with Databricks on GCP.

    • Once you have gone through the details of getting started with Databricks on GCP, you will build end-to-end ELT Data Pipelines using Databricks Jobs and Workflows.

    • Towards the end of the course, when you should be fairly comfortable with BigQuery for Data Warehousing and GCP Dataproc for Data Processing, you will learn how to integrate these two key services by building an end-to-end ELT Data Pipeline using a Dataproc Workflow. You will also understand how to include a Pyspark-based application with the Spark BigQuery connector as part of the Pipeline.
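    A sketch of what the Pyspark-to-BigQuery step might look like with the spark-bigquery connector; the project, dataset, table, view, and bucket names are hypothetical, and `temporaryGcsBucket` assumes the connector's indirect write mode:

```python
def bq_table(project, dataset, table):
    """Fully qualified table id in the form the Spark BigQuery connector expects."""
    return f"{project}.{dataset}.{table}"

def write_daily_product_revenue(spark, gcs_temp_bucket, target_table):
    """Write a Spark dataframe to BigQuery via the spark-bigquery connector."""
    df = spark.table("daily_product_revenue")  # temp view built in earlier steps
    (df.write.format("bigquery")
       .option("table", target_table)
       .option("temporaryGcsBucket", gcs_temp_bucket)  # staging area for the load
       .mode("overwrite")
       .save())

# The table id itself is pure string logic:
print(bq_table("my-project", "retail", "daily_product_revenue"))
```

On Dataproc the connector jar is supplied at submit time (e.g. via the job's `--jars` or the cluster image's bundled connector), so the application code only needs the `format("bigquery")` reader/writer calls.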

    • In the process of building Data Pipelines, you will also review the application development life cycle of Spark and troubleshoot Spark issues using relevant web interfaces such as the YARN Timeline Server, Spark UI, etc.


    Durga Viswanatha Raju Gadiraju
    20+ years of experience executing complex projects using a vast array of technologies, including Big Data and the Cloud. ITVersity, Inc. is a US-based organization that provides quality training for IT professionals, with a track record of training hundreds of thousands of professionals globally. Equipping people to build IT careers with the required tools, such as high-quality material, labs, and live support to upskill and cross-skill, is paramount for the organization. Current training offerings focus on the following areas:
    • Application Development using Python and SQL
    • Big Data and Business Intelligence
    • Cloud
    • Data Warehousing, Databases
    Asasri Manthena
    3+ years of overall experience, currently working with ITVersity, Inc. An MBA graduate from SDA Bocconi School of Management.
    • Language: English
    • Training sessions: 152
    • Duration: 10:58:22
    • Release Date: 2022/12/11