Data Engineering with Google Datafusion and Big Query (CDAP)

Cassio Alessandro de Bolba

3:08:15

57 View
  • 1 - 11 Get to Know the Teacher.mp4
    02:07
  • 2 - 12 Get to Know the Course.mp4
    03:52
  • 3 - 13 Introduction to Google Datafusion.mp4
    08:44
  • 4 - 14 Architecture and Components.mp4
    07:26
  • 5 - 15 Creating a Datafusion Instance.mp4
    05:07
  • 6 - 16 Instance Types and Pricing.mp4
    07:13
  • 7 - 17 Understanding a Datafusion Instance.mp4
    07:35
  • 8 - 21 GCS Object Storage.mp4
    06:04
  • 9 - 22 Big Query as Datalake.mp4
    06:27
  • 10 - 23 Working with Semi Structured Data.mp4
    04:38
  • 11 - 24 Pipeline Studio and Wangler.mp4
    13:40
  • 12 - 25 Preview and Debug.mp4
    07:06
  • 13 - 26 Sinking data on Big Query.mp4
    10:04
  • 14 - ERROR Importing json pipeline from other Datafusion Instance.mp4
    05:59
  • 15 - 27 Branching the Pipeline.mp4
    09:01
  • 16 - 28 Move files.mp4
    08:55
  • 17 - 29 Big Query as Source.mp4
    05:02
  • 18 - 210 Transforming Data with Wrangler 1.mp4
    10:28
  • 19 - 211 Transforming Data with Wrangler 2.mp4
    07:45
  • 20 - 212 Transforming Data with Big Query.mp4
    04:50
  • 21 - 213 Execute Query in Datafusion.mp4
    05:49
  • 22 - 214 Data Partitioning in Big Query.mp4
    07:33
  • 23 - 215 MERGE statement.mp4
    07:28
  • 24 - 216 Delete temp Tables.mp4
    08:35
  • 25 - 217 Scheduling and Pipeline Dependencies.mp4
    05:42
  • 26 - 218 ERRO Quota DISKSTOTALGB Exceed.mp4
    05:31
  • 27 - 219 Challenge.mp4
    05:34
  • Description


    Your first steps in Data Engineering with Google Datafusion, a low-code tool with an open-source version (CDAP)

    What You'll Learn?


    • Understand a bit more about Google Cloud resources
    • Use Google Data Fusion as an ETL tool
    • Low-code data engineering
    • ETL
    • Create Data Pipelines and DAGs
    • Read and write data in Google BigQuery
    • Read and write data in Google Cloud Storage
    • Data Transformations with low code and queries
    • Some Advanced SQL commands

    Who is this for?


  • Data Engineers
  • Data Analysts
  • Data Scientists
  • Analytics Engineers
  • Low Code Developers
  • Python Developers looking to reduce coding overhead
  • Open Source Fans
    What You Need to Know?


  • GCP account
  • Previous exposure to SQL

    Description

    This is an INTRODUCTORY course on Google Cloud's low-code ingestion tool, Data Fusion. Google Data Fusion is a fully managed data integration platform that lets data engineers create, deploy, and manage data pipelines efficiently.

    One of the main reasons to use Google Data Fusion is its ease of use. With an intuitive and visual interface, data engineers can create complex data pipelines without the need for extensive coding. The drag-and-drop interface simplifies the process of data transformation and cleansing, allowing professionals to focus on business logic rather than worrying about detailed coding.

    Another significant benefit of Google Data Fusion is its scalability. The platform runs on Google Cloud, so it can process large volumes of data in parallel. Data engineers can scale their processing capacity vertically or horizontally according to project needs as data volumes grow.

    Furthermore, Google Data Fusion seamlessly integrates with other services and products in the Google Cloud ecosystem. Data engineers can easily connect and integrate data pipelines with services such as BigQuery, Cloud Storage, Pub/Sub, and many others. This enables a cohesive and unified data architecture, facilitating data ingestion, storage, and analysis across multiple platforms.

    In this course, you will learn:

    • How Data Fusion works internally.

    • What its benefits are.

    • How to create a Data Fusion instance.

    • How to use Google Cloud Storage as a data input.

    • How to use BigQuery as a data lake (Bronze and Silver layers).

    • Advanced BigQuery features: partitioned tables and the MERGE statement.

    • How to ingest data from different sources.

    • How to transform data with Wrangler (low code) and with queries.

    • How to create DAGs for ETL (Extract, Transform, Load) and manage their dependencies.

    • How to schedule pipelines and set up inter-DAG dependencies.

    Cassio Alessandro de Bolba
    I'm a self-taught Senior Data Engineer and content creator. I moved from working as a machine operator into the data industry in my 30s. I can help early-career professionals find their path to becoming data professionals, and I can offer advice to those who wish to live abroad and obtain a sponsored visa.

    My current stack:

    • Data Integration / Processing -> Databricks | Dataflow | AWS Lambdas | Datafusion | Data Factory
    • Automation -> Power Platform | Power Automate | Power Apps
    • Databases -> Snowflake | Big Query | SQL Server
    • Data Transformation -> DBT
    • Versioning / Repository -> Git | Azure DevOps
    • Programming -> SQL | Python | PySpark
    • Cloud Providers -> Azure | GCP | AWS
    • Task / Data Orchestration -> Airflow
    • BI -> Power BI | Qlik Sense
    • CI / CD -> GitLab CI
    • Containers -> Docker
    Students take courses primarily to improve job-related skills. Some courses generate credit toward technical certification. Udemy has made a special effort to attract corporate trainers seeking to create coursework for employees of their company.
    • Language: English
    • Training sessions: 27
    • Duration: 3:08:15
    • Release date: 2023/07/11