Data Engineering with Google Datafusion and Big Query (CDAP)
Cassio Alessandro de Bolba
3:08:15
Description
Your first steps in Data Engineering with Google Data Fusion, a low-code tool with an open-source version (CDAP)
What You'll Learn?
- Understand a bit more about Google Cloud resources
- Use Google Data Fusion as an ETL tool
- Low-code Data Engineering
- ETL
- Create Data Pipelines and DAGs
- Read and write data in Google BigQuery
- Read and write data in Google Cloud Storage
- Data Transformations with low code and queries
- Some Advanced SQL commands
This is an INTRODUCTORY course to Google Cloud's low-code ingestion tool, Data Fusion. Google Data Fusion is a fully managed data integration platform that allows data engineers to efficiently create, deploy, and manage data pipelines.
One of the main reasons to use Google Data Fusion is its ease of use. With an intuitive and visual interface, data engineers can create complex data pipelines without the need for extensive coding. The drag-and-drop interface simplifies the process of data transformation and cleansing, allowing professionals to focus on business logic rather than worrying about detailed coding.
Another significant benefit of Google Data Fusion is its scalability. The platform runs on Google Cloud, which means it can handle large volumes of data with high-performance parallel processing. Data engineers can scale their processing capacity vertically or horizontally according to project needs as data volumes grow.
Furthermore, Google Data Fusion seamlessly integrates with other services and products in the Google Cloud ecosystem. Data engineers can easily connect and integrate data pipelines with services such as BigQuery, Cloud Storage, Pub/Sub, and many others. This enables a cohesive and unified data architecture, facilitating data ingestion, storage, and analysis across multiple platforms.
In this course, you will learn:
How Data Fusion works internally.
What its benefits are.
How to create a Data Fusion instance.
How to use Google Cloud Storage as a data input.
How to use BigQuery as a Data Lake (Bronze and Silver layers).
Advanced features of BigQuery: partitioned tables and the MERGE command.
How to ingest data from different sources.
How to transform data with Wrangler (low code) and queries.
How to create DAGs for data ETL (Extract, Transform, Load) and dependencies.
How to handle scheduling and inter-DAG dependencies.
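To give a feel for the two BigQuery features named above, here is a minimal sketch in Python that only builds the SQL text for a partitioned table and for a MERGE (upsert) from a Bronze table into a Silver table. It is not taken from the course material: the helper names, the table names, and the column names are all hypothetical, and the generated statements would still need to be run in BigQuery (for example from the console or from a pipeline action).

```python
def build_partitioned_table_ddl(table, columns, partition_column):
    """Compose a CREATE TABLE statement partitioned by a DATE column.

    `columns` is a list of (name, type) pairs; all names are illustrative.
    """
    cols = ",\n  ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS `{table}` (\n  {cols}\n)\n"
        f"PARTITION BY {partition_column}"
    )


def build_merge_sql(target, source, key_columns, update_columns):
    """Compose a MERGE statement that upserts `source` rows into `target`.

    Rows matching on the key columns are updated; all others are inserted.
    """
    on_clause = " AND ".join(f"T.{c} = S.{c}" for c in key_columns)
    set_clause = ", ".join(f"{c} = S.{c}" for c in update_columns)
    all_columns = list(key_columns) + list(update_columns)
    insert_cols = ", ".join(all_columns)
    insert_vals = ", ".join(f"S.{c}" for c in all_columns)
    return (
        f"MERGE `{target}` AS T\n"
        f"USING `{source}` AS S\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN\n"
        f"  UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN\n"
        f"  INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


if __name__ == "__main__":
    # Hypothetical Silver table, partitioned by its update date.
    print(build_partitioned_table_ddl(
        "demo.silver_customers",
        [("customer_id", "INT64"), ("name", "STRING"), ("updated_at", "DATE")],
        "updated_at",
    ))
    print()
    # Hypothetical upsert from the Bronze layer into the Silver layer.
    print(build_merge_sql(
        "demo.silver_customers",
        "demo.bronze_customers",
        ["customer_id"],
        ["name", "updated_at"],
    ))
```

The MERGE pattern is what keeps a Silver table free of duplicates when the same Bronze data is loaded more than once, while partitioning by date limits how much data each query has to scan.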
Who this course is for:
- Data Engineers
- Data Analysts
- Data Scientists
- Analytics Engineers
- Low Code Developers
- Python Developers looking to reduce coding overhead
- Open Source Fans
Udemy
- Language: English
- Training sessions: 27
- Duration: 3:08:15
- Release date: 2023/07/11