Data Pipelines with Snowflake and Streamlit
Marcos Vinicius Oliveira
5:17:27
Description
Build a data engineering pipeline in Snowflake for Kaggle and Google Trends data, using Python procedures and tasks
What You'll Learn
- Set up Snowflake and AWS accounts
- Work with Kaggle and SerpAPI
- Download and manipulate data with Jupyter Notebooks on VS Code
- Work with External Access Integration and Storage Integration on Snowflake
- Create Python-based Snowflake procedures
- Create Snowflake tasks
- Create Streamlit apps inside of Snowflake
This course focuses on building a data engineering pipeline that integrates multiple data sources, including Kaggle datasets and Google Trends data (fetched via SerpAPI), to analyze the relationship between Netflix show releases and the popularity of actors. You'll learn to gather and combine data on Netflix actors and their trends on Google, particularly in the weeks following a show's release.
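As a rough illustration of what "the weeks following a show's release" can mean in code, here is a minimal sketch that keeps only the weekly interest scores falling inside a fixed window after a release date. The function name `weeks_after_release`, the four-week window, and all dates and scores are hypothetical and not taken from the course material.

```python
from datetime import date, timedelta


def weeks_after_release(release_date, weekly_interest, weeks=4):
    """Return (week_start, score) pairs whose week starts on or after
    release_date and before release_date + `weeks` weeks."""
    window_end = release_date + timedelta(weeks=weeks)
    return [
        (week_start, score)
        for week_start, score in sorted(weekly_interest.items())
        if release_date <= week_start < window_end
    ]


# Hypothetical release date and weekly search-interest scores for one actor.
release = date(2021, 9, 17)
interest = {
    date(2021, 9, 5): 3,
    date(2021, 9, 12): 5,
    date(2021, 9, 19): 61,
    date(2021, 9, 26): 100,
    date(2021, 10, 3): 88,
    date(2021, 10, 10): 54,
    date(2021, 10, 17): 37,
}

post_release = weeks_after_release(release, interest, weeks=4)
peak_week, peak_score = max(post_release, key=lambda pair: pair[1])
```

In a Snowflake pipeline the same filter would more likely be a date-range join between a releases table and a trends table; the plain-Python version only shows the windowing logic.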
You will use Kaggle as a source for the Netflix shows and actors dataset and Google Trends (accessed via SerpAPI) to fetch real-time search data for the actors. This data will be stored and processed within the Snowflake database, leveraging its cloud-native architecture for optimal scalability and performance.
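The Google Trends step might look roughly like the sketch below. It assumes SerpAPI's `google_trends` engine with a `TIMESERIES` data type and a response containing `interest_over_time.timeline_data`; the helper names, the `"today 3-m"` date range, and the sample payload are illustrative assumptions, so check the field names against the SerpAPI documentation before relying on them. No network call is made here.

```python
from urllib.parse import urlencode

# Assumed SerpAPI search endpoint; verify against the SerpAPI docs.
SERPAPI_ENDPOINT = "https://serpapi.com/search.json"


def build_trends_url(actor_name, api_key, date_range="today 3-m"):
    """Build a request URL for one actor's Google Trends time series."""
    params = {
        "engine": "google_trends",
        "q": actor_name,
        "data_type": "TIMESERIES",
        "date": date_range,
        "api_key": api_key,
    }
    return f"{SERPAPI_ENDPOINT}?{urlencode(params)}"


def parse_interest_over_time(payload):
    """Flatten an (assumed) SerpAPI response into one row per date and query."""
    rows = []
    timeline = payload.get("interest_over_time", {}).get("timeline_data", [])
    for point in timeline:
        for series in point.get("values", []):
            rows.append(
                {
                    "date": point.get("date"),
                    "query": series.get("query"),
                    "interest": series.get("extracted_value"),
                }
            )
    return rows


# Hypothetical payload in the assumed response shape.
sample_payload = {
    "interest_over_time": {
        "timeline_data": [
            {"date": "Sep 19 - 25, 2021",
             "values": [{"query": "Jane Doe", "value": "61", "extracted_value": 61}]},
            {"date": "Sep 26 - Oct 2, 2021",
             "values": [{"query": "Jane Doe", "value": "100", "extracted_value": 100}]},
        ]
    }
}

url = build_trends_url("Jane Doe", api_key="YOUR_API_KEY")
rows = parse_interest_over_time(sample_payload)
```

Inside Snowflake, the actual HTTP request would run from a Python procedure that is allowed to reach the SerpAPI host through an External Access Integration, with the API key held in a secret rather than in code.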
Technical Stack Overview:
- Snowflake Database: The central repository for storing and querying data.
- Streamlit in Snowflake: A web app framework to visualize the data directly inside Snowflake.
- AWS S3: For data storage and retrieval, particularly for intermediate datasets.
- Snowflake Python Procedures: Automating data manipulation and pipeline processes.
- Snowflake External Access & Storage Integrations: Managing secure access to external services and storage.
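To make the procedure and task pieces of the stack more concrete, the sketch below shows the general shape of a Python procedure handler (Snowflake passes a Snowpark session as the first argument) and a helper that builds the `CREATE TASK` statement to run a procedure on a schedule. The table, task, warehouse, and procedure names are hypothetical, the deduplicating query is only a stand-in for real pipeline logic, and identifiers are interpolated without validation, so this is not the course's implementation.

```python
def main(session, source_table, target_table):
    """Handler in the shape of a Snowflake Python procedure.

    `session` is expected to be a Snowpark Session; only its
    sql(...).collect() call is used here.
    """
    session.sql(
        f"CREATE OR REPLACE TABLE {target_table} AS "
        f"SELECT DISTINCT * FROM {source_table}"
    ).collect()
    return f"Rebuilt {target_table} from {source_table}"


def create_task_sql(task_name, warehouse, procedure_call, cron="0 6 * * * UTC"):
    """Build a CREATE TASK statement that calls a procedure on a cron schedule."""
    return (
        f"CREATE OR REPLACE TASK {task_name}\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"  SCHEDULE = 'USING CRON {cron}'\n"
        f"AS\n"
        f"  CALL {procedure_call}"
    )


# Hypothetical object names, for illustration only.
task_sql = create_task_sql(
    task_name="REFRESH_ACTOR_TRENDS",
    warehouse="PIPELINE_WH",
    procedure_call="LOAD_ACTOR_TRENDS('RAW_TRENDS', 'ACTOR_TRENDS')",
)
```

Newly created tasks start out suspended, so a statement such as `ALTER TASK REFRESH_ACTOR_TRENDS RESUME` is needed before the schedule takes effect.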
By the end of the course, you'll have a fully functional data pipeline that processes and combines streaming data, cloud storage, and APIs for trend analysis, visualized through an interactive Streamlit app within Snowflake.
Who this course is for:
- Data engineers who want to become proficient with Snowflake and Streamlit for building data pipelines
- Platform: Udemy
- Language: English
- Training sessions: 38
- Duration: 5:17:27
- Release date: 2025/01/23