
Basics to Advanced: Azure Synapse Analytics Hands-On Project


Shanmukh Sattiraju

18:39:55

  • 1. Introduction.mp4
    06:31
  • 2. Project Architecture.mp4
    05:25
  • 3.1 Synapse Project Deck.pdf
  • 3. Course Slides.html
  • 1. Section Introduction.mp4
    00:42
  • 2. Need of separate Analytical system.mp4
    04:54
  • 3. OLAP vs OLTP.mp4
    04:02
  • 4. A typical Datawarehouse.mp4
    02:04
  • 5. Datalake Introduction.mp4
    01:54
  • 6. Modern datawarehouse and its problem.mp4
    08:06
  • 7. The solution - Azure Synapse Analytics and its Components.mp4
    04:58
  • 8. Azure Synapse Analytics - A Single stop solution.mp4
    10:18
  • 9. Section Summary.mp4
    00:36
  • 1. Section Introduction.mp4
    00:40
  • 2. Creating a resource group in Azure.mp4
    02:45
  • 3. Create Azure Synapse Analytics Service.mp4
    06:50
  • 4. Exploring Azure Synapse Analytics.mp4
    07:50
  • 5. Understanding the dataset.mp4
    03:51
  • 1. Section Introduction.mp4
    01:26
  • 2. Serverless SQL Pool - Introduction.mp4
    03:24
  • 3. Serverless SQL Pool - Architecture.mp4
    03:57
  • 4. Serverless SQL Pool- Benefits and Pricing.mp4
    05:27
  • 5.1 Unemployment.csv
  • 5.2 unemployment.zip
  • 5. Uploading files into Azure Datalake Storage.mp4
    06:36
  • 6.1 1 data exploration.zip
  • 6.2 Openrowset.html
  • 6. Initial Data Exploration.mp4
    14:36
  • 7. How to import SQL scripts or ipynb notebooks to Azure Synapse.mp4
    02:58
  • 8.1 2 fixing collation warning.zip
  • 8. Fixing the Collation warning.mp4
    09:39
  • 9.1 3 creating external datasource.zip
  • 9. Creating External datasource.mp4
    09:13
  • 10.1 4 creating database scoped credential sas.zip
  • 10. Creating database scoped credential Using SAS.mp4
    12:23
  • 11.1 5 creating database scoped credential mi.zip
  • 11. Creating Database scoped cred using MI.mp4
    08:11
  • 12. Deleting existing data sources for cleanup.mp4
    03:51
  • 13. Creating an external file format - Demo.mp4
    05:36
  • 14.1 6 create external file format.zip
  • 14. Creating an External File Format - Practical.mp4
    02:11
  • 15. Creating External DataSource for Refined container.mp4
    01:57
  • 16.1 7 creating external table.zip
  • 16. Creating an External Table.mp4
    12:47
  • 17. End of section.mp4
    00:39
  • 1. Section Introduction.mp4
    00:56
  • 2. Big Data Approach.mp4
    05:51
  • 3. Understanding Hadoop Yarn- Cluster Manager.mp4
    05:26
  • 4. Understanding Hadoop - HDFS.mp4
    04:19
  • 5. Understanding Hadoop - MapReduce Distributed Computing.mp4
    07:11
  • 1. Section Introduction.mp4
    00:49
  • 2. Drawbacks of MapReduce Framework.mp4
    03:24
  • 3. Emergence of Spark.mp4
    04:51
  • 1. Section Introduction.mp4
    00:51
  • 2. Spark EcoSystem.mp4
    06:18
  • 3. Difference between Hadoop & Spark.mp4
    03:37
  • 4. Spark Architecture.mp4
    02:40
  • 5. Creating a Spark Pool & its benefits.mp4
    09:02
  • 6. RDD Overview.mp4
    02:48
  • 7. Functions Lambda, Map and Filter - Overview.mp4
    04:19
  • 8.1 10 understanding rdd in practical.zip
  • 8. Understanding RDD in practical.mp4
    10:53
  • 9. RDD- Lazy loading - Transformations and Actions.mp4
    06:40
  • 10. What is RDD Lineage.mp4
    05:07
  • 11. RDD - Word count program - Demo.mp4
    07:45
  • 12.1 14 word count pyspark program practical.zip
  • 12.2 tonystark.txt
  • 12. RDD - Word count - PySpark Program - Practical.mp4
    11:40
  • 13. Optimization - ReduceByKey vs GroupByKey Explanation.mp4
    07:36
  • 14. RDD - Understanding about Jobs in spark Practical.mp4
    03:44
  • 15. RDD - Understanding Narrow and Wide Transformations.mp4
    04:40
  • 16. RDD- Understanding Stages - Practical.mp4
    06:48
  • 17.1 18 rdd understanding tasks practical.zip
  • 17. RDD- Understanding Tasks Practical.mp4
    06:13
  • 18. Understand DAG , RDD Lineage and Differences.mp4
    08:06
  • 19. Spark Higher level APIs Intro.mp4
    03:53
  • 20.1 2023-01-15 213417.413947.csv
  • 20.2 2023-01-15 213417.413947.zip
  • 20.3 2023-01-15 213417.413947.zip
  • 20.4 dataframe practical.zip
  • 20. Synapse Notebook - Creating dataframes practical.mp4
    16:11
  • 1. Introduction for PySpark Transformations.mp4
    01:41
  • 2.1 1 walkthough on notebook.zip
  • 2. Walkthrough on Notebook , Markdown cells.mp4
    08:38
  • 3.1 Databricks login.html
  • 3.2 Databricks Signup.html
  • 3. Using Free Databricks Community Edition to practise and Save Costs.mp4
    06:33
  • 4.1 2 display and show functions.zip
  • 4. Display and show Functions.mp4
    10:49
  • 5. Stop Spark Session when not in use.mp4
    01:11
  • 6.1 3 select and selectexpr.zip
  • 6. Select and SelectExpr.mp4
    13:52
  • 7.1 4 filter function.zip
  • 7. Filter Function.mp4
    13:36
  • 8. Organizing notebooks into a folder.mp4
    02:04
  • 1.1 1 understanding fillna and nadotfill.zip
  • 1. Understanding fillna and na.fill.mp4
    09:05
  • 2.1 2 handling duplicates and dropna.zip
  • 2. Identifying duplicates using Aggregations.mp4
    10:25
  • 3.1 2 handling duplicates and dropna.zip
  • 3. Handling Duplicates using dropna.mp4
    09:18
  • 4. Organising notebooks into a folder.mp4
    00:34
  • 5. Transformations summary of this section.mp4
    01:20
  • 1.1 3 data transformation and manipulation.zip
  • 1. withColumn to Create Update columns.mp4
    13:49
  • 2.1 3 data transformation and manipulation.zip
  • 2. Transforming and updating column withColumnRenamed.mp4
    06:56
  • 1. What is MSSpark Utilities.mp4
    02:27
  • 2.1 1 mssparkutils env.zip
  • 2. MSSpark Utils - Env utils.mp4
    04:39
  • 3. What is mount point.mp4
    03:16
  • 4.1 2 msspark utils fs mount.zip
  • 4. Creating and accessing mount point in Notebook.mp4
    10:26
  • 5.1 3 msspark utils fs utils.zip
  • 5. All File System Utils.mp4
    14:00
  • 6.1 4 a notebook parent.zip
  • 6. Notebook Utils - Exit command.mp4
    04:32
  • 7.1 Synapse Quotas.html
  • 7. Creating another spark pool.mp4
    07:46
  • 8.1 To Submit ticket for quota increase.html
  • 8. Procedure to increase vCores request (optional).mp4
    01:32
  • 9.1 4 a notebook child.zip
  • 9.2 4 a notebook parent.zip
  • 9. Calling notebook from another notebook.mp4
    02:52
  • 10.1 4 a notebook parent para.zip
  • 10. Calling notebook from another using runtime parameters.mp4
    07:33
  • 11.1 5 magic commands.zip
  • 11. Magic commands.mp4
    06:05
  • 12.1 FAQ.html
  • 12. Attaching two notebooks to a single spark pool.mp4
    07:39
  • 13.1 6 1 accessing mount configuration.zip
  • 13.2 6 mount configuration.zip
  • 13. Accessing Mount points from another notebook.mp4
    11:19
  • 1.1 1 accessing data using temporary views practical.zip
  • 1. Accessing data using Temporary Views - Practical.mp4
    08:29
  • 2. Lake Database - Overview.mp4
    02:41
  • 3.1 2 creating database in lake database.zip
  • 3. Understanding and creating database in Lake Database.mp4
    10:51
  • 4.1 2 creating database in lake database.zip
  • 4. Using Spark SQL in notebook.mp4
    04:54
  • 5.1 3 managed vs external tables.zip
  • 5. Managed vs External tables in Spark.mp4
    13:50
  • 6. Metadata sharing between Spark pool and Serverless SQL Pool.mp4
    06:38
  • 7. Deleting unwanted folders.mp4
    01:15
  • 1.1 Education and Expected Salary ranges.csv
  • 1.2 Education Details.csv
  • 1.3 Salary Details.csv
  • 1. Uploading required files for Joins.mp4
    02:00
  • 2.1 1 understanding joins and union.zip
  • 2. Python notebooks till Union.html
  • 3. Inner join.mp4
    08:02
  • 4. Left Join.mp4
    02:46
  • 5. Right Join.mp4
    02:24
  • 6. Full outer join.mp4
    02:43
  • 7. Left Semi Join.mp4
    04:02
  • 8. Left anti and Cross Join.mp4
    03:28
  • 9. Union Operation.mp4
    03:10
  • 10.1 2 performing join transformation.zip
  • 10. Performing Join Transformation on Project Dataset.mp4
    05:02
  • 11. Summary of Transformations performed.mp4
    01:01
  • 1. Replace function to change spaces.mp4
    04:44
  • 2.1 1 string manipulation and sorting.zip
  • 2. PySpark Notebook for this section.html
  • 3. Split and concat functions.mp4
    09:21
  • 4. Order by and sort.mp4
    07:30
  • 5. Section Summary.mp4
    01:31
  • 1. Row number function.mp4
    07:54
  • 2.1 1 window functions.zip
  • 2. PySpark Notebook used in this section.html
  • 3. Rank Function.mp4
    04:47
  • 4. Dense Rank function.mp4
    07:25
  • 1. Conversion using cast function.mp4
    09:09
  • 2.1 1 cast and pivoting.zip
  • 2. PySpark Notebook need for casting and pivoting lectures.html
  • 3. Pivot function.mp4
    05:10
  • 4. Unpivot using stack function.mp4
    06:07
  • 5.1 2 to date+function.zip
  • 5.2 Databricks - Datetime Patterns.html
  • 5.3 Microsoft Docs - Date time patterns.html
  • 5.4 Microsoft Docs - Datetime.html
  • 5. Using to date to convert date column.mp4
    08:51
  • 1.1 1 schema definition and management.zip
  • 1. PySpark Notebook used in this lecture.html
  • 2. StructType and StructField - Demo.mp4
    03:05
  • 3. Implementing explicit schema with StructType and StructField.mp4
    13:31
  • 1. User Defined Functions - Demo.mp4
    03:18
  • 2.1 1 udfs.zip
  • 2. Implementing UDFs in Notebook.mp4
    08:48
  • 3.1 1 writing data to processed container.zip
  • 3. Writing transformed data to Processed container.mp4
    03:17
  • 1. Dedicated SQL pool - Demo.mp4
    02:19
  • 2. Dedicated SQL Pool Architecture.mp4
    04:24
  • 3. How distribution takes places based on DWU.mp4
    05:58
  • 4. Factors to consider when choosing dedicated SQL pool.mp4
    02:43
  • 5. Creating Dedicated SQL pool in Synapse.mp4
    03:08
  • 6. Ways to copy data into Dedicated SQL Pool.mp4
    03:47
  • 7.1 1 copy command to get data into dedicated sql pool.zip
  • 7. Copy command to copy to dedicated SQL pool.mp4
    04:55
  • 8. Clustered Columnstore Index (optional).mp4
    02:02
  • 9. Types of Distributions or Sharding patterns.mp4
    06:52
  • 10. Using Pipeline to Copy to dedicated SQL Pool.mp4
    06:57
  • 1. Section Introduction.mp4
    01:18
  • 2. Installing Power BI Desktop.mp4
    01:20
  • 3. Creating report from Power BI Desktop.mp4
    04:22
  • 4. Creating new user in Azure AD for creating workspace (if using personal account).mp4
    04:31
  • 5. Creating a shared workspace in Power BI.mp4
    03:46
  • 6. Publishing report to Shared Workspace.mp4
    01:32
  • 7. Accessing Power BI from Azure Synapse Analytics.mp4
    04:31
  • 8.1 synapse power bi report.zip
  • 8. Download Power BI .pbix file from here.html
  • 9. Creating Dataset and report from Synapse Analytics.mp4
    06:31
  • 10. Concluding the Power BI Section.mp4
    02:41
  • 11. Summary and end of project implementation.mp4
    02:25
  • 1. Optimisation Section Intro.mp4
    00:56
  • 2.1 cache.csv
  • 2.2 partition.zip
  • 2.3 Unemployment collect.csv
  • 2.4 Unemployment inferschema.csv
  • 2. Uploading required files for Optimisation.mp4
    01:45
  • 3. Spark Optimisation levels.mp4
    02:48
  • 4.1 1 optimization avoid collect.zip
  • 4. Avoid using Collect function.mp4
    07:37
  • 5. Making notebook into particular folder.mp4
    01:22
  • 6.1 2 avoid infer schema.zip
  • 6. Avoid InferSchema.mp4
    09:34
  • 7. Use Cache Persist 1 - Understanding Serialization and DeSerialization.mp4
    06:31
  • 8. Use Cache Persist 2 - How cache or persist will work - Demo.mp4
    09:11
  • 9.1 3 cache.zip
  • 9. Use Cache Persist 3 - Understanding cache practically.mp4
    09:47
  • 10. Use Cache Persist 4 - Persist - What is persist and different storage levels.mp4
    03:59
  • 11.1 4 persist.zip
  • 11.2 storage level notes.zip
  • 11. Use Cache Persist - Notebook for persist with all storage levels.html
  • 12. Use Cache Persist 5 - Persist - MEMORY ONLY.mp4
    17:27
  • 13. Use Cache Persist 6 - Persist - MEMORY AND DISK.mp4
    08:18
  • 14. Use Cache Persist 7 - Persist - MEMORY ONLY SER (Scala Only).mp4
    04:00
  • 15. Use Cache Persist 8 - Persist - MEMORY AND DISK SER ( Scala Only).mp4
    02:57
  • 16. Use Cache Persist 9 - Persist - DISK ONLY.mp4
    05:41
  • 17. Use Cache Persist 10 - Persist - OFF HEAP (Scala Only).mp4
    02:05
  • 18. Use Cache Persist 11 - Persist - MEMORY ONLY 2 (PySpark only).mp4
    02:34
  • 19. Use Partitioning 1 - Understanding partitioning - Demo.mp4
    05:24
  • 20.1 4 paritioning.zip
  • 20. Use Partitioning 2 - Understand partitioning - Practical.mp4
    08:35
  • 21. Repartiton and coalesce 1 - Understanding repartition and coalesce - Demo.mp4
    05:51
  • 22. Repartiton and coalesce 2 - Understanding repartition and coalesce - Practical.mp4
    06:43
  • 23. Broadcast variables 1 - Understanding broadcast variables - Demo.mp4
    06:47
  • 24.1 6 broadcast variables.zip
  • 24. Broadcast variables 2 - Implementing broadcast variables in notebook.mp4
    05:53
  • 25. Use Kryo Serializer.mp4
    03:10
  • 1. Section Introduction.mp4
    00:48
  • 2. Drawbacks of ADLS.mp4
    06:08
  • 3. What is Delta lake.mp4
    02:00
  • 4. Lakehouse Architecture.mp4
    06:21
  • 5.1 SchemaManagementDelta.csv
  • 5. Uploading required file for Delta lake.mp4
    01:32
  • 6.1 1 problems in data lake and creating delta lake.zip
  • 6. Problems with Azure Datalake - Practical.mp4
    08:23
  • 7. Creating a Delta lake.mp4
    03:56
  • 8. Understanding Delta format.mp4
    04:50
  • 9.1 2 understanding transaction log file.zip
  • 9. Contents of Transaction Log or Delta log file - Practical.mp4
    18:15
  • 10. Contents of a transaction log demo.mp4
    03:44
  • 11.1 3 creating delta tables using sql by path.zip
  • 11. Creating delta table by Path using SQL.mp4
    21:20
  • 12.1 4 creating delta table in metastore pyspark and sql.zip
  • 12. Creating delta table in Metastore using Pyspark and SQL.mp4
    07:30
  • 13.1 lesscols.zip
  • 13.2 SchemaDifferDataType.csv
  • 13.3 schemaextracolumn1.zip
  • 13. Schema Enforcement - Files required for Understanding Schema Enforcement -.mp4
    00:39
  • 14. What is schema enforcement - Demo.mp4
    05:00
  • 15.1 4 creating delta table in metastore pyspark and sql.zip
  • 15. Schema Enforcement - Practical.mp4
    08:00
  • 16.1 4 creating delta table in metastore pyspark and sql.zip
  • 16. Schema Evolution - Practical.mp4
    05:52
  • 17.1 6 versioning and time travel.zip
  • 17. Versioning and Time Travel.mp4
    19:13
  • 18.1 7 vacuum command.zip
  • 18. Vacuum command.mp4
    13:41
  • 19.1 8 convert to delta lake and checkpoints.zip
  • 19. Convert to Delta command.mp4
    06:29
  • 20.1 8 convert to delta lake and checkpoints.zip
  • 20. Checkpoints in delta log.mp4
    06:48
  • 21. Optimize command - Demo.mp4
    08:27
  • 22.1 9 optimize command.zip
  • 22. Optimize command - Practical.mp4
    15:35
  • 23.1 10 - upsert using merge command.zip
  • 23. Applying UPSERT using MERGE Command.mp4
    09:37
  • 1. Course Conclusion.mp4
    01:14
  • 2. Bonus Lecture.html


    Build a complete project using only Azure Synapse Analytics, focused on PySpark and including Delta Lake and Spark optimizations

    What You'll Learn?


    • Understand Azure Synapse Analytics services through hands-on practice
    • Build a complete basic-to-advanced understanding of Azure Synapse Analytics
    • Gain hands-on experience applying Spark optimization techniques to real-world scenarios, achieving faster insights
    • Understand 50+ of the most commonly used PySpark transformations
    • Acquire a comprehensive library of 45+ PySpark notebooks for data cleansing, enrichment, and transformation
    • Learn hands-on how to build a modern data warehouse using Azure Synapse
    • Explore the capabilities of Spark pools and their role in processing large-scale data workloads
    • Understand how Python is used in Data Engineering
    • Understand and transform data with the Serverless SQL pool
    • Understand the principles and advantages of Delta Lake as a reliable data storage and management solution
    • Learn how Spark evolved and how it has grown
    • Gain insights into the services needed to clear DP-203
    • Create and configure a Serverless SQL pool
    • Create external data sources, external file formats, and external tables in the Serverless SQL pool
    • Configure Spark pools and understand how they work
    • Understand the integration of Power BI with Azure Synapse Analytics
    • Create and work with a Dedicated SQL pool at a high level
    • Optimize your PySpark workloads with Spark optimization techniques
    • Learn the history of data processing before Spark
    • Implement incremental UPSERT using Delta Lake
    • Understand and implement versioning in Delta Lake
    • Implement MSSpark Utils and the uses of its utilities
    • Mount a Data Lake to Synapse notebooks
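
    Several of the outcomes above centre on PySpark DataFrame transformations. As a plain-Python taste of the select-and-filter style involved (no Spark required here; the column names and values are made up for illustration):

```python
# Rows standing in for a tiny DataFrame of unemployment figures.
rows = [
    {"state": "WA", "rate": 4.1},
    {"state": "OR", "rate": 3.5},
    {"state": "CA", "rate": 4.6},
]

# Equivalent in spirit to df.filter(col("rate") > 4).select("state"):
# keep rows whose rate exceeds 4, then project just the state column.
high_rate_states = [r["state"] for r in rows if r["rate"] > 4]

print(high_rate_states)  # prints ['WA', 'CA']
```

    In the course the same shape is expressed with DataFrame methods, which Spark can then optimize and distribute across a pool.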

    Who is this for?


  • Beginners who want to step into the world of Data Engineers
  • Professional Data Engineers who want to advance their data analysis skills
  • Students who are keen to learn Data Analytics
  • Data Engineers who want to learn data warehousing in Cloud using Azure Synapse Analytics
    What You Need to Know?


  • No Azure Synapse Analytics experience needed; you will learn everything you need
  • Basics of Python programming
  • Basics of SQL language


    Description

    Are you ready to revolutionize your data analytics skills? Look no further. Welcome to our comprehensive course, where you'll delve deep into the world of Azure Synapse Analytics with PySpark and emerge equipped with the tools to excel in modern data analysis.

    Unlock the Power of Azure Synapse Analytics!

    18.5+ HOURS OF IN-DEPTH LEARNING CONTENT!

    In this course, we will learn about:

    1. Serverless SQL Pool - Perform flexible querying of structured data and initial data exploration.

    2. Spark Pools - Dive into advanced data processing and analytics with the power of Apache Spark.

    3. Spark SQL - Seamlessly query structured data using Spark's SQL capabilities.

    4. MSSpark Utils - Leverage MSSpark Utilities for enhanced Spark functionality in Synapse.

    5. 50+ PySpark Transformations - Harness over 50 PySpark transformations to manipulate and refine your data.

    6. Dedicated SQL Pool - Serve data efficiently to Power BI for reporting.

    7. Integrating Power BI with Azure Synapse Analytics - Seamlessly connect Power BI for enriched data visualization and insights.

    8. Delta Lake and its features - Integrate Delta Lake for reliable, ACID-compliant data storage.

    9. Spark Optimization Techniques - Employ optimization techniques to enhance Spark processing speed and efficiency.
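
    The UPSERT pattern behind the Delta Lake topic above can be sketched in plain Python. Delta Lake's MERGE command performs this atomically on tables; the keys and rows below are purely illustrative:

```python
# Target table keyed by id, plus a batch of incoming changes.
target = {1: {"state": "WA", "rate": 4.1}, 2: {"state": "OR", "rate": 3.8}}
updates = {2: {"state": "OR", "rate": 3.5}, 3: {"state": "CA", "rate": 4.6}}

# MERGE semantics: update rows whose key matches, insert the rest.
for key, row in updates.items():
    target[key] = row

print(sorted(target))  # prints [1, 2, 3]
```

    The course implements the real thing with Delta Lake's MERGE on a Synapse Spark pool, where matching, updating, and inserting happen transactionally.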


      You will also learn how Python is helpful in data analysis. Our project-based approach ensures hands-on learning, giving you the practical experience needed to conquer real-world data challenges.
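
      As one concrete illustration, here is a plain-Python version of the word-count exercise from the RDD section, built from `lambda`, `map`, and `filter` (the course implements the same idea with PySpark RDDs; the sample text is made up):

```python
from collections import Counter

text = "spark makes big data simple and spark scales"

# Drop short words, lower-case the rest, then count occurrences --
# the same map/filter shape as the RDD word-count program.
words = filter(lambda w: len(w) > 3, text.split())
counts = Counter(map(lambda w: w.lower(), words))

print(counts["spark"])  # prints 2
```

      In Spark the split, filter, and count steps become distributed transformations, but the functional style is identical.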

      While this course does not focus solely on certification, it also builds the practical understanding of the Azure Synapse Analytics service needed to pass DP-203 - "Microsoft Certified: Azure Data Engineer" and DP-500 - "Designing and Implementing Enterprise-Scale Analytics Solutions Using Microsoft Azure and Microsoft Power BI".


      Join me in mastering Azure Synapse Analytics!


    Shanmukh Sattiraju
    An Azure Data Engineer with extensive experience in Azure data engineering services and building ETL pipelines. I have developed expertise in managing large-scale data solutions on the Microsoft Azure cloud platform. My knowledge of and experience with Azure services such as Azure Data Factory, Azure Synapse, and other Azure data engineering services enable me to design and implement robust data pipelines and optimize data processing workflows.
    In addition to my work as a data engineer, I am a passionate blogger and instructor on Udemy. Through my blog and online courses, I share my insights and knowledge of data engineering and related topics with 200+ students on Udemy, helping them build their skills and knowledge in the field.
    As a data professional, I am committed to continuous learning and staying up to date with the latest industry trends and technologies. I am certified with:
    • Microsoft Azure Data Engineer (DP-203)
    • Microsoft Certified Power BI Data Analyst (PL-300)
    • Microsoft Certified Azure Administrator (AZ-104)
    • Databricks Certified Lakehouse Fundamentals
    • AWS Certified Solutions Architect - Associate
    • AWS Certified Cloud Practitioner
    • Microsoft Certified Azure Fundamentals (AZ-900)
    • Microsoft Certified Azure Data Fundamentals (DP-900)
    • Microsoft Certified Azure Security, Compliance, and Identity Fundamentals (SC-900)
    "Evolve ourselves along with the trending technology by learning and enhance the skill set to master it"
    • Language: English
    • Training sessions: 190
    • Duration: 18:39:55
    • Release Date: 2023/10/04