
Business Intelligence with Databricks SQL: Concepts, tools, and techniques for scaling business intelligence on the data lakehouse
Author
Vihag Gupta
Publication
Packt Publishing
In this new era of data platform system design, data lakes and data warehouses are giving way to the lakehouse, a new type of data platform system that aims to unify all data analytics into a single platform. Databricks, with its Databricks SQL product suite, is the hottest lakehouse platform out there, harnessing the power of Apache Spark, Delta Lake, and other innovations to enable data warehousing capabilities on the lakehouse with data lake economics.
This book is a comprehensive hands-on guide that helps you explore all the advanced features, use cases, and technology components of Databricks SQL. You'll start with the lakehouse architecture fundamentals and understand how Databricks SQL fits into it. The book then shows you how to use the platform, from exploring data, executing queries, building reports, and using dashboards through to learning the administrative aspects of the lakehouse: data security, governance, and management of the computational power of the lakehouse. You'll also delve into the core technology enablers of Databricks SQL: Delta Lake and Photon. Finally, you'll get hands-on with advanced SQL commands for ingesting data and maintaining the lakehouse.
By the end of this book, you'll have mastered Databricks SQL and be able to deploy and deliver fast, scalable business intelligence on the lakehouse.
Review
"The book provides a good overview of lakehouse architecture and focuses specifically on business intelligence use cases of lakehouse with features and functions of Databricks SQL.
If you're looking for an extensive hands-on guide on Databricks SQL, this is it. A great resource to get started with DBSQL but also master your knowledge."
--Oleksandra Bovkun, Solutions Architect at Databricks
"The author is an accomplished practitioner and has put together a very comprehensive yet easy reading book detailing how to master SQL on the Databricks Lakehouse Platform. If you are a data/business intelligence practitioner then this is the asset you need to not only learn the concepts but rather to master them. Recommended without reservation, great job Vihag Gupta!"
--Nick Eayrs, Vice President, Field Engineering APJ at Databricks
About the Author
Vihag Gupta is a solutions architect with a specialization in cloud data platform architecture and design. He has a background in data engineering and a professional interest in machine learning. He loves getting hands-on and solving real business problems with technology. He graduated with a degree in information technology from PES University, Bengaluru, in 2011 and earned a degree in information systems management from Carnegie Mellon University, Pittsburgh, in 2016. He has worked at companies including Deloitte Consulting, DataSpark, and Qubole. He currently works at Databricks, helping clients bring their lakehouse platforms for analytics to life.
Originally from Jharkhand, India, Vihag currently lives in Singapore with his wife and dog.
What You Will Learn
- Understand how Databricks SQL fits into the Databricks Lakehouse Platform
- Perform everyday analytics with Databricks SQL Workbench and business intelligence tools
- Organize and catalog your data assets
- Program the data security model to protect and govern your data
- Tune SQL warehouses (computing clusters) for optimal query experience
- Tune the Delta Lake storage format for maximum query performance
- Deliver extreme performance with the Photon query execution engine
- Implement advanced data ingestion patterns with Databricks SQL
Who This Book Is For
This book is for business intelligence practitioners, data warehouse administrators, and data engineers who are new to Databricks SQL and want to learn how to deliver high-quality insights unhindered by the scale of data or infrastructure. This book is also for anyone looking to study the advanced technologies that power Databricks SQL. Basic knowledge of data warehouses, SQL-based analytics, and ETL processes is recommended to effectively learn the concepts introduced in this book and appreciate the innovation behind the platform.
Table of Contents
- Introduction to Databricks
- The Databricks Product Suite: A Visual Tour
- The Data Catalog
- The Security Model
- The Workbench
- The SQL Warehouses
- Using Business Intelligence Tools with Databricks SQL
- The Delta Lake
- The Photon Engine
- Warehouse on the Lakehouse
- SQL Commands, Part 1
- SQL Commands, Part 2
- Playing with the TPC-DS Dataset
- Ask Me Anything