Databricks with Data Engineering is a course that introduces data engineering on the Databricks platform, covering key tools and frameworks such as Delta Lake, Databricks Workflows, Delta Live Tables, and Unity Catalog. Participants will learn how to ingest, transform, and manage data using Delta Lake, deploy workloads with Databricks Workflows, build efficient pipelines with Delta Live Tables, and apply data governance principles using Unity Catalog. The course includes hands-on labs and real-world applications so that learners develop practical skills for working with Databricks.
Course Objectives:
In this course, you will learn to:
- Ingest, transform, and manage data using Delta Lake.
- Deploy and monitor data workloads with Databricks Workflows.
- Build scalable data pipelines using Delta Live Tables and the Medallion Architecture.
- Apply data governance principles and manage permissions using Unity Catalog.
- Troubleshoot, optimise, and monitor data workflows in Databricks.
Course content
Data Ingestion with Delta Lake
- Delta Lake and Data Objects
- Set Up and Load Delta Tables
- Basic Transformations
- Load Data Lab
- Cleaning Data
- Complex Transformations
- SQL UDFs
- Advanced Delta Lake Features
- Manipulate Delta Tables Lab
Deploy Workloads with Databricks Workflows
- Introduction to Workflows
- Jobs Compute
- Scheduling Tasks with the Jobs UI
- Workflows Lab
- Jobs Features
- Explore Scheduling Options
- Conditional Tasks and Repairing Runs
- Modular Orchestration
- Databricks Workflows Best Practices
Build Data Pipelines with Delta Live Tables
- The Medallion Architecture
- Introduction to Delta Live Tables
- Using the Delta Live Tables UI
- SQL Pipelines
- Python Pipelines
- Delta Live Tables Running Modes
- Pipeline Results
- Pipeline Event Logs
- Optional – Land New Data
Data Management and Governance with Unity Catalog
- Data Governance Overview
- Demo: Populating the Metastore
- Lab: Navigating the Metastore
- Organization and Access Patterns
- Demo: Upgrading Tables to Unity Catalog
- Security and Administration in Unity Catalog
- Databricks Marketplace Overview
- Privileges in Unity Catalog
- Demo: Controlling Access to Data
- Fine-Grained Access Control
- Lab: Migrating and Managing Data in Unity Catalog
Databricks Streaming and Delta Live Tables
- Streaming Data Concepts
- Introduction to Structured Streaming
- Demo: Reading from a Streaming Query
- Streaming from Delta Lake
- Lab: Streaming Query Lab
- Aggregation, Time Windows, Watermarks
- Event Time + Aggregations over Time Windows
- Lab: Stream Aggregation Lab
- Demo: Windowed Aggregation with Watermark
- Data Ingestion Pattern
- Demo: Auto Load to Bronze
- Demo: Stream from Multiplex Bronze
- Quality Enforcement Pattern
- Demo: Quality Enforcement
- Lab: Streaming ETL Lab
Databricks Data Privacy
- Regulatory Compliance
- Data Privacy
- Key Concepts and Components
- Audit Your Data
- Data Isolation
- Demo: Securing Data in Unity Catalog
- Pseudonymization & Anonymization
- Summary & Best Practices
- Demo: PII Data Security
- Capturing Changed Data
- Deleting Data in Databricks
- Demo: Processing Records from CDF and Propagating Changes
- Lab: Propagating Changes with CDF Lab
Databricks Performance Optimization
- DevOps Spark UI Introduction
- Introduction to Designing Foundation
- Demo: File Explosion
- Data Skipping and Liquid Clustering
- Lab: Data Skipping and Liquid Clustering
- Skew
- Shuffles
- Demo: Shuffle
- Spill
- Lab: Exploding Join
- Serialization
- Demo: User-Defined Functions
- Fine-Tuning: Choosing the Right Cluster
- Pick the Best Instance Types
Automated Deployment with Databricks Asset Bundles
- DevOps Review
- Continuous Integration and Continuous Deployment/Delivery (CI/CD) Review
- Demo: Course Setup and Authentication
- Deploying Databricks Projects
- Introduction to Databricks Asset Bundles (DABs)
- Demo: Deploying a Simple DAB
- Lab: Deploying a Simple DAB
- Variable Substitutions in DABs
- Demo: Deploying a DAB to Multiple Environments
- Lab: Deploy a DAB to Multiple Environments
- DAB Project Templates Overview
- Lab: Use a Databricks Default DAB Template
- CI/CD Project Overview with DABs
- Demo: Continuous Integration and Continuous Deployment with DABs
- Lab: Adding ML to Engineering Workflows with DABs
- Developing Locally with Visual Studio Code (VSCode)
- Demo: Using VSCode with Databricks
- CI/CD Best Practices for Data Engineering
- Next Steps: Automated Deployment with GitHub Actions
Course Prerequisites
- Beginner familiarity with basic cloud concepts (virtual machines, object storage, identity management)
- Ability to perform basic code development tasks (create compute, run code in notebooks, use basic notebook operations, import repos from git, etc.)
- Intermediate familiarity with basic SQL concepts (CREATE, SELECT, INSERT, UPDATE, DELETE, WHERE, GROUP BY, JOIN, etc.)
- Intermediate experience with basic SQL concepts such as SQL commands, aggregate functions, filters and sorting, indexes, tables, and views.
- Basic knowledge of Python programming, the Jupyter notebook interface, and PySpark fundamentals.
- A basic understanding of Git version control.
Who can attend
- Data engineers
- Data analysts
- Data scientists
- Professionals involved in designing, building, and maintaining data pipelines.
- Professionals working on data storage solutions.
- Anyone looking to work in the field of data engineering.
- Those seeking to leverage the power of Databricks for efficient data processing.
Number of Hours: 30 hrs
Certification
Key features
- One to One Training
- Online Training
- Fast Track & Normal Track
- Resume Modification
- Mock Interviews
- Video Tutorials
- Materials
- Real Time Projects
- Virtual Live Experience
- Preparing for Certification
FAQs
DASVM Technologies offers 300+ IT training courses taught by expert trainers with 10+ years of experience.
- One to One Training
- Online Training
- Fast Track & Normal Track
- Resume Modification
- Mock Interviews
- Video Tutorials
- Materials
- Real Time Projects
- Preparing for Certification
Call now: +91-99003 49889 to learn about the offers available to you!
We work and coordinate exclusively with companies to help our students get placed. We have a placement cell focused on training and placements in Bangalore. Our placement cell helps more than 600 students per year.
Learn from experts active in their field, not out-of-touch trainers: leading practitioners who bring current best practices and case studies to sessions that fit into your work schedule. Our pool of trainers is made up of highly skilled experts who are experienced in supporting you with specific tasks and providing professional support. You get 24x7 learning support from mentors and a community of like-minded peers to resolve any conceptual doubts. Our trainers have contributed to the growth of our clients as well as professionals.
All of our highly qualified trainers are industry experts with at least 10-12 years of relevant teaching experience. Each of them has gone through a rigorous selection process which includes profile screening, technical evaluation, and a training demo before they are certified to train for us. We also ensure that only those trainers with a high alumni rating continue to train for us.
No worries. DASVM Technologies ensures that no one misses a single lecture topic. We will reschedule classes at your convenience within the stipulated course duration wherever possible. If required, you can also attend that topic with another batch.
DASVM Technologies provides many suitable modes of training to the students like:
- Classroom training
- One to One training
- Fast track training
- Live instructor-led online training
- Customized training
Yes, you will have lifetime access to the course material once you have enrolled in the course.
You will receive a DASVM Technologies recognized course completion certificate, and our training will help you prepare for the global certification.
Yes, DASVM Technologies provides corporate training with course customization, learning analytics, cloud labs, certifications, and real-time projects, with 24x7 support.
Yes, DASVM Technologies provides group discounts for its training programs. Depending on the group size, we offer discounts as per the terms and conditions.
We accept all major payment options: cash, cards (Mastercard, Visa, Maestro, etc.), wallets, net banking, and cheques.
DASVM Technologies has a no-refund policy. Fees once paid will not be refunded. If a candidate is not able to attend a training batch, he/she must reschedule for a future batch. Any balance due must be cleared by the date given. If a trainer cancels or is unavailable to provide training, DASVM will arrange training sessions with a backup trainer.
You have lifetime access to the Support Team, which is available 24/7. The team will help you resolve queries during and after the course.
Please contact our course advisor at +91-99003 49889, or share your queries at info@dasvmtechnologies.com.