This course introduces Apache Iceberg, a high-performance open table format for organizing petabyte-scale analytic datasets on a file system or object store, available on Cloudera Data Warehouse and Cloudera Data Engineering on both Private and Public Cloud. Combined with Cloudera Data Platform, Iceberg enables users to build an open data lakehouse architecture for multi-function analytics and to deploy large-scale end-to-end pipelines. The course covers Iceberg's benefits, architecture, internal operation, read and write operations, and advanced functions, drawing comparisons to Hive throughout and building on the students’ existing knowledge and experience.
Course Objectives:
In this course, you will learn to:
- Gain a deep understanding of Iceberg's benefits, snapshots, and their functionalities.
- Confidently build external and managed tables, configuring copy-on-write and merge-on-read for optimized data management.
- Perform rollbacks and time travel, navigate schema and partition evolution, and utilize hidden partitions.
- Create and merge table branches, mastering Iceberg's write-audit-publish procedure.
- Efficiently perform table maintenance tasks and tackle data migration challenges.
Course content
Introduction
- Apache Hive
- Why Iceberg?
- Data Lakehouses
- What is Iceberg?
Catalogs
- Review Iceberg Catalog Configuration
Iceberg Concepts
- Snapshots
- Metadata Layer: Manifest List, Manifest Files
- Time Travel
- Schema Evolution
- Hidden Partition
- Write-Audit-Publish (WAP)
- Branches, Tags, Zero-Copy-Clone
Iceberg Table Design
- Managed & External Tables
- Table Properties Review
- Copy-On-Write (COW) vs Merge-On-Read (MOR)
- Hidden Partitions
- Compare Hive vs Iceberg Partition Design
- Table Metadata
- Table Maintenance
Data-As-Code
- Iceberg Personas
- Write-Audit-Publish (WAP)
- Branches & Tagging
Hive-to-Iceberg Table Migration
- In-place Migration
- Shallow Migration
Course Prerequisites
- A general knowledge of HDFS and experience with Hive and Spark are required.
Who can attend
- This course is for new and existing customers using Cloudera Data Warehouse or Cloudera Data Engineering on Private or Public Cloud who are interested in benefiting from using Apache Iceberg.
- The course is designed for Data Engineers, Hive SQL Developers, Kafka Streaming Engineers, Data Scientists, and CDP Admins.
Number of Hours: 25hrs
Certification
None
Key features
- One to One Training
- Online Training
- Fast Track & Normal Track
- Resume Modification
- Mock Interviews
- Video Tutorials
- Materials
- Real Time Projects
- Virtual Live Experience
- Preparing for Certification
FAQs
DASVM Technologies offers 300+ IT training courses taught by expert-level trainers with 10+ years of experience.
- One to One Training
- Online Training
- Fast Track & Normal Track
- Resume Modification
- Mock Interviews
- Video Tutorials
- Materials
- Real Time Projects
- Preparing for Certification
Call now: +91-99003 49889 and know the exciting offers available for you!
We work and coordinate exclusively with companies to help our students get placed. Our placement cell focuses on training and placements in Bangalore and helps more than 600 students per year.
Learn from experts active in their field, not out-of-touch trainers: leading practitioners who bring current best practices and case studies to sessions that fit into your work schedule. Our pool of experts and trainers is highly skilled and experienced in supporting you with specific tasks and providing professional support. You get 24x7 learning support from mentors and a community of like-minded peers to resolve any conceptual doubts. Our trainers have contributed to the growth of our clients as well as of individual professionals.
All of our highly qualified trainers are industry experts with at least 10-12 years of relevant teaching experience. Each of them has gone through a rigorous selection process which includes profile screening, technical evaluation, and a training demo before they are certified to train for us. We also ensure that only those trainers with a high alumni rating continue to train for us.
No worries. DASVM Technologies ensures that no one misses a single lecture topic. We will reschedule classes at your convenience within the stipulated course duration. If required, you can also attend that topic with another batch.
DASVM Technologies provides many suitable modes of training to the students like:
- Classroom training
- One to One training
- Fast track training
- Live instructor-led online training
- Customized training
Yes, you will have lifetime access to the course material once you have enrolled in the course.
You will receive a course completion certificate recognized by DASVM Technologies, and our training will help you prepare for global certification exams.
Yes, DASVM Technologies provides corporate training with course customization, learning analytics, cloud labs, certifications, and real-time projects, backed by 24x7 support.
Yes, DASVM Technologies provides group discounts for its training programs. Depending on the group size, we offer discounts as per the terms and conditions.
We accept all major payment options: cash, card (Mastercard, Visa, Maestro, etc.), wallets, net banking, and cheques.
DASVM Technologies has a no-refund policy. Fees once paid will not be refunded. If a candidate is unable to attend a training batch, he/she must reschedule for a future batch. Any balance due must be cleared by the date given. If a trainer cancels or is unavailable, DASVM will arrange training sessions with a backup trainer.
Your access to the Support Team is for a lifetime and is available 24/7. The team will help you resolve queries during and after the course.
Please contact our course advisor at +91-99003 49889, or share your queries by email at info@dasvmtechnologies.com.