Course content
Applying site reliability engineering principles to a service
1.1 Balance change, velocity, and reliability of the service:
- Discover SLIs (availability, latency, etc.)
- Define SLOs and understand SLAs
- Agree to consequences of not meeting the error budget
- Construct feedback loops to decide what to build next
- Toil automation
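The relationship between an SLO and its error budget in topic 1.1 comes down to simple arithmetic, sketched below. This assumes a request-based availability SLI. The function name `error_budget` and the sample numbers are hypothetical and chosen only for illustration.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize error-budget usage for a request-based availability SLO.

    slo_target is a fraction such as 0.999 (i.e. "three nines").
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The budget is the share of requests the SLO allows to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    return {
        "availability": 1.0 - failed_requests / total_requests,
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
        "budget_consumed": failed_requests / allowed_failures,
    }


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 250 failures, 99.9% SLO.
    summary = error_budget(0.999, 1_000_000, 250)
    for key, value in summary.items():
        print(f"{key}: {value:.4f}")
```

When `budget_consumed` approaches 1.0, the agreed consequences (for example, a release freeze) would typically apply. The exact policy is something each team defines for itself.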
1.2 Manage service life cycle:
- Manage a service (e.g., introduce a new service, deploy it, maintain and retire it)
- Plan for capacity (e.g., quotas and limits management)
1.3 Ensure healthy communication and collaboration for operations:
- Prevent burnout (e.g., set up automation processes to prevent burnout)
- Foster a learning culture
- Foster a culture of blamelessness
Building and implementing CI/CD pipelines for a service
2.1 Design CI/CD pipelines:
- Immutable artifacts with Container Registry
- Artifact repositories with Container Registry
- Deployment strategies with Cloud Build, Spinnaker
- Deployment to hybrid and multi-cloud environments with Anthos, Spinnaker, Kubernetes
- Artifact versioning strategy with Cloud Build, Container Registry
- CI/CD pipeline triggers with Cloud Source Repositories, Cloud Build GitHub App, Cloud Pub/Sub
- Testing a new version with Spinnaker
- Configure deployment processes (e.g., approval flows)
2.2 Implement CI/CD pipelines:
- CI with Cloud Build
- CD with Cloud Build
- Open source tooling (e.g. Jenkins, Spinnaker, GitLab, Concourse)
- Auditing and tracing of deployments (e.g., CSR, Cloud Build, Cloud Audit Logs)
2.3 Manage configuration and secrets:
- Secure storage methods
- Secret rotation and config changes
2.4 Manage infrastructure as code:
- Terraform / Cloud Deployment Manager
- Infrastructure code versioning
- Make infrastructure changes safer
- Immutable architecture
2.5 Deploy CI/CD tooling:
- Centralized tools vs. multiple tools (single vs multi-tenant)
- Security of CI/CD tooling
2.6 Manage different development environments (e.g., staging, production, etc.):
- Decide on the number of environments and their purpose
- Create environments dynamically per feature branch with GKE, Cloud Deployment Manager
- Local development environments with Docker, Cloud Code, Skaffold
2.7 Secure the deployment pipeline:
- Vulnerability analysis with Container Registry
- Binary Authorization
- IAM policies per environment
Implementing service monitoring strategies
3.1 Manage application logs:
- Collecting logs from Compute Engine, GKE with Stackdriver Logging, Fluentd
- Collecting third-party and structured logs with Stackdriver Logging, Fluentd
- Sending application logs directly to Stackdriver API with Stackdriver Logging
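As a rough illustration of the structured logs mentioned in topic 3.1, the sketch below writes one JSON object per line to standard output. It assumes a logging agent is configured to parse JSON lines and to treat the `severity` and `message` keys specially, so check the agent's documentation before relying on those key names. The helper names and the `order_id` field are hypothetical.

```python
import json
import sys


def format_log_entry(severity: str, message: str, **fields) -> str:
    """Build a single-line JSON log entry (key names are an assumption)."""
    entry = {"severity": severity, "message": message}
    entry.update(fields)  # extra keys become structured payload fields
    return json.dumps(entry)


def log(severity: str, message: str, **fields) -> None:
    """Write the entry to stdout, where a log agent can pick it up."""
    print(format_log_entry(severity, message, **fields), file=sys.stdout, flush=True)


if __name__ == "__main__":
    log("ERROR", "payment failed", order_id="A-1")
```

Keeping each entry on one line lets line-oriented collectors treat every line as one record.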
3.2 Manage application metrics with Stackdriver Monitoring:
- Collecting metrics from Compute Engine
- Collecting GKE/Kubernetes metrics
- Use metric explorer for ad hoc metric analysis
3.3 Manage Stackdriver Monitoring platform:
- Creating a monitoring dashboard
- Filtering and sharing dashboards
- Configure third-party alerting in Stackdriver Monitoring (e.g., PagerDuty, Slack)
- Define alerting policies based on SLIs with Stackdriver Monitoring
- Automate alerting policy definition with Cloud DM or Terraform
- Implementing SLO monitoring and alerting with Stackdriver Monitoring
- Understand Stackdriver Monitoring integrations (e.g., Grafana, BigQuery)
- Using SIEM tools to analyze audit/flow logs (e.g., Splunk, Datadog)
- Design Stackdriver Workspace strategy
3.4 Manage Stackdriver Logging platform:
- Enabling data access logs (e.g., Cloud Audit Logs)
- Enabling VPC flow logs
- Viewing logs in the GCP Console
- Using basic vs. advanced logging filters
- Implementing logs-based metrics
- Understanding the logging exclusion vs. logging export
- Selecting the options for logging export
- Implementing a project-level / org-level export
- Viewing export logs in Cloud Storage and BigQuery
- Sending logs to an external logging platform
3.5 Implement logging and monitoring access controls:
- Set ACL to restrict access to audit logs with IAM, Stackdriver Logging
- Set ACL to restrict export configuration with IAM, Stackdriver Logging
- Set ACL to allow metric writing for custom metrics with IAM, Stackdriver Monitoring
Optimizing service performance
4.1 Identify service performance issues:
- Evaluate and understand user impact (Stackdriver Service Monitoring for App Engine, Istio)
- Utilize Stackdriver to identify cloud resource utilization
- Utilize Stackdriver Trace/Profiler to profile performance characteristics
- Interpret service mesh telemetry
- Troubleshoot issues with the image/OS
- Troubleshoot network issues (e.g., VPC flow logs, firewall logs, latency, view network details)
4.2 Debug application code:
- Application instrumentation
- Stackdriver Debugger
- Stackdriver Logging
- Stackdriver Trace
- Debugging distributed applications
- App Engine local development server
- Stackdriver Error Reporting
- Stackdriver Profiler
4.3 Optimize resource utilization:
- Identify resource costs
- Identify resource utilization levels
- Develop plan to optimize areas of greatest cost or lowest utilization
- Manage preemptible VMs
- Work with committed-use discounts
- TCO considerations
- Consider network pricing
Managing service incidents
5.1 Coordinate roles and implement communication channels during a service incident:
- Define roles (incident commander, communication lead, operations lead)
- Handle requests for impact assessment
- Provide regular status updates, internal and external
- Record major changes in incident state (When mitigated? When all clear? etc.)
- Establish communications channels (email, IRC, Hangouts, Slack, phone, etc.)
- Scaling response team and delegation
- Avoid exhaustion / burnout
- Rotate / hand over roles
- Manage stakeholder relationships
5.2 Investigate incident symptoms impacting users with Stackdriver IRM:
- Identify probable causes of service failure
- Evaluate symptoms against probable causes; rank probability of cause based on observed behavior
- Perform investigation to isolate most likely actual cause
- Identify alternatives to mitigate issue
5.3 Mitigate incident impact on users:
- Roll back release
- Drain / redirect traffic
- Turn off experiment
- Add capacity
5.4 Resolve issues (e.g., Cloud Build, Jenkins):
- Code change / fix bug
- Verify fix
- Declare all-clear
5.5 Document issue in a postmortem:
- Document root causes
- Create and prioritize action items
- Communicate postmortem to stakeholders
Course Prerequisites
- Although it is a professional-level certification, there are no prerequisites for taking the Professional Cloud DevOps Engineer exam.
- It is, however, a comprehensive and technical exam that requires hands-on experience with, or extensive study of, DevOps and Site Reliability Engineering (SRE) on GCP.
Who can attend
- 3+ years of industry experience including 1+ years managing solutions on GCP.
- DevOps engineers responsible for efficient development operations that balance reliability and delivery speed.
Number of Hours: 40hrs
Certification
Key features
- One to One Training
- Online Training
- Fast Track & Normal Track
- Resume Modification
- Mock Interviews
- Video Tutorials
- Materials
- Real Time Projects
- Virtual Live Experience
- Preparing for Certification
FAQs
DASVM Technologies offers 300+ IT training courses taught by expert-level trainers with 10+ years of experience.
- One to One Training
- Online Training
- Fast Track & Normal Track
- Resume Modification
- Mock Interviews
- Video Tutorials
- Materials
- Real Time Projects
- Preparing for Certification
Call now on +91-99003 49889 to learn about the exciting offers available for you!
We work and coordinate exclusively with companies to get our students placed. Our placement cell focuses on training and placements in Bangalore and helps more than 600 students per year.
Learn from experts active in their field, not out-of-touch trainers: leading practitioners who bring current best practices and case studies to sessions that fit into your work schedule. Our pool of highly skilled and experienced trainers supports you in specific tasks and provides professional guidance. Mentors and a community of like-minded peers offer 24x7 learning support to resolve any conceptual doubts. Our trainers have contributed to the growth of our clients as well as of professionals.
All of our highly qualified trainers are industry experts with at least 10-12 years of relevant teaching experience. Each of them has gone through a rigorous selection process which includes profile screening, technical evaluation, and a training demo before they are certified to train for us. We also ensure that only those trainers with a high alumni rating continue to train for us.
No worries. DASVM Technologies ensures that no one misses a single lecture topic. We will reschedule classes at your convenience within the stipulated course duration. If required, you can also attend that topic with another batch.
DASVM Technologies provides several modes of training, including:
- Classroom training
- One to One training
- Fast track training
- Live instructor-led online training
- Customized training
Yes, access to the course material is available for life once you have enrolled in the course.
You will receive a course completion certificate recognized by DASVM Technologies, and our training will help you pass the global certification exam.
Yes, DASVM Technologies provides corporate training with course customization, learning analytics, cloud labs, certifications, and real-time projects, with 24x7 support.
Yes, DASVM Technologies provides group discounts for its training programs. Depending on the group size, we offer discounts as per the terms and conditions.
We accept all major payment options: cash, cards (Mastercard, Visa, Maestro, etc.), wallets, net banking, and cheques.
DASVM Technologies has a no-refund policy. Fees once paid will not be refunded. If a candidate is unable to attend a training batch, he/she must reschedule for a future batch. Any balance due must be cleared by the date given. If a trainer cancels or is unavailable, DASVM will arrange training sessions with a backup trainer.
Your access to the support team is for life and is available 24/7. The team will help you resolve queries during and after the course.
Please contact our course advisor at +91-99003 49889, or share your queries at info@dasvmtechnologies.com.