Google Professional Cloud DevOps Engineer Certification 2021

Deal Score+1
FREE $9.99 Go To Course


Google Cloud Platform has become one of the most widely recognized cloud platforms. In a short time it has grown into a serious competitor to the established leaders in the market, Amazon Web Services and Microsoft Azure. The Google Professional Cloud DevOps Engineer certification is widely recommended for professionals who work with the platform's analytics, machine learning, and cloud-native services.

Who should take the Google Professional Cloud DevOps Engineer Exam?

The main purpose of the Google Professional Cloud DevOps Engineer exam is to assess how well professionals can apply Google Cloud Platform (GCP) in practice. Candidates are assessed on their ability to run development operations efficiently while balancing service reliability and delivery speed. They should be skilled at using Google Cloud Platform to build software delivery pipelines, deploy and monitor services, and manage and learn from incidents.

The Professional Cloud DevOps Engineer exam is primarily designed for the following professionals:

  • On-premises IT system administrators
  • Cloud solution architects and application developers
  • DevOps professionals with industry experience
  • Aspiring DevOps professionals with limited GCP experience
  • On-premises system engineers

Expertise Validated

The Professional Cloud DevOps Engineer certification is well regarded because it validates a candidate's expertise in:

  • Applying site reliability engineering principles to any service
  • Optimizing service performance
  • Implementing service monitoring strategies
  • Building and implementing CI/CD pipelines for a service
  • Managing service incidents

About Google Cloud Platform Professional Cloud DevOps Engineer

Exam Terms and Conditions


Google considers the disclosure of Confidential Information a clear violation of its Terms. Such a violation can compromise the security and integrity of Google's certification programs. The exams are made available to candidates for the sole purpose of demonstrating their skills and competency in a particular area.

Any violation of these Terms will result in a ban from taking any Google Certification Exam. Google also reserves the right to decertify you and may, at its sole discretion, terminate any business relationship with you and revoke your access to its exam services.

Certification Renewal / Recertification

You must recertify to maintain your certification status. Google Cloud certifications are valid for two years unless the exam description explicitly states otherwise. You can attempt recertification starting 60 days before your certification expires.

Topic 1: Applying site reliability engineering principles to a service

1.1 Balance change, velocity, and reliability of the service:

  • Discover SLIs (availability, latency, etc.) (Google Documentation: SRE fundamentals: SLIs, SLAs, and SLOs, Service Level Objectives)
  • Define SLOs and understand SLAs (Google Documentation: Defining SLOs, SRE fundamentals: SLIs, SLAs, and SLOs, Service Level Objectives)
  • Agree to the consequences of not meeting the error budget (Google Documentation: Consequences of SLO violations—CRE life lessons, Understanding error budget overspend: part one)
  • Construct feedback loops to decide what to build next
  • Toil automation (Google Documentation: Eliminating Toil, Identifying and tracking toil using SRE principles)
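
The relationship between an SLI, an SLO, and the error budget in section 1.1 comes down to simple arithmetic. The sketch below assumes a request-based availability SLO; the function name and the sample figures are illustrative and are not taken from the exam guide.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize error budget usage for a request-based availability SLO.

    slo_target is a fraction such as 0.999 (meaning 99.9% of requests succeed).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The error budget is the share of requests the SLO allows to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    # The SLI is the measured success ratio over the same period.
    sli = 1.0 - failed_requests / total_requests

    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
        "budget_consumed": failed_requests / allowed_failures,
    }


if __name__ == "__main__":
    # Hypothetical month: 1,000,000 requests, 400 failures, 99.9% SLO.
    report = error_budget(0.999, 1_000_000, 400)
    print(round(report["budget_consumed"], 3))  # 0.4
```

When `budget_consumed` passes 1.0, the SLO has been missed for the period, and that is the point at which the agreed consequences of overspending the budget would apply.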

1.2 Manage service life cycle:

  • Manage a service (e.g., introduce a new service, deploy it, maintain and retire it) (Google Documentation: Managing Services)
  • Plan for capacity (e.g., quotas and limits management) (Google Documentation: Quotas & limits)

1.3 Ensure healthy communication and collaboration for operations:

  • Prevent burnout (e.g., set up automation processes to prevent burnout)
  • Foster a learning culture (Google Documentation: DevOps culture: Learning culture)
  • Foster a culture of blamelessness (Google Documentation: DevOps culture: Westrum organizational culture)

Topic 2: Building and implementing CI/CD pipelines for a service

2.1 Design CI/CD pipelines:

  • Immutable artifacts with Container Registry (Google Documentation: Help secure software supply chains on Google Kubernetes Engine, Managing images)
  • Artifact repositories with Container Registry (Google Documentation: Artifact Registry)
  • Deployment strategies with Cloud Build, Spinnaker (Google Documentation: Continuous delivery pipelines with Spinnaker and Google Kubernetes Engine)
  • Deployment to hybrid and multi-cloud environments with Anthos, Spinnaker, Kubernetes (Google Documentation: Heterogeneous Deployment Patterns with Kubernetes, Continuous delivery pipelines with Spinnaker and Google Kubernetes Engine)
  • Artifact versioning strategy with Cloud Build, Container Registry (Google Documentation: Storing build artifacts, Integrating with Cloud Build)
  • CI/CD pipeline triggers with Cloud Source Repositories, Cloud Build GitHub App, Cloud Pub/Sub (Google Documentation: Automating builds with Cloud Build, CI/CD on Google Cloud)
  • Testing a new version with Spinnaker (Google Documentation: Continuous delivery pipelines with Spinnaker and Google Kubernetes Engine)
  • Configure deployment processes (e.g., approval flows) (Google Documentation: Setting up a CI/CD pipeline for your data-processing workflow)
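
One common way to approach immutable artifacts and artifact versioning is to tag each image with the commit it was built from, so a tag never points at different contents. The sketch below builds such a reference in the `HOST/PROJECT/IMAGE:TAG` form used by Container Registry; the helper name, project, and image names are hypothetical, and the seven-character tag mirrors the short commit SHA that Cloud Build exposes as a substitution.

```python
import re


def image_reference(project_id: str, image: str, commit_sha: str, host: str = "gcr.io") -> str:
    """Build an image reference whose tag is derived from the source commit.

    Tagging by commit (rather than a moving tag such as "latest") keeps each
    reference tied to exactly one build input.
    """
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit_sha):
        raise ValueError("commit_sha must be 7 to 40 lowercase hex characters")
    short_sha = commit_sha[:7]
    return f"{host}/{project_id}/{image}:{short_sha}"


if __name__ == "__main__":
    # Hypothetical project and image names.
    sha = "0123456789abcdef0123456789abcdef01234567"
    print(image_reference("my-project", "web", sha))  # gcr.io/my-project/web:0123456
```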

2.2 Implement CI/CD pipelines:

  • CI with Cloud Build (Google Documentation: Cloud Build)
  • CD with Cloud Build (Google Documentation: CI/CD on Google Cloud)
  • Open source tooling (e.g. Jenkins, Spinnaker, GitLab, Concourse) (Google Documentation: Cloud developer tools, Continuous delivery pipelines with Spinnaker and Google Kubernetes Engine)
  • Auditing and tracing of deployments (e.g., CSR, Cloud Build, Cloud Audit Logs) (Google Documentation: Audit logging)

2.3 Manage configuration and secrets:

  • Secure storage methods (Google Documentation: Cloud storage)
  • Secret rotation and config changes (Google Documentation: Secret Manager conceptual overview, Secrets management)

2.4 Manage infrastructure as code:

  • Terraform / Cloud Deployment Manager (Google Documentation: Accelerate GCP Foundation Buildout with automation)
  • Infrastructure code versioning (Google Documentation: DevOps tech: Version control, Versioning)
  • Make infrastructure changes safer (Google Documentation: Google Infrastructure Security Design Overview)
  • Immutable architecture (Google Documentation: Best practices for operating containers)

2.5 Deploy CI/CD tooling:

  • Centralized tools vs. multiple tools (single-tenant vs. multi-tenant) (Google Documentation: Best practices for enterprise multi-tenancy, Cluster multi-tenancy)
  • Security of CI/CD tooling (Google Documentation: CI/CD on Google Cloud, Setting up a CI/CD pipeline for your data-processing workflow)

2.6 Manage different development environments (e.g., staging, production, etc.):

  • Decide on the number of environments and their purpose (Google Documentation: Naming Developer Environments)
  • Create environments dynamically per feature branch with GKE, Cloud Deployment Manager (Google Documentation: GitOps-style continuous delivery with Cloud Build)
  • Local development environments with Docker, Cloud Code, Skaffold (Google Documentation: Kubernetes development, simplified—Skaffold is now GA)

2.7 Secure the deployment pipeline:

  • Vulnerability analysis with Container Registry (Google Documentation: Getting vulnerabilities and metadata for images)
  • Binary Authorization (Google Documentation: Binary Authorization)
  • IAM policies per environment (Google Documentation: Policy)

Topic 3: Implementing service monitoring strategies

3.1 Manage application logs:

  • Collecting logs from Compute Engine, GKE with Stackdriver Logging, Fluentd (Google Documentation: About the Logging agent, Customizing Cloud Logging logs for Google Kubernetes Engine with Fluentd, Configuring the agent)
  • Collecting third-party and structured logs with Stackdriver Logging, Fluentd (Google Documentation: About the Logging agent, Configuring the agent)
  • Sending application logs directly to Stackdriver API with Stackdriver Logging (Google Documentation: Cloud Logging)

3.2 Manage application metrics with Stackdriver Monitoring:

  • Collecting metrics from Compute Engine (Google Documentation: Google Cloud metrics)
  • Collecting GKE/Kubernetes metrics (Google Documentation: Overview of Google Cloud’s operations suite for GKE)
  • Use Metrics Explorer for ad hoc metric analysis (Google Documentation: Metrics Explorer)

3.3 Manage Stackdriver Monitoring platform:

  • Creating a monitoring dashboard (Google Documentation: Creating charts, Managing dashboards through the console)
  • Filtering and sharing dashboards (Google Documentation: Sharing charts, Managing dashboards through the console)
  • Configure third-party alerting in Stackdriver Monitoring (e.g., PagerDuty, Slack) (Google Documentation: Managing notification channels, Introduction to alerting)
  • Define alerting policies based on SLIs with Stackdriver Monitoring (Google Documentation: Specifying conditions for alerting policies, Managing alerting policies)
  • Automate alerting policy definition with Cloud DM or Terraform (Google Documentation: Infrastructure as code, Introduction to alerting)
  • Implementing SLO monitoring and alerting with Stackdriver Monitoring (Google Documentation: Service monitoring, Concepts in service monitoring)
  • Understand Stackdriver Monitoring integrations (e.g., Grafana, BigQuery) (Google Documentation: Grafana and BigQuery)
  • Using SIEM tools to analyze audit/flow logs (e.g., Splunk, Datadog) (Google Documentation: Scenarios for exporting Cloud Logging data: Splunk)
  • Design Stackdriver Workspace strategy (Google Documentation: Workspaces, Managing workspaces)

3.4 Manage Stackdriver Logging platform:

  • Enabling data access logs (e.g., Cloud Audit Logs) (Google Documentation: Cloud Audit Logs, Configuring Data Access audit logs)
  • Enabling VPC flow logs (Google Documentation: Using VPC Flow Logs)
  • Viewing logs in the GCP Console (Google Documentation: Viewing logs (Classic))
  • Using basic vs. advanced logging filters (Google Documentation: Advanced logs queries, Basic logs queries)
  • Implementing logs-based metrics (Google Documentation: Overview of logs-based metrics)
  • Understanding the logging exclusion vs. logging export (Google Documentation: Overview of logs exports, Logs exclusions)
  • Selecting the options for logging export (Google Documentation: Overview of logs exports, Exporting with the Logs Viewer)
  • Implementing a project-level / org-level export (Google Documentation: Best practices for enterprise organizations, Using resource hierarchy for access control)
  • Viewing export logs in Cloud Storage and BigQuery (Google Documentation: Overview of logs exports, Exporting with the Logs Viewer)
  • Sending logs to an external logging platform (Google Documentation: Cloud Logging, Configuring the agent)

3.5 Implement logging and monitoring access control:

  • Set ACL to restrict access to audit logs with IAM, Stackdriver Logging (Google Documentation: Access control guide, Access control, Cloud Audit Logs with Cloud Storage)
  • Set ACL to restrict export configuration with IAM, Stackdriver Logging (Google Documentation: Access control guide, Access control)
  • Set ACL to allow metric writing for custom metrics with IAM, Stackdriver Monitoring (Google Documentation: Access control guide, Access control, Creating custom metrics)

Topic 4: Optimizing service performance

4.1 Identify service performance issues:

  • Evaluate and understand user impact (Stackdriver Service Monitoring for App Engine, Istio) (Google Documentation: Service Monitoring, Concepts in service monitoring)
  • Utilize Stackdriver to identify cloud resource utilization (Google Documentation: Monitoring your Compute Engine footprint with Cloud Functions and Stackdriver)
  • Utilize Stackdriver Trace/Profiler to profile performance characteristics (Google Documentation: Introducing Stackdriver APM and Stackdriver Profiler)
  • Interpret service mesh telemetry (Google Documentation: The service mesh era)
  • Troubleshoot issues with the image/OS (Google Documentation: General troubleshooting)
  • Troubleshoot network issues (e.g., VPC flow logs, firewall logs, latency, view network details) (Google Documentation: VPC Flow Logs overview, Using VPC Flow Logs, Using Firewall Rules Logging)

4.2 Debug application code:

  • Application instrumentation (Google Documentation: Cloud Monitoring)
  • Stackdriver Debugger (Google Documentation: Cloud Debugger)
  • Stackdriver Logging (Google Documentation: Cloud Logging)
  • Stackdriver Trace (Google Documentation: Cloud Trace)
  • Debugging distributed applications (Google Documentation: Introducing Stackdriver APM and Stackdriver Profiler, Cloud Debugger)
  • App Engine local development server (Google Documentation: Using the Local Development Server)
  • Stackdriver Error Reporting (Google Documentation: Error Reporting)
  • Stackdriver Profiler (Google Documentation: Cloud Profiler)

4.3 Optimize resource utilization:

  • Identify resource costs (Google Documentation: Use labels to gain visibility into GCP resource usage and spending)
  • Identify resource utilization levels (Google Documentation: Viewing usage reports, Resource mappings from on-premises hardware to Google Cloud)
  • Develop a plan to optimize areas of greatest cost or lowest utilization (Google Documentation: Cloud cost optimization)
  • Manage preemptible VMs (Google Documentation: Preemptible Virtual Machines)
  • Work with committed-use discounts (Google Documentation: Committed use discounts)
  • TCO considerations (Google Documentation: Best practices for enterprise organizations)
  • Consider network pricing (Google Documentation: All networking pricing)

Topic 5: Managing service incidents

5.1 Coordinate roles and implement communication channels during a service incident:

  • Define roles (incident commander, communication lead, operations lead) (Google Documentation: Incident Response)
  • Handle requests for impact assessment (Google Documentation: How Requests are Handled)
  • Provide regular status updates, internal and external (Google Documentation: IP Addresses, Rolling out updates to MIGs)
  • Record major changes in incident state (When mitigated? When all clear? etc.) (Google Documentation: Incidents and events)
  • Establish communications channels (email, IRC, Hangouts, Slack, phone, etc.) (Google Documentation: Managing notification channels)
  • Scaling the response team and delegation (Google Documentation: Scaling based on Cloud Monitoring metrics)
  • Avoid exhaustion/burnout
  • Rotate / hand over roles
  • Manage stakeholder relationships (Google Documentation: Best practices for enterprise organizations)

5.2 Investigate incident symptoms impacting users with Stackdriver IRM:

  • Identify probable causes of service failure (Google Documentation: Troubleshooting response errors, Errors)
  • Evaluate symptoms against probable causes; rank the probability of each cause based on observed behavior (Google Documentation: Reliability)
  • Perform an investigation to isolate the most likely actual cause (Google Documentation: Local troubleshooting of a Cloud Run service)
  • Identify alternatives to mitigate the issue (Google Documentation: Best practices and reference architectures for VPC design)

5.3 Mitigate incident impact on users:

  • Rollback release (Google Documentation: Reliable releases and rollbacks)
  • Drain/redirect traffic (Google Documentation: Enabling connection draining)
  • Turn off experiment (Google Documentation: Enabling and Disabling Services)
  • Add capacity (Google Documentation: Editing instances)

5.4 Resolve issues (e.g., Cloud Build, Jenkins):

  • Code change/fix bug (Google Documentation: Report Issues and Request Features with Issue Trackers)
  • Verify fix
  • Declare all-clear

5.5 Document issue in a postmortem:

  • Document root causes (Google Documentation: Browsing files, Incidents & The Google Cloud Status Dashboard)
  • Create and prioritize action items (Google Documentation: Initialization actions)
  • Communicate postmortem to stakeholders (Google Documentation: Postmortem Culture)

Who this course is for:

  • All Levels
