Google Professional DevOps Engineer: Applying SRE Practices to a Service

Google Cloud 2024    |    Beginner
  • 15 videos | 1h 40m 33s
  • Earns a Badge
Building and maintaining resilient, high-performing services in Google Cloud Platform (GCP) requires a solid understanding of site reliability engineering (SRE) principles and practices, and of the delicate balance among change, velocity, and reliability. In this course, you will learn the foundations of SRE, its core principles, and how to apply them effectively within GCP environments. Gain insights into defining service-level indicators (SLIs), service-level objectives (SLOs), and service-level agreements (SLAs), while exploring strategies for managing the service life cycle, capacity planning, and autoscaling. Develop an understanding of incident response, postmortem analysis, and toil reduction. Finally, discover how to foster collaboration, prevent burnout, and drive continuous improvement using feedback loops. This course is one of a collection that prepares learners for the Google Professional Cloud DevOps Engineer exam.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Provide an overview of site reliability engineering (SRE)
  • Define the concepts of change, velocity, and reliability in the context of SRE
  • Identify key service-level indicators (SLIs) for cloud services
  • Explain the process of defining service-level objectives (SLOs) and the significance of service-level agreements (SLAs)
  • Identify strategies for managing the service life cycle, including deployment and maintenance
  • Outline the processes of capacity planning and autoscaling in Google Cloud Platform (GCP)
  • Identify methods for ensuring healthy communication and collaboration within DevOps teams
  • Outline strategies for mitigating incident impact on users and services
  • Outline the process of conducting an effective postmortem analysis
  • Outline how to automate toil and the potential impact on service reliability
  • Compare different strategies for preventing burnout among operations teams
  • Analyze a case study focusing on the implementation of SRE practices in a GCP environment
  • Recognize the role of feedback loops in continuous service improvement
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 2m 37s
    In this video, we will discover the key concepts covered in this course.
  • 11m 51s
    After completing this video, you will be able to provide an overview of site reliability engineering (SRE).
  • 3. Change, Velocity, and Reliability
    7m 11s
    Upon completion of this video, you will be able to define the concepts of change, velocity, and reliability in the context of SRE.
  • 4. Service-level Indicators (SLIs) for Cloud Services
    6m 50s
    After completing this video, you will be able to identify key service-level indicators (SLIs) for cloud services.
  • 5. Service-level Objectives (SLOs) and Service-level Agreements (SLAs)
    9m 55s
    Upon completion of this video, you will be able to explain the process of defining service-level objectives (SLOs) and the significance of service-level agreements (SLAs).
  • 6. Strategies for Managing Service Life Cycle
    6m 57s
    After completing this video, you will be able to identify strategies for managing the service life cycle, including deployment and maintenance.
  • 7. Capacity Planning and Autoscaling in Google Cloud Platform (GCP)
    9m
    Upon completion of this video, you will be able to outline the processes of capacity planning and autoscaling in Google Cloud Platform (GCP).
  • 8. Healthy Communication and Collaboration within DevOps Teams
    8m 26s
    After completing this video, you will be able to identify methods for ensuring healthy communication and collaboration within DevOps teams.
  • 9. Mitigating Incident Impact on Users and Services
    5m 11s
    Upon completion of this video, you will be able to outline strategies for mitigating incident impact on users and services.
  • 10. Postmortem Analysis
    5m 25s
    After completing this video, you will be able to outline the process of conducting an effective postmortem analysis.
  • 11. Toil Automation and Service Reliability
    6m 20s
    Upon completion of this video, you will be able to outline how to automate toil and the potential impact on service reliability.
  • 12. Strategies for Preventing Burnout among Operations Teams
    7m 12s
    After completing this video, you will be able to compare different strategies for preventing burnout among operations teams.
  • 13. SRE Practices Case Study
    6m 33s
    In this video, we will analyze a case study focusing on the implementation of SRE practices in a GCP environment.
  • 14. Role of Feedback Loops in Continuous Service Improvement
    6m
    Upon completion of this video, you will be able to recognize the role of feedback loops in continuous service improvement.
  • 15. Course Summary
    1m 8s
    In this video, we will summarize the key concepts covered in this course.

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.

Digital badges are yours to keep, forever.