SKILL BENCHMARK

SRE Proficiency (Advanced Level)

32m
32 questions

The SRE Proficiency benchmark measures whether a learner has had extensive exposure to SRE technologies, practices, and principles across multiple platforms. A learner who scores high on this benchmark demonstrates professional proficiency in all of the major areas of SRE operations, across a variety of platforms and deployments.

Topics covered

define the concept of criticality, name four criticality values, and identify the purpose of criticality and each value
define the mean time between failures (MTBF) metric and outline when and how to use it for SRE work
define the mean time to resolve (MTTR) metric and outline when and how to use it for SRE work
define the mean time to respond (MTTR) metric and describe why it might be used in SRE
define what is meant by cascading failures and identify situations in which this term is used
define what is meant by operational loads, list their types, and describe how they relate to optimal performance
define what is meant by resource exhaustion and describe its consequences
describe how automation processes can vary
describe how server overloads can lead to cascading failures
describe the features and benefits of the mean time to failure (MTTF) metric and outline how to use it in SRE work
describe the purpose and characteristics of utilization signals
determine which factors are the root cause of a problem
differentiate between load shedding and graceful degradation
differentiate between SRE and DevOps
list CPU considerations as they relate to failures and overutilization
list factors that can contribute to memory exhaustion
list the core tenets of SRE
list the potential consequences of overloads, including serious illness to staff
outline how to prevent server overloads
outline processes for working with overload errors
outline steps to ensure efficient queue management
outline steps to mitigate overloads
provide an overview of common pitfalls associated with troubleshooting systems
provide an overview of Service Level Agreements
provide an overview of Service Level Objectives
provide an overview of Site Reliability Engineering
provide an overview of the primary goals of a post-mortem philosophy
provide an overview of Service Level Indicators
recognize how file descriptors and threads can directly lead to failures
recognize how resource exhaustion can lead to service unavailability
recognize how resource exhaustion can travel from one resource to another
recognize the nine principles of Site Reliability Engineering

RECENTLY ADDED COURSES

Course

Site Reliability: Tools & Automation

(390)

Course

Site Reliability: Engineering

(1244)

Course

Site Reliability Engineer: Managing Overloads

(247)
