SKILL BENCHMARK

SRE Competency (Intermediate Level)

25m
42 questions


The SRE Competency benchmark measures whether a learner has project-level exposure to SRE technologies, practices, and principles across multiple platforms. A learner who scores high on this benchmark demonstrates professional competency in all major areas of SRE operations across a variety of platforms and deployments.

Topics covered

define what is meant by a process-induced emergency, describe the effects of such emergencies, and outline how to respond to them
describe common tools used for packaging and releasing services
describe how automation processes can vary
describe the characteristics and purpose of blackbox monitoring
describe the characteristics and purpose of whitebox monitoring
describe the path that the evolution of automation follows
describe the value of automation including consistency, platform, repairs, and time savings
describe what is meant by each one of the 'three Cs' of incident management (coordinate, communicate, and control)
describe why it is vital to keep a history of outages and mistakes and outline best practices when doing so
describe why SREs might carry out reliability testing
determine which factors are the root cause of a problem
differentiate between different tools used to automate functions
differentiate between SRE and DevOps
differentiate between tools used for creation such as GitHub and Subversion
list common Google SRE use cases for automation
list standard factors that can influence software reliability
list the core tenets of SRE
name and describe some common SRE metrics
name the causes and outcomes of change-induced emergencies and outline how to respond to these emergencies
outline the fundamental emergency response principles SREs need to be familiar with and recognize the critical steps to take when a system breaks
outline the process and purpose of logging and name the benefits of text logs
outline what comprises a private cloud, recognize which cloud service models can be delivered in them, describe ways to use them, and distinguish the advantages and disadvantages of their use
outline what's involved in reliability testing and describe testing techniques, such as unit, integration, system, production, stress, and rollouts entangle tests
provide an overview of automation classes and describe the path the evolution of automation follows
provide an overview of common pitfalls associated with troubleshooting systems
provide an overview of planning tools such as JIRA and Pivotal Tracker
provide an overview of Service Level Agreements
provide an overview of Service Level Objectives
provide an overview of Site Reliability Engineering
provide an overview of the primary goals of a post-mortem philosophy
provide an overview of tools used to monitor applications and infrastructure
provide an overview of use cases for automation
provide an overview of Service Level Indicators
recognize how to embrace and manage risk in an environment
recognize how to measure service risk using metrics such as time-based availability and aggregate availability
recognize how to use PowerShell for automation tasks in Windows
recognize the advantages and considerations when automating all the things
recognize the benefits of performing test-induced emergencies and outline what this involves
recognize the importance of incident response planning and the characteristics of incident response plans
recognize the nine principles of Site Reliability Engineering
restate the duties of the prominent job roles involved in incident response (Incident Commander, Communications Lead, and Operations Lead) as well as those of other, supporting roles
summarize the requirements, goals, best practices, job roles, and tools involved in managing and responding to incidents

RECENTLY ADDED COURSES

Course

SRE Troubleshooting: Tools

(265)

Course

Site Reliability: Tools & Automation

(390)

Course

SRE Testing Tasks: Software Reliability & Testing

(67)

Course

Best Practices for the SRE: Use Cases for Automation

(221)

Course

SRE Emergency & Incident Response: Incident Response

(291)

Course

SRE Emergency & Incident Response: Responding to Emergencies

(288)

