Site Reliability: Engineering
SRE | Intermediate
- 13 videos | 1h 5m 8s
- Includes Assessment
- Earns a Badge
Site Reliability Engineers are often considered the link between software development and operations. In this course, you'll explore the principles of site reliability engineering as well as common concerns such as measuring risk, managing risk, and setting risk tolerance. You'll also learn how to ensure a satisfactory level of service by implementing Service Level Objectives, Service Level Agreements, and Service Level Indicators.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Provide an overview of site reliability engineering
- Recognize the nine principles of site reliability engineering
- List the core tenets of SRE
- Differentiate between SRE and DevOps
- Provide an overview of service level indicators
- Provide an overview of service level objectives
- Provide an overview of service level agreements
- Recognize how to embrace and manage risk in an environment
- Recognize how to measure service risk using metrics such as time-based availability and aggregate availability
- Identify the risk tolerance of infrastructure services
- Provide an overview of error budgets
- Summarize the key concepts covered in this course
IN THIS COURSE
-
1m 17s
-
4m 31s
In this video, you'll learn more about the Site Reliability Engineer, or SRE, and what the job involves. You'll learn that although the concept isn't new, the role is becoming increasingly common in today's organizations. The SRE bridges the gap between operations and development to build scalable and highly protected systems.
-
7m 30s
In this video, you'll learn more about the nine core principles of site reliability engineering and the implementation of a DevOps approach to building and maintaining your organization. You'll learn that the first principle is that the primary responsibility of the SRE is writing code. The second is to build a team of site reliability engineers who can draw from a pool of developers to help ensure that the system is reliable. The third deals with recognizing the core capabilities of your development team. Explore these and other key principles by watching this video.
-
9m 13s
In this video, you'll learn more about the core tenets of site reliability engineering. These include managing availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning. Each of them, in one way or another, helps improve the overall reliability of the site, its services, and its applications. The video outlines the ten core activities the SRE will be involved with.
-
6m 47s
In this video, you'll learn more about the distinctions between a DevOps position and a Site Reliability Engineer position. You'll learn that in many cases, the terms are used interchangeably. In some situations, the roles are combined, but in others, they're distinct. You'll learn that the goal of an SRE is to bridge the gap between Dev and Ops, thereby creating a DevOps role.
-
5m 37s
In this video, you'll learn more about the concept of a Service Level Indicator. You'll learn how these indicators deal with specific aspects of a service and how they deliver actual measured values. You'll learn there are several key indicators that are of interest to both the provider and the consumer, including request latency, which measures the actual time it takes to respond to a request.
-
6m
In this video, you'll learn more about the concept of availability, which refers to how well a system can fulfill its purpose and function reliably. Toward that end, the service level objective, or SLO, defines a precise numerical target for availability, which in turn sets a benchmark against which future performance can be compared. Explore this subject by watching this video.
-
3m 48s
In this video, you'll learn more about the Service Level Agreement, or SLA, which is a contract between a provider and a consumer that defines the level of service to be expected. You'll learn that the typical components of an SLA include clearly defined metrics such as the speed of a service, responsibilities in terms of who looks after what, and expectations of what should be done for maintenance and upkeep.
-
3m 8s
-
7m 27s
In this video, you'll learn more about methods for measuring the level of risk associated with a system or a service. This usually starts with establishing a target for a given metric or performance value. When measuring risk, there are many considerations in terms of what might result from a failure, including customer dissatisfaction, a loss of trust, lost revenue, and lost customers. Watch this video to find out more about measuring risk and acceptable levels of unplanned downtime.
-
4m 48s
In this video, you'll learn more about considerations for establishing an acceptable level of risk, or risk tolerance. As in any service or solution, there are often a lot of moving parts, each with different considerations when it comes to risk. You'll discover there is always a risk of physical failure with components such as hardware. This video explores these topics and how to take a top-down approach, focusing on the risk associated with individual components.
-
4m 11s
In this video, you'll learn more about error budgets. An error budget is the amount of acceptable downtime for a given service or system, which can then be spent on developing new features or improvements. By its nature, however, the error budget can give rise to conflicts between the product development teams and the site reliability engineering teams, because the goals of the two teams are inherently at odds. This video outlines these issues.
-
51s
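The availability metrics and error budgets described in the videos above can be illustrated with a short calculation. What follows is a minimal Python sketch, not material from the course: the function names are hypothetical, and it assumes the commonly used definitions (time-based availability as uptime divided by total time, aggregate availability as successful requests divided by total requests, and the error budget as the share of a period left over once the availability target is met).

```python
def time_based_availability(uptime_seconds: float, downtime_seconds: float) -> float:
    """Fraction of the measured period during which the service was up."""
    total = uptime_seconds + downtime_seconds
    if total <= 0:
        raise ValueError("measured period must be positive")
    return uptime_seconds / total


def aggregate_availability(successful_requests: int, total_requests: int) -> float:
    """Fraction of requests that succeeded (request-based availability)."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return successful_requests / total_requests


def error_budget_seconds(slo_target: float, period_seconds: float) -> float:
    """Downtime allowed in the period by an availability target such as 0.999."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    return (1.0 - slo_target) * period_seconds


def remaining_budget_seconds(
    slo_target: float, period_seconds: float, downtime_seconds: float
) -> float:
    """Budget left after observed downtime; negative means the budget is overspent."""
    return error_budget_seconds(slo_target, period_seconds) - downtime_seconds


if __name__ == "__main__":
    thirty_days = 30 * 24 * 3600  # illustrative measurement period, in seconds
    budget = error_budget_seconds(0.999, thirty_days)
    print(f"Allowed downtime at a 99.9% target: {budget / 60:.1f} minutes")
    print(f"Remaining after 10 minutes down: "
          f"{remaining_budget_seconds(0.999, thirty_days, 600) / 60:.1f} minutes")
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43 minutes of downtime. A negative remaining budget is the kind of signal that can prompt a discussion between development and SRE teams about slowing releases.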
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.