Site Reliability Engineering: Scenario Planning
SRE | Intermediate
- 21 videos | 1h 11m 11s
- Includes Assessment
- Earns a Badge
Scenario planning helps site reliability engineers strategically prepare for uncertainties that may disrupt or negatively affect services. In this course, you'll explore scenario planning use cases and the strategies utilized to prepare for disasters. You'll examine the functions of Disaster Recovery Testing (DiRT) and Customer Reliability Engineering teams, which help manage the impact of a disaster or disruption. Next, you'll identify disaster recovery testing events and recognize how to plan and design tests for DiRT. You'll move on to describe the production incident lifecycle and how to minimize production incidents. You'll identify unmanaged responses, how to rectify untrained responses, and the activities used to train response teams. Finally, you'll examine how to test people and how they self-organize and interact using various role-playing and test scenarios.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Define scenario planning and identify why it should be part of your strategic plan
- Describe how to use scenario planning and how to create scenarios
- Recognize considerations when scenario planning for a disaster
- Identify potential scenarios to test and prepare for, such as the loss of technical infrastructure or environmental issues
- List common data-related disaster recovery scenarios to plan for
- List common applications-related disaster recovery scenarios to plan for
- Provide an overview of disaster recovery testing events and how they can help identify vulnerabilities in critical systems
- List what to test when designing tests for DiRT
- Recognize how to minimize the potential damage of disruptive DiRT tests
- Provide an overview of the DiRT technical team and the coordination team
- List common components of a DiRT test plan and how creating a template is useful for future test plan proposals
- Outline the functions of a customer reliability engineering team and their role in scenario planning
- Outline the production incident lifecycle and how to lay the foundations to shrink production incidents
- Provide an overview of unmanaged responses
- Describe how to rectify untrained responses
- Recognize hands-on activities used to train response teams
- Describe how DiRT exercises should also test how people organize themselves and interact with each other
- Provide an overview of the "wheel of misfortune" role-playing scenario
- Provide an overview of the dungeon/scenario master and their role in running a test scenario
- Summarize the key concepts covered in this course
IN THIS COURSE
- 1m 50s
- 2m 4s: In this video, you'll learn more about the concept of scenario planning and how to make it part of your organization's decisions going forward. You'll learn that by identifying which predicted situations are most likely to occur, you can be better prepared if and when those situations actually develop. FREE ACCESS
- 3m 9s: In this video, you'll learn how to use scenario planning and how to create your own scenarios. You'll learn that formulating what you consider the four most likely scenarios for your organization is challenging, but most organizations can determine which scenarios are more likely than others. You'll also learn that focusing on two of the driving forces is key when developing scenarios and identifying their impact and implications. FREE ACCESS
- 4m 45s: In this video, you'll learn more about planning for disaster in your organization. You'll learn that one of the most important considerations when planning scenarios is planning for disaster, meaning anything that can cause an interruption to your business. You'll examine some considerations to help ensure you're able to recover your operations with minimal disruption. FREE ACCESS
- 5m 8s: In this video, you'll learn more about planning disaster recovery scenarios. You'll discover that many different situations can constitute a disaster. It's important to try to identify the scenarios that are most likely to occur, but it's equally important to test your strategies for dealing with each of them. This video explores some common disaster scenarios and possible mitigations that can be tested. FREE ACCESS
- 4m 17s: In this video, you'll learn more about the considerations needed when formulating disaster recovery scenarios that are specific to ensuring the ability to recover your data. You'll learn that the term data itself usually refers to anything stored in an unstructured manner, such as documents. Most environments also have structured data, such as relational databases. The first step is to implement backups of both your data and your databases. FREE ACCESS
- 8m 59s: In this video, you'll learn more about considerations when formulating a disaster recovery plan with respect to your applications. You'll learn about common examples such as batch processes, ecommerce websites, and video streaming. These illustrate several different levels of priority in terms of how urgent their recovery might be. FREE ACCESS
- 3m 57s: In this video, you'll learn more about Disaster Recovery Testing, or DiRT. You'll discover that while an organization may feel it is prepared for disasters, this can't really be known until disaster strikes. Instead of waiting for an actual disaster, simulations can be run to test how effective your strategies are. This video outlines the process of Disaster Recovery Testing and what it involves. FREE ACCESS
- 2m 5s: In this video, you'll learn more about specific examples of what to test for when designing disaster recovery tests. You'll learn that these examples can help shed light on the possible scenarios that affect organizations. Initially, you might look at simpler cases involving service-specific testing. This includes ensuring any given service has fault tolerance configured. FREE ACCESS
- 3m 7s
- 2m 5s: In this video, you'll learn more about the two core teams that should be involved with disaster recovery testing. You'll learn about a technical team and a coordination team. The technical team is responsible for the initial design of all tests, as well as evaluating them to determine their effectiveness and the impact on the target systems. Once any given test is deemed ready for implementation, the technical team will also be responsible for monitoring during execution. FREE ACCESS
- 3m 57s: In this video, you'll learn more about disaster recovery test components and how to create a test plan proposal. You'll discover that the proposal should begin by outlining the general information of the test, and that the same proposal format should be used for all tests to ensure that the documentation is always consistent and easily understood by all parties involved. Watch this video to learn more about the specific components covered in this demo. FREE ACCESS
- 4m 28s: In this video, you'll learn more about the Customer Reliability Engineer, or CRE. Like the site reliability engineer, the role is an extension of user support, but particularly with respect to the adoption of cloud services. Many organizations are still hesitant to adopt cloud services, primarily due to the loss of control over hardware and infrastructure, applications, and even data. The CRE exists to address this anxiety. FREE ACCESS
- 4m 41s
- 2m 14s
- 2m
- 4m 8s: In this video, you'll learn more about the methods that are generally the most effective for training your incident response teams. You'll learn that there is value in book learning, video learning, and classroom learning, but none of these is as effective as hands-on experience where failure is concerned. It's a matter of pattern recognition for the trainers. FREE ACCESS
- 2m 21s
- 2m 18s: In this video, you'll learn more about a training method known as the Wheel of Misfortune. This is a role-playing scenario that uses simulated emergencies to test the responses of your teams or those who are in training. You'll learn that there's no specific response strategy in place because everything is done in a low-risk environment. FREE ACCESS
- 2m 17s: In this video, you'll learn more about the role of the Scenario Master, sometimes also called the Dungeon Master, when using the Wheel of Misfortune or other scenario testing methods to train your disaster response teams. You'll learn that there are certain requirements for the Dungeon Master role. If you think of the scenario as something similar to a play, the Dungeon Master is the director. FREE ACCESS
- 1m 23s: In this video, you'll summarize what you've learned in the course. You've discovered the use of scenario planning and strategies to minimize and manage disaster or disruption impact. You also learned about the benefits of disaster recovery testing and Customer Reliability Engineering teams. You explored scenario planning, why and how to use it, and how to create your own scenarios. FREE ACCESS
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.