Site Reliability: Tools & Automation
SRE | Intermediate
- 14 videos | 52m 53s
- Includes Assessment
- Earns a Badge
There are numerous tools available to Site Reliability Engineers to help with planning, managing, deploying, automating, and monitoring services and infrastructure. In this course, you'll explore these tools as well as some of the benefits of automation and the automation process. You'll also discover common pitfalls and failures, as well as how to manage incident post-mortems.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Provide an overview of planning tools such as Jira and Pivotal Tracker
- Differentiate between tools used for creation such as GitHub and Subversion
- Describe common tools used for packaging and releasing services
- Differentiate between the tools used to automate functions
- Provide an overview of tools used to monitor applications and infrastructure
- Describe the value of automation, including consistency, platform, repairs, and time savings
- Provide an overview of use cases for automation
- Describe the path that the evolution of automation follows
- Describe how automation processes can vary
- Provide an overview of common pitfalls associated with troubleshooting systems
- Provide an overview of the primary goals of a post-mortem philosophy
- Determine which factors are the root cause of a problem
- Summarize the key concepts covered in this course
IN THIS COURSE
- 1m 21s
- 4m 28s: In this video, you'll learn more about the tools commonly used by Site Reliability Engineers in their work. Although there are many tools available, there is no standardized toolset; instead, SRE teams settle on their own internal standards, such as Jira and Pivotal Tracker for planning.
- 2m 54s: In this video, you'll learn more about the tools used for Creation, including GitHub and Subversion. You'll learn that the term Creation here refers to developing and building applications within a site reliability engineering context. Although the SRE may not be as involved as a developer, they still have a role in ensuring that applications are built for easier management.
- 3m 14s: In this video, you'll learn more about the package and release tools used by the site reliability engineer. These include container orchestration services such as Kubernetes, as well as Mesosphere and some other verification tools. The video starts with Kubernetes, a platform for automating the deployment and scaling of containerized applications and for managing them flexibly.
- 3m 37s: In this video, you'll learn more about examples of the configuration tools used by the Site Reliability Engineer. You'll learn how both Terraform and Ansible allow the SRE to automate and manage the configuration of infrastructure and applications. You'll also discover that the goal of the SRE is to automate as much work as possible, which means reducing manual configuration and management tasks.
- 5m 41s: In this video, you'll learn more about Monitoring tools that can be of use to the site reliability engineer. There are many different levels of monitoring, and each type generally involves collecting metrics, either for a specific application or across the entire infrastructure, depending on what is being monitored. This video provides an overview of this concept as well as how flexibility can be implemented through the New Relic Metric API and other API types.
- 5m 10s
- 3m 38s: Site reliability engineers often use automation to scale security and performance. In this video, you'll learn about Use Cases for Automation. There are many processes that are good candidates for automation, and the choice is up to you. You'll start by examining some common examples and then take a brief look at tools (such as Puppet and Chef) that can help you implement your automation easily.
- 4m 40s: Site reliability engineering uses automation to simplify work processes. In this video, you'll learn more about an example of how automation can evolve within an organization. You'll use a simple database that requires failover as an example and look at the evolution of automation from no automation to manual intervention. Next, you'll discover examples of solutions that are implemented externally and are system-specific. Finally, you'll learn about options that don't need any automation in the first place.
- 3m 37s: Site reliability engineering uses automation to simplify work processes. In this video, you'll learn more about how the Automation process can vary depending on three key factors: Competence, which refers to the accuracy of the process itself in terms of the tasks being completed; Latency, which generally refers to the speed at which the process completes; and Relevance, which refers to how appropriately the process covered by automation was applied.
- 4m 10s: The site reliability engineer is responsible for resolving incidents and automating operational tasks. In this video, you'll learn more about the common pitfalls of dealing with an automated system over the longer term, such as troubleshooting in an ineffective manner. You'll discover that most of these pitfalls stem from a lack of understanding of what the solution was originally designed for and how it was implemented. Watch this video to find out more.
- 5m 25s: In this video, you'll learn more about the philosophy that underpins the process of creating a Postmortem analysis. You'll learn that a postmortem is a summary of the entire lifespan of an issue. You'll discover that it focuses on the problem, what it was and how it was corrected, rather than on the cost or the resources required. The more structured the format is before you begin, the easier it will be to produce the completed report.
- 3m 51s: In this video, you'll learn more about approaches to determining the actual root cause of a failure in a site reliability engineering context. You'll learn that any proposed resolution should have mutually exclusive alternatives, meaning it should rule out one set of possibilities while ruling in others. You'll learn that one of the best approaches is to consider what's most obvious first. Explore this video to find out more.
- 1m 7s
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.