Topic: sre

Report: Cloud application monitoring remains the biggest challenge for SREs

Just over half of SREs (53%) said that the number one cloud application monitoring challenge is unified visibility across the stack. Organizations are looking toward AI and machine learning to solve these problems, but adoption of AIOps is slow. This is according to the 2021 SRE Report that was conducted by the digital experience monitoring … continue reading

Report: digital transformation initiatives are causing increased adoption of SRE in ITOps

A new study found that the increased need for process automation and SREs has been fueled by companies’ increase in digital transformation initiatives as well as remote and hybrid work policies.  Industries have seen a 90% increase in customer-affecting issues and 68% of businesses reported an increased cost of downtime since the pandemic began.  Now, … continue reading

xMatters announces new adaptive incident management features

xMatters’ new adaptive incident management feature advancements provide increased automation across each stage of the incident management lifecycle – diagnosis and collaboration, resolution and post-incident learning.  An increase in the number of change-related incidents and the furious speed of new software releases demand more automation be applied across the incident management lifecycle to accelerate actions.  … continue reading

Puppet announces Relay for cloud-native, event-driven automation

Puppet introduced the public beta of Relay, an event-driven automation platform that automates across any cloud infrastructure, tools and APIs that developers, DevOps engineers, and SREs currently manage manually. “Without a way to manage and automate the flood of events and hundreds of APIs developers use – time, money and mental capital are being … continue reading

Red Hat: Culture change and automation are necessary to create DevOps and SRE teams

In order to bring more effective operational practices, DevOps and site reliability engineering (SRE) teams need to go through a culture change within the organization. Red Hat held its virtual summit this week where it talked about how to reinvent IT Ops as SRE.  According to the company, change can happen by automating processes and … continue reading

ITOps Times Open-Source Project of the Week: Kubei

Portshift has announced the release of Kubei, an open-source runtime vulnerability scanner for Kubernetes. According to the company, while plenty of options already exist, scanners differ in the number of feeds they consume, the updates they produce and the information they provide. ‘All tools, however, require some … continue reading

Google details how to apply SRE to monolith applications

Microservices are taking the software industry by storm, but that doesn’t mean monolithic applications are becoming extinct. While SRE is more commonly associated with modern architectures, Google is providing some insight into how enterprises can use SRE to manage their monoliths.  “When and why to choose monolithic architecture is usually a matter of what works … continue reading

Transitioning to SRE

Over the years, many new methodologies have aimed to help organizations manage their technology more efficiently, whether that means improving the efficiency of programmers or of the operators who manage a company’s technology infrastructure. DevOps, which sought to bring developers and operators together, is one such example, and one … continue reading

ITOps 2020 predictions from around the industry

Tim Armandpour, SVP of engineering at PagerDuty: Forget reliability. With the adoption of resilience engineering and the proper use of automation, operators can expect a 20% reduction in unplanned work. Today’s organizations are fixated on the reliability of their technology. But any developer can tell you that the reality is not if it will … continue reading

DevOps Institute announces a new course for site reliability engineering

The DevOps Institute today announced a Site Reliability Engineering (SRE) Foundation certification that will be available to registered education partners starting in January of next year. While there is no prerequisite to take the examination, a training course through an accredited partner is required.  According to the company, the certification content includes practical advice, related … continue reading

Making site reliability Blameless

With site reliability becoming more important as software releases grow in frequency and complexity, a startup called Blameless today released an SRE platform that can handle the increasing velocity of code deployments while offering faster, more efficient incident resolution. Ashar Rizqi, CEO of Blameless, said the company’s vision is to enable any modern software business … continue reading

Google introduces Stackdriver IRM for Site Reliability Engineering

Google announced a new Site Reliability Engineering-inspired tool for investigating, understanding, mitigating and recovering from incidents quickly and efficiently. Stackdriver Incident Response and Management (IRM) on Google Cloud Platform is available as an alpha version and features new monitoring tools for SRE journeys. After facing availability and reliability challenges, Google created SRE and SRE principles … continue reading
