Move over Reliability, Resilience has arrived

[This article was originally written as a guest post for Puppet Labs and published at their blog on January 9th, 2014.]

If you haven’t yet noticed that the prioritization of non-functional requirements (NFRs) is changing among your user base, you will soon. For decades, we have held to the same familiar set of NFRs. Every team has its own definition and particular spin on NFRs, but the usual suspects are accessibility, availability, extensibility, interoperability, maintainability, performance, reliability, scalability, security, and usability.

But new priorities have surfaced, as IT has experienced a sea change over the past few years. Some organizations have even adopted completely new NFRs. The rise of DevOps has coincided with these changes, and the movement’s principles enable IT teams to more readily adapt to rapidly changing requirements.

Your grandfather’s mainframe was very reliable

Historically, IT system designs were praised for reliability. Robust and stable systems could “take a licking and keep on ticking.” As computing became more pervasive, scalability became the watchword. Systems should be able to grow and expand to meet increasing demands.

Scalability as an NFR priority represents just a slight shift from reliability. Both operate from the mindset that the original system design is valid. Reliability ensures that the system continues to provide the stated functionality over time, and scalability ensures that it can do so under increasing demand.

Roughly 10 years ago, things began to shift as more and more organizations embraced agile methods such as Extreme Programming (XP), and architectural models like Service-Oriented Architecture (SOA). These initiatives promoted adaptation and response to change as desirable system qualities. Next, cloud computing introduced us to the notion of elasticity, further promoting the values of flexibility and responsiveness to change.

A resilient system is a happy system

The state of the art for system design is always evolving, and we see noticeable leaps forward every few years. The current phase of evolution is toward resilient systems.

Legacy system designs relied upon expensive infrastructure with multiple-redundant-hot-swappable-live-backup-standby-continuity-generators (or whatever vendors are peddling lately). In contrast, resilient system designs embrace failure and promote the use of cheap, commodity hardware, coupled with distributed data management, parallel processing, eventual consistency, and self-healing operational nodes.

Some portion of your system is likely to go down at some point, and resilient systems are designed with that expectation. Resilient systems and resilient processes are able to continue operation (albeit at diminished capacity) in the face of failure.
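To make "continue operation at diminished capacity" concrete, here is a minimal Python sketch of graceful degradation. It is an illustration only, not taken from any particular product: the function names (`fetch_recommendations`, `healthy_service`, `broken_service`) and the `POPULAR_TITLES` fallback list are hypothetical. The idea is that when a dependency fails, the caller still gets an answer, just a less tailored one, along with a flag saying the answer is degraded.

```python
from typing import Callable, List, Tuple


def fetch_recommendations(
    primary: Callable[[], List[str]],
    fallback: List[str],
) -> Tuple[List[str], bool]:
    """Return (items, degraded).

    Try the primary dependency first. If it raises, serve the static
    fallback list instead of failing the whole request.
    """
    try:
        return primary(), False
    except Exception:
        # The dependency is down: keep serving, but with a reduced answer.
        return list(fallback), True


# Hypothetical stand-ins for a remote service, used only for illustration.
def healthy_service() -> List[str]:
    return ["personalized-1", "personalized-2"]


def broken_service() -> List[str]:
    raise ConnectionError("recommendation service unreachable")


POPULAR_TITLES = ["popular-1", "popular-2"]

if __name__ == "__main__":
    print(fetch_recommendations(healthy_service, POPULAR_TITLES))
    print(fetch_recommendations(broken_service, POPULAR_TITLES))
```

A real system would typically catch narrower exception types and log the failure. The broad `except` here just keeps the sketch short.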

The prioritization of resilience over reliability as an NFR can be seen within the DevOps movement, the development of the Netflix Simian Army, and the rise of NoSQL data management solutions.

DevOps and resiliency

DevOps is a multi-headed beast, more a movement guided by a set of principles than a tangible and well-defined construct. While organizations are free to adopt aspects of DevOps that suit their needs, one common thread is that of resilience. Failure is seen as an opportunity to improve processes and communication, rather than as a threat.

The principles of continuous integration and continuous delivery that are core to most DevOps practices exemplify a resilient mindset. Where the classic waterfall model relies upon detailed front-end design and planning with an all-or-nothing development phase and late-stage testing, DevOps teams are more agile, embracing a “fail early, fail often” model. This approach results in more resilient and adaptable applications.

Netflix Simian Army

Netflix gained world renown when the company broadcast details of its Simian Army work in 2010 and 2011. Chaos Monkey, Chaos Gorilla, and a slew of similar utilities automatically and deliberately induce failures so that teams can develop more resilient processes, tools, and capabilities.

John Ciancutti of Netflix writes, “If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most — in the event of an unexpected outage.”
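The idea of testing "our ability to succeed despite failure" can be sketched in a few lines of Python. This is a toy model and not Netflix's implementation: the `Cluster` class and the `kill_random_node` function are hypothetical names invented for this example. One node is taken down at random, and the check is simply that requests are still served by the survivors.

```python
import random
from typing import Dict, List


class Cluster:
    """A toy cluster that tracks which nodes are alive (illustration only)."""

    def __init__(self, names: List[str]) -> None:
        self.alive: Dict[str, bool] = {name: True for name in names}

    def handle(self, request: str) -> str:
        # Route to the first live node; fail only if every node is down.
        for name, up in self.alive.items():
            if up:
                return f"{name} handled {request}"
        raise RuntimeError("no live nodes")


def kill_random_node(cluster: Cluster, rng: random.Random) -> str:
    """Induce a failure by marking one randomly chosen live node as down."""
    live = [name for name, up in cluster.alive.items() if up]
    victim = rng.choice(live)  # raises IndexError if no node is left alive
    cluster.alive[victim] = False
    return victim


if __name__ == "__main__":
    cluster = Cluster(["node-a", "node-b", "node-c"])
    victim = kill_random_node(cluster, random.Random())
    print("killed:", victim)
    print(cluster.handle("req-1"))  # still served by a surviving node
```

Running the kill step routinely, rather than waiting for a real outage, is what turns "we think we can survive a node loss" into something that is actually exercised.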

NoSQL

A third illustration of the growing fascination with resilient, self-healing systems is the transformation now going on in the data realm. Data and metadata management have evolved considerably from the relational databases of yore. Modern data management strategies tend to be distributed, fault-tolerant, and in some cases even self-healing, spawning new nodes as needed. Examples include Google FS / Bigtable, in-memory datastores like Hazelcast or SAP’s HANA, and distributed data management solutions like Apache Cassandra.

Miko Matsumura of Hazelcast notes, “Virtualization and scale-out power new ways of thinking about system stability, including a shift away from ‘reliability,’ where giant expensive systems never fail (until they do, catastrophically), and towards ‘resiliency,’ where thousands of inexpensive systems constantly fail—but in ways that don’t materially impact running applications.”
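As a rough sketch of what "fault-tolerant" and "self-healing" mean for a data store, the Python toy below keeps every key on several in-memory replicas, keeps answering reads when one replica is lost, and rebuilds the lost replica from a surviving peer. It does not reflect how Cassandra, Hazelcast, or Bigtable work internally. The `ReplicatedStore` class and its methods are hypothetical, and replication is synchronous for simplicity, whereas real systems often replicate asynchronously and settle for eventual consistency.

```python
from typing import Dict, List, Optional


class ReplicatedStore:
    """A toy key-value store with N in-memory replicas (illustration only)."""

    def __init__(self, replica_count: int) -> None:
        # A replica is a dict while healthy and None once it has failed.
        self.replicas: List[Optional[Dict[str, str]]] = [
            {} for _ in range(replica_count)
        ]

    def put(self, key: str, value: str) -> None:
        # Write to every live replica.
        for replica in self.replicas:
            if replica is not None:
                replica[key] = value

    def get(self, key: str) -> str:
        # Read from the first live replica that holds the key.
        for replica in self.replicas:
            if replica is not None and key in replica:
                return replica[key]
        raise KeyError(key)

    def fail(self, index: int) -> None:
        """Simulate losing one replica and all the data it held."""
        self.replicas[index] = None

    def heal(self) -> int:
        """Replace each failed replica with a copy of a live peer."""
        source = next((r for r in self.replicas if r is not None), None)
        if source is None:
            raise RuntimeError("all replicas lost; nothing to copy from")
        healed = 0
        for i, replica in enumerate(self.replicas):
            if replica is None:
                self.replicas[i] = dict(source)
                healed += 1
        return healed


if __name__ == "__main__":
    store = ReplicatedStore(3)
    store.put("user:1", "alice")
    store.fail(0)
    print(store.get("user:1"))  # still readable from a surviving replica
    print(store.heal())         # number of replicas rebuilt
```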

Keeping pace with the cool kids

It’s often said that the only constant is change. The DevOps movement positions organizations to embrace change, rather than fear it. Continuous integration, continuous delivery, and continuous feedback loops between dev teams and ops teams facilitate an enhanced degree of agility and responsiveness.

As business and society evolve, our system design priorities must adapt in parallel. The cool kids will change the game again at some point, but for right now, “change” means designing systems and supporting processes that are responsive and adaptable by prioritizing resilience over reliability.

About the author, Kyle Gabhart

Kyle Gabhart is a subject matter expert specializing in strategic planning and tactical delivery of enterprise technology solutions, blending EA, BPM, SOA, Cloud Computing, and other emerging technologies. Kyle currently serves as a director for Web Age Solutions, a premier provider of technology education and mentoring. Since 2001 he has contributed extensively to the IT community as an author, speaker, consultant, and open source contributor.
