Rethinking N+1 redundancy design guidelines in the public utility industry

How often do we challenge long-accepted design practices, especially in utility industries such as power, water and wastewater? To do so, we must look outside public utilities and apply lessons from other industries. The airline and auto industries are two that rely less on redundancy, focusing instead on increasing the reliability of the primary functions and managing the consequences and associated risk of failure.

How the airline industry focused on increasing reliability

The airline industry is traditionally held up as the leader in reliability thinking because of its initial application of Reliability Centered Maintenance (RCM), but it has also been at the forefront in design. At one time, cross-Atlantic flights required airplanes with at least four engines. With the increase in the reliability of jet engines, in 1984 it became the norm to fly over the Atlantic with only two engines. How did this happen? The airline industry focused on increasing the reliability of the primary functions (making the protected function more reliable) instead of counting on backup systems. The reductions in weight, fuel consumption and maintenance cost led to enormous savings for the airline industry.
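The trade-off between adding a backup and making the primary unit more reliable can be sketched with a simple calculation. The Python example below is only an illustration: it assumes identical units that fail independently, perfect switchover to the standby, and made-up reliability figures (0.99 and 0.9999) that do not come from airline or utility data. The function name `k_out_of_n_reliability` is hypothetical.

```python
from math import comb


def k_out_of_n_reliability(k: int, n: int, r: float) -> float:
    """Probability that at least k of n identical, independent units work.

    r is the reliability of one unit over the period of interest.
    """
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


# Illustrative (assumed) per-unit reliabilities, not measured data.
standard_unit = 0.99
improved_unit = 0.9999

# N+1: one unit is needed, two are installed (duty plus standby).
n_plus_1 = k_out_of_n_reliability(k=1, n=2, r=standard_unit)

# N+0: one unit is needed, one more reliable unit is installed.
n_plus_0 = k_out_of_n_reliability(k=1, n=1, r=improved_unit)

print(f"N+1 with standard units: {n_plus_1:.6f}")
print(f"N+0 with improved unit:  {n_plus_0:.6f}")
```

Under these assumptions both designs come out at about 0.9999: a single unit matches the N+1 pair once its probability of failure drops to the square of the standard unit's. Real systems also face common-cause failures and imperfect switchover, which this sketch ignores and which tend to make the redundant design look better on paper than it is in service.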

The auto industry continues to rethink N+1

Another example is the spare tire in an automobile. For many years, cars carried a full-size spare in case there was ever a flat. Car designers thought of unique ways of storing this tire, including in the trunk, outside the back door, and in the undercarriage.

Recently, some designers began to question the need for the full spare. How many people ever check their spare anyway? How many people even know how to change a tire anymore? A full spare weighs a lot, takes up space and can’t be used for long, since its mileage doesn’t match the mileage of the other tires.

Some cars began offering the “Donut Spare” (N+1/2), which is intended to replace the functionality of the spare, but only for a short time and at reduced speeds, until the main tire can be replaced or repaired. Some cars were fitted with a tube of sealant to temporarily repair the flat tire so the driver can reach a safe place to have it repaired. Now, some car companies have eliminated the spare entirely (N+0) by increasing the reliability of the primary tire with “Run Flat” tires, which let you drive 50 miles on a flat at reduced speeds.

In both industries, the reliability of the primary functions has been increased to levels where people are comfortable tolerating the risk associated with flying or driving without additional redundancy.

Applying these lessons to the utility industry

Why can’t these lessons be applied to utility industries such as water, electric and wastewater? The focus would shift to increasing the reliability of the primary functions and managing the consequences and associated risk of failure, instead of always avoiding risk by relying on redundancy.

Utilities have to adopt more modern thinking and work with the communities they serve to manage risk and agree on different service level expectations. The airline industry is achieving huge savings by changing its N+1 thinking while increasing reliability and reducing risk.

Richard W. Taylor, a Boeing vice president and a former test pilot, once said: “There have been 30 years of progress since the current rules were written, you no longer have to have an airport in the shadow of your wing tip every place you fly.”

If the airline industry can achieve this, every industry with lesser consequences of failure should be able to move in the same direction with much less fear of catastrophic failures. Managing risk rather than avoiding it allows for leaner designs, saving money and valuable resources.
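One way to make "managing risk" concrete is to compare design options by expected annual cost, where risk is taken as the probability of losing the function multiplied by the cost of the consequence. The Python sketch below is illustrative only: the `DesignOption` class, its field names and every number in it are assumptions made for the example, not figures from any utility.

```python
from dataclasses import dataclass


@dataclass
class DesignOption:
    name: str
    annualized_capital_cost: float     # yearly cost of owning the equipment
    annual_maintenance_cost: float     # yearly cost of maintaining it
    annual_failure_probability: float  # chance of losing the function in a year
    consequence_cost: float            # cost incurred if the function is lost

    def annual_risk(self) -> float:
        """Risk expressed as probability times consequence."""
        return self.annual_failure_probability * self.consequence_cost

    def expected_annual_cost(self) -> float:
        """Ownership cost plus the risk carried by the design."""
        return (
            self.annualized_capital_cost
            + self.annual_maintenance_cost
            + self.annual_risk()
        )


# Hypothetical options for one pumping duty; all values are assumed.
options = [
    # Two standard pumps: very low failure probability, unmitigated consequence.
    DesignOption("N+1, standard pumps", 20_000, 8_000, 0.0001, 500_000),
    # One more reliable pump plus a contingency plan that limits the consequence.
    DesignOption("N+0, improved pump", 13_000, 5_000, 0.001, 200_000),
]

for option in options:
    print(
        f"{option.name}: risk {option.annual_risk():,.0f}, "
        f"expected annual cost {option.expected_annual_cost():,.0f}"
    )

cheapest = min(options, key=DesignOption.expected_annual_cost)
print(f"Lowest expected annual cost: {cheapest.name}")
```

With these assumed figures the leaner design carries more risk but still has the lower expected annual cost. Different inputs can reverse that result, and whether the remaining risk is tolerable is a decision for the utility and the community it serves, not something a cost figure alone settles.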
