
Best Practices for Improving Redundancy in Cooling Technology

Most high-tech facilities don’t fail because they lack cooling technology; they fail because cooling does not reach the servers the way it should. When teams talk about cooling redundancy, they often focus on unit counts and extra power feeds. Those pieces matter, but in real facilities, redundancy is often lost long before equipment runs out. As a supplier of data center equipment, we see this all the time. This guide explains how high-tech facilities can utilize best practices for data center cooling redundancy by:

  • Determining how many backup systems you need
  • Removing single points of failure
  • Assessing airflow integrity
  • Considering return air temperature
  • Optimizing hot and cold aisle containment
  • Choosing the right cooling technology

Real cooling redundancy doesn’t mean oversizing your equipment; it means planning to prevent surprise downtime. To do that, follow these steps:

Determine How Many Backup Systems You Need

Many operators think there’s only one way cooling systems can be redundant. In fact, there are several redundancy models cooling technology can follow. The most common models include:

  • N Design: N means you have exactly the cooling capacity required, with no spare. If one unit fails, the remaining units can no longer carry the full load.
  • N+1 Design: You have one extra cooling unit beyond what is needed. If one unit fails, the backup takes over.
  • 2N Design: You have two complete cooling systems. If one system fails, the second runs fully.
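As a rough illustration of how the three models differ, the sketch below counts installed units and checks whether the load is still covered after one unit stops. The load and per-unit capacity figures are made-up examples, and the function names are hypothetical; real sizing also has to account for airflow, power, and space.

```python
import math

def units_required(load_kw: float, unit_capacity_kw: float) -> int:
    """N: the smallest number of units whose combined capacity covers the load."""
    return math.ceil(load_kw / unit_capacity_kw)

def installed_units(n: int, model: str) -> int:
    """Installed unit count for a redundancy model, given the base requirement N."""
    if model == "N":
        return n
    if model == "N+1":
        return n + 1
    if model == "2N":
        return 2 * n
    raise ValueError(f"unknown redundancy model: {model}")

def survives_unit_failures(load_kw: float, unit_capacity_kw: float,
                           installed: int, failed: int = 1) -> bool:
    """True if the remaining units still cover the load after `failed` units stop."""
    remaining = max(installed - failed, 0)
    return remaining * unit_capacity_kw >= load_kw

# Example figures (assumptions): 400 kW of IT load, 120 kW per cooling unit.
load_kw = 400.0
unit_kw = 120.0
n = units_required(load_kw, unit_kw)

for model in ("N", "N+1", "2N"):
    count = installed_units(n, model)
    ok = survives_unit_failures(load_kw, unit_kw, count)
    print(f"{model}: {count} units, survives one failure: {ok}")
```

With these example numbers, only the N design falls short after a single unit failure.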

Choosing the right redundancy model takes careful consideration. Your facility must have enough space and power to run redundant cooling technology, and you must have the budget to purchase and maintain these redundant cooling systems. These constraints can usually be managed with careful planning, which starts by identifying the weak points in your system.

Remove Single Points of Failure

Every pump, valve, controller, and power feed matters. If one component failure stops cooling, redundancy is broken. To prevent component failures, you need to first monitor their health and address any inefficiencies as soon as they occur. Monitoring solutions can provide you with real-time insights into leak detection, environmental conditions, and more for every device connected to your cooling system. With this knowledge, you can schedule maintenance and adjust systems as needed to prevent cooling technology failure and overheating. This is especially important in facilities using air cooling systems.
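One way to look for single points of failure is to list what each cooling unit depends on and ask which single component, if lost, would leave too few units running. The sketch below does that for a hypothetical three-unit layout; the component names and the dependency map are illustrative assumptions, not a real facility.

```python
# Hypothetical dependency map: each cooling unit and the components it needs.
cooling_paths = {
    "crah_1": {"feed_a", "pump_1", "controller_main", "chiller_1"},
    "crah_2": {"feed_b", "pump_2", "controller_main", "chiller_1"},
    "crah_3": {"feed_b", "pump_2", "controller_main", "chiller_2"},
}

def single_points_of_failure(paths, required_units):
    """Return components whose failure leaves fewer than `required_units` running."""
    components = set().union(*paths.values())
    spofs = set()
    for component in components:
        # Units that do not depend on the failed component keep running.
        surviving = [unit for unit, deps in paths.items() if component not in deps]
        if len(surviving) < required_units:
            spofs.add(component)
    return sorted(spofs)

# Assume two of the three units are needed to carry the load (N+1 with N = 2).
print(single_points_of_failure(cooling_paths, required_units=2))
```

In this made-up layout, the shared controller, the shared power feed and pump, and the chiller serving two units all show up as weak points even though there are three cooling units on paper.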

Assess Airflow Integrity

Traditional cooling only works if air reaches the IT load and returns correctly. If your facility is experiencing any of the following, there’s a problem with your airflow, not your capacity:

  • Low return air temperatures despite high IT load
  • Cooling units near limits earlier than expected
  • Uneven temperature profiles
  • Little margin during single-unit failure

Data center cooling redundancy depends on airflow integrity; without it, backup cooling technology won’t operate correctly. So, when designing data center cooling redundancy, take the following steps to ensure air can move as designed:

  • Sealing penetrations
  • Validating aisle containment
  • Confirming return air paths

If you don’t know where the weakest point of your cooling system is, you can’t have stable redundancy. Not to mention, the above issues will also negatively affect the quality and temperature of the air cooling your equipment, putting all your operations at risk.
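To see how airflow losses erode margin during a single-unit failure, the sketch below compares rated capacity with usable capacity per unit. All figures are invented for illustration, and "usable" capacity is assumed to be something you have measured on site rather than a value from a datasheet.

```python
def worst_case_margin_kw(usable_kw_by_unit, it_load_kw):
    """Capacity left over the IT load after losing the largest single unit."""
    if not usable_kw_by_unit:
        raise ValueError("at least one cooling unit is required")
    capacities = usable_kw_by_unit.values()
    remaining = sum(capacities) - max(capacities)
    return remaining - it_load_kw

# Example figures (assumptions): three units rated at 100 kW each,
# delivering less in practice because of bypass and recirculation.
rated = {"unit_a": 100.0, "unit_b": 100.0, "unit_c": 100.0}
measured = {"unit_a": 95.0, "unit_b": 70.0, "unit_c": 80.0}

print(worst_case_margin_kw(rated, it_load_kw=180.0))     # positive margin on paper
print(worst_case_margin_kw(measured, it_load_kw=180.0))  # negative margin in practice
```

A negative result means the remaining units cannot carry the load after the worst single failure, even though the unit count suggests N+1.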

Consider Return Air Temperature

Modern cooling technology is built to operate efficiently at higher return air temperatures than many facilities allow. In fact, according to ASHRAE’s 2021 Equipment Thermal Guidelines for Data Processing Environments, recommended IT intake temperatures range from 64.4 to 80.6 degrees Fahrenheit. It’s important to note, though, that allowable ranges extend higher depending on equipment class. And as return air temperature increases within limits:

  • Heat transfer improves
  • Sensible cooling capacity increases
  • Compressor lift decreases
  • System efficiency improves

To fully reap these rewards, you must first measure supply and return air temperatures as well as identify bypass and recirculation areas. This will reveal where and how your airflow is struggling, so you can make adjustments that will improve your cooling systems, like implementing hot and cold aisle containment.
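The measurements described above can be turned into two quick numbers: the sensible heat a unit is actually removing, and a rough estimate of how much supply air is bypassing the IT equipment. The sketch below uses the standard sensible heat relation Q = ρ · cp · V · ΔT with approximate air properties. The bypass estimate assumes return air is a simple mix of supply air and IT exhaust air, which is a simplification, and the sample temperatures and airflow are invented.

```python
AIR_DENSITY_KG_M3 = 1.2      # approximate, near sea level at room temperature
AIR_CP_KJ_PER_KG_K = 1.005   # approximate specific heat of dry air

def sensible_heat_kw(airflow_m3_s, supply_f, return_f):
    """Sensible heat removed: Q = rho * cp * V * delta_T, with delta_T in kelvin."""
    delta_t_k = (return_f - supply_f) * 5.0 / 9.0  # Fahrenheit difference to kelvin
    return AIR_DENSITY_KG_M3 * AIR_CP_KJ_PER_KG_K * airflow_m3_s * delta_t_k

def bypass_fraction(supply_f, return_f, it_exhaust_f):
    """Estimated share of return air that never passed through IT equipment.

    Simplified mixing balance: return = b * supply + (1 - b) * exhaust.
    """
    span = it_exhaust_f - supply_f
    if span <= 0:
        raise ValueError("IT exhaust must be warmer than supply air")
    return (it_exhaust_f - return_f) / span

# Example readings (assumptions): 68 F supply, 80 F return, 95 F IT exhaust.
print(round(sensible_heat_kw(5.0, 68.0, 80.0), 1))   # heat removed at 5 m^3/s
print(round(bypass_fraction(68.0, 80.0, 95.0), 2))   # share of air bypassing the load
```

For the same airflow, a higher return temperature gives a larger ΔT and therefore more heat removed per unit, which is why reducing bypass tends to recover capacity.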

Optimize Hot and Cold Aisle Containment

A facility’s main and backup cooling technology will only work as well as its airflow system. This is where hot and cold aisle containment comes in. Whether your facility already has this design implemented or not, perform the following steps to ensure optimal airflow:

  • Map out the arrangement of server racks and cooling technology
  • Identify the main sources of heat and current airflow patterns
  • Identify specific cooling requirements for your equipment
  • Plan where server racks will be arranged in alternating hot and cold aisles
  • Determine if and where physical barriers like plastic curtains or glass panels will be used
  • Check and clean filters and ducts to ensure unobstructed airflow
  • Install temperature sensors and monitoring software

No cooling system can be tuned correctly if air does not move as designed. That’s why airflow must be optimized before backup cooling equipment is added or changed.
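Once temperature sensors are installed, their readings can be compared against the recommended intake range cited earlier in this guide (64.4 to 80.6 degrees Fahrenheit). The sketch below is a minimal example; the sensor names and readings are hypothetical, and real monitoring software will have its own data format.

```python
RECOMMENDED_INTAKE_F = (64.4, 80.6)  # recommended intake range cited in this guide

def classify_intake(readings_f, limits=RECOMMENDED_INTAKE_F):
    """Label each rack intake reading as 'below', 'ok', or 'above' the range."""
    low, high = limits
    result = {}
    for sensor, temp_f in readings_f.items():
        if temp_f < low:
            result[sensor] = "below"
        elif temp_f > high:
            result[sensor] = "above"
        else:
            result[sensor] = "ok"
    return result

# Hypothetical sensor readings in degrees Fahrenheit.
readings = {
    "rack_01_top": 79.0,
    "rack_01_bottom": 66.5,
    "rack_07_top": 84.2,
    "rack_12_bottom": 61.0,
}

for sensor, status in classify_intake(readings).items():
    print(sensor, status)
```

Readings above the range at the top of a rack often point to hot air recirculating over the rack, while readings well below the range can indicate overcooling.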

Choose the Right Cooling Technology

Different environments require different cooling solutions. Evaluating redundancy is a perfect time to determine if your cooling technology meets your current and future needs. If your current cooling system can’t keep up with your operations, it’s time to consider other options:

  • Air cooling: Works well at lower densities but requires careful airflow control. Redundancy requires extra computer room air conditioner (CRAC) or computer room air handler (CRAH) units.
  • In-row cooling: Moves cooling closer to the load and improves redundancy by limiting failure zones.
  • Liquid cooling: Handles high-density workloads and offers flexible redundancy options such as implementation at the rack, row, or facility level.
  • Hybrid cooling: Combines air and liquid cooling technology for scalable and flexible redundancy. Ideal for mixed environments.

The right cooling system will depend on your load, layout, and growth plans, so be sure to take them into consideration before making a decision.

Cooling Technology Redundancy Frequently Asked Questions

What is data center cooling redundancy?

It means having backup cooling available if part of the system fails.

Why does redundancy fail in real data centers?

Poor airflow often prevents cooling units from working as designed.

Is N+1 cooling always enough?

Only if airflow supports the full capacity of each unit.

How does airflow affect cooling capacity?

Airflow losses reduce usable capacity and redundancy.

What intake temperature does ASHRAE recommend?

ASHRAE recommends 64.4°F to 80.6°F intake for most IT equipment.

Can raising temperatures improve redundancy?

Yes, if airflow issues are corrected first.

Why do cooling units short-cycle?

Poor airflow causes unstable operating conditions.

Does more equipment guarantee redundancy?

No, redundancy depends on system performance, not unit count.

Which cooling technology supports redundancy best?

It depends on your facility. Data center cooling needs will differ from small server room requirements. Ultimately, when engineered properly, every cooling technology has its place, whether on its own or as part of a hybrid system.

When should airflow be assessed?

Before tuning controls or adding new equipment.

Looking for More Ways to Improve Data Center Cooling?

Cooling redundancy is not defined by how many units you own; it is defined by how effectively air moves through your data center. For more practical guidance on optimizing your cooling systems, sign up for our monthly newsletter.
