
What is Colocation? A Plain-English Guide for AI and ML Teams

If you are an ML engineer, AI researcher, or technical founder, you have probably heard the term "colocation" thrown around in discussions about infrastructure. Perhaps someone suggested moving your GPU workloads out of the cloud and into a "colo." Perhaps your CTO mentioned it as a cost-saving measure. But what does colocation actually mean, and why should you care?

This guide explains colocation from the ground up, in plain English, with a focus on why it matters for AI and machine learning workloads specifically. No jargon without explanation, no assumptions about prior data centre knowledge.

What Colocation Is: The Garage Analogy

Think of colocation like renting a garage for your car. You own the car (your server hardware). The garage owner provides the space, the electricity, the security system, and keeps the building maintained. You decide what car to park there, when to drive it, and how to maintain it. The garage owner does not touch your car -- they just make sure the building is safe, powered, and climate-controlled.

In data centre terms: colocation is the practice of placing your own computing equipment -- servers, storage, networking gear -- inside a professionally operated data centre facility. The colocation provider supplies the physical space, electrical power, cooling, network connectivity, and physical security. You supply the hardware and manage everything that runs on it.

The word itself comes from "co-locate" -- your equipment is located alongside other customers' equipment in the same facility, though your space is physically separated and secured from theirs.

How Colocation Works

The mechanics are straightforward:

  1. You acquire hardware: You purchase or lease the servers, storage, and networking equipment you need. For AI workloads, this typically means GPU-accelerated servers (think NVIDIA DGX systems or custom-built GPU boxes).
  2. You choose a data centre: You select a colocation provider and facility based on location, available power, cooling capabilities, and connectivity. For GPU colocation, you specifically need a facility that can handle high power density and likely liquid cooling.
  3. Your hardware is installed: Your equipment is shipped to the data centre and installed in your designated rack space. The provider's technicians (or yours) physically mount the servers, connect power, and cable the networking.
  4. You connect to the network: Cross-connects are established to link your equipment to internet service providers, cloud provider on-ramps, or private networks. This is how your data gets in and out.
  5. You manage remotely: Day-to-day, you manage your servers remotely over the network. You SSH in, you push workloads, you monitor performance. If something needs physical attention (a failed disk, a loose cable), the data centre's staff can perform "remote hands" work on your behalf.

What You Get from a Colocation Provider

Physical Space

Your hardware lives in a rack -- a standardised metal cabinet, typically 42U tall (about 2 metres). A "U" or "rack unit" is a standard height measurement (1.75 inches / 44.45mm). A single GPU server might occupy 4-8U. You can rent a full rack, a half rack, or even individual rack units, though GPU workloads generally require at least a full rack due to power and cooling demands.

Electrical Power

The data centre delivers reliable electrical power to your rack. For traditional IT, this might be 5-10kW per rack. For AI workloads, you may need 30-50kW or more per rack. Power is delivered through redundant feeds (so a single failure does not take down your equipment), backed by uninterruptible power supplies (UPS) and diesel generators for protection against grid outages.

Cooling

Servers generate enormous amounts of heat. The data centre's cooling systems remove that heat to keep your equipment operating within safe temperature ranges. For traditional loads, this means precision air conditioning systems. For high-density GPU workloads, it increasingly means liquid cooling -- water or specialised coolant circulated through or near the servers to remove heat more efficiently than air alone.

Network Connectivity

Data centres are network hubs. Good colocation facilities are "carrier-neutral," meaning multiple internet service providers and network carriers have equipment on-site. You can establish direct physical connections (cross-connects) to whichever carriers you choose. Many facilities also connect to internet exchanges and cloud provider on-ramps, giving you low-latency access to public cloud services and the broader internet.

Physical Security

Your hardware is a significant investment. Colocation facilities provide multiple layers of physical security: perimeter fencing, 24/7 security guards or monitoring, biometric access controls, mantrap entry systems, CCTV surveillance of all areas, and individual cabinet locking. Access is logged and auditable. Only authorised personnel can access your equipment.

Colocation vs Cloud vs On-Premise

Factor              | Colocation                   | Cloud                    | On-Premise
--------------------|------------------------------|--------------------------|---------------------------------
Hardware ownership  | You own it                   | Provider owns it         | You own it
Facility management | Provider manages             | Provider manages         | You manage
Cost model          | CapEx + fixed OpEx           | Variable OpEx            | Heavy CapEx + OpEx
Upfront cost        | Moderate (hardware)          | None                     | Very high (hardware + facility)
Scalability         | Moderate (weeks)             | High (minutes)           | Low (months)
Control             | Full hardware control        | Limited to VM/container  | Total control
Power/cooling       | Professional-grade           | Professional-grade       | Limited by building
Network quality     | Excellent (carrier-neutral)  | Excellent                | Limited by location
Physical security   | Enterprise-grade             | Enterprise-grade         | Varies
Best for            | Steady-state, cost-conscious | Variable, fast-scaling   | Sovereign, regulatory

For AI teams, on-premise is rarely practical. A single rack of modern GPU servers draws 30-50kW of power -- more than most office buildings can supply to a single point, and the cooling requirements are beyond what a standard server room can handle. This is why colocation exists: you get professional-grade data centre infrastructure without building your own facility.

Why AI Companies Choose Colocation

Cost Reduction at Scale

The primary driver. Cloud GPU instances cost $2-3.50 per GPU per hour. Owning equivalent hardware in colocation costs roughly $1-1.40 per GPU per hour when hardware is amortised over 3 years. For companies running dozens or hundreds of GPUs, the annual savings fund entire engineering teams. The real-world pricing depends on your specific configuration and location, but the directional economics are consistent.
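As a rough sanity check, the break-even arithmetic above can be sketched in a few lines of Python. Every input here (hardware cost per GPU, amortisation period, utilisation, facility OpEx) is a hypothetical placeholder, not a quote -- plug in your own numbers:

```python
# Illustrative comparison of cloud vs colocation cost per GPU-hour.
# All input figures below are assumptions for demonstration only.

def cloud_cost(gpu_hours: float, rate_per_gpu_hour: float) -> float:
    """Total cloud spend at a flat on-demand rate."""
    return gpu_hours * rate_per_gpu_hour

def colo_cost_per_gpu_hour(
    hardware_cost_per_gpu: float,   # upfront CapEx per GPU
    amortisation_years: float,      # straight-line amortisation period
    colo_opex_per_gpu_hour: float,  # power, space, remote hands, etc.
    utilisation: float,             # fraction of hours the GPU is busy
) -> float:
    """Effective cost per *utilised* GPU-hour in colocation."""
    utilised_hours = amortisation_years * 365 * 24 * utilisation
    return hardware_cost_per_gpu / utilised_hours + colo_opex_per_gpu_hour

# Example: $25k per GPU amortised over 3 years at 80% utilisation,
# plus $0.25/hour of facility OpEx (all hypothetical numbers).
rate = colo_cost_per_gpu_hour(25_000, 3, 0.25, 0.80)
print(f"${rate:.2f} per utilised GPU-hour")  # $1.44 per utilised GPU-hour
```

Note that utilisation matters: a colocated GPU sitting idle still accrues its amortised cost, whereas a cloud instance can simply be switched off. The economics favour colocation only when your workload is reasonably steady.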

Guaranteed Availability

Your hardware is always there. You do not compete with other customers for GPU instances that may be out of stock, face preemption on spot instances, or deal with cloud provider capacity constraints. For production inference workloads with SLAs, this reliability is critical.

Hardware Flexibility

You choose exactly what hardware to deploy. Custom GPU configurations, specific CPU-to-GPU ratios, particular storage architectures, non-standard networking topologies -- all are possible when you own the hardware. Cloud providers offer a limited menu of configurations.

Data Control

Your data stays on your hardware in a known physical location. For organisations handling sensitive data, operating in regulated industries, or subject to data residency requirements, this control simplifies compliance significantly.

What to Look For in a Provider

If you are considering colocation for AI workloads, the critical criteria are: power density (can the facility deliver 30-50kW or more per rack?), cooling (is liquid cooling available for high-density GPU deployments?), connectivity (is the facility carrier-neutral, with access to internet exchanges and cloud on-ramps?), support (what remote hands coverage is included?), and physical security and compliance (access controls, audit logs, and any certifications your data residency obligations require).

Common Colocation Terms: A Quick Glossary

Rack Unit (U)

The standard unit of vertical space in a server rack. One rack unit equals 1.75 inches (44.45mm). A standard rack is 42U tall. GPU servers typically occupy 4U-8U each, so a full rack might hold 5-10 GPU servers depending on the form factor.
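The height arithmetic above is simple enough to sketch. The 42U rack and 4U-8U server figures come from the definition; reserving 2U for switches and PDUs is an assumption:

```python
# How many GPU servers fit in a standard 42U rack by height alone?
# (In practice, power and cooling limits often bind before height does.)

RACK_HEIGHT_U = 42

def servers_per_rack(server_height_u: int, reserved_u: int = 2) -> int:
    """Servers that fit after reserving space for switches/PDUs."""
    return (RACK_HEIGHT_U - reserved_u) // server_height_u

print(servers_per_rack(4))  # 10
print(servers_per_rack(8))  # 5
```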

Cross-Connect

A physical cable connection between your equipment and another party's equipment within the same data centre -- typically a network carrier, cloud provider, or internet exchange. Cross-connects have monthly recurring fees (usually £150-300 per connection).

Remote Hands

A service where the data centre's on-site technicians perform physical tasks on your behalf: rebooting servers, swapping components, checking cable connections, photographing equipment. Typically charged per incident or per hour, with some providers including a basic allowance in the monthly fee.

PUE (Power Usage Effectiveness)

A ratio that measures how efficiently a data centre uses energy. A PUE of 1.0 would mean every watt of electricity goes directly to computing (physically impossible). A PUE of 1.5 means for every 1.5 watts consumed, 1 watt powers your servers and 0.5 watts go to cooling and overhead. Lower is better. Good modern facilities target PUE of 1.2-1.3. Liquid-cooled AI facilities can achieve 1.1-1.2.
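Since PUE is just a ratio, a short sketch makes the definition concrete. The 40kW rack and PUE of 1.2 echo the figures used elsewhere in this guide:

```python
# PUE = total facility power / IT equipment power. Same units throughout (kW).

def pue(total_facility_power: float, it_power: float) -> float:
    """Power Usage Effectiveness of a facility."""
    return total_facility_power / it_power

def overhead_power(it_power: float, pue_value: float) -> float:
    """Power consumed by cooling and overhead for a given IT load and PUE."""
    return it_power * (pue_value - 1)

# A 40 kW GPU rack in a facility with PUE 1.2 draws 48 kW in total,
# of which roughly 8 kW goes to cooling and overhead.
print(pue(48, 40))                       # 1.2
print(round(overhead_power(40, 1.2), 2)) # 8.0
```

This is also why PUE affects your bill: many providers charge for total power drawn, so a lower PUE means less of your spend goes to overhead.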

N+1 Redundancy

An infrastructure design pattern where one additional unit of capacity (power supply, cooling unit, network path) exists beyond the minimum required. If your rack needs three power feeds, N+1 means four are provided so that a single failure does not affect service. More robust designs use 2N (fully duplicated) or 2N+1 redundancy.

Understanding these terms helps you evaluate providers, read contracts, and communicate effectively with data centre teams. For more detail on how AI workloads specifically change the colocation equation, see our guide to colocation hosting for AI workloads.

See how much you could save

Upload your cloud bill to our free AI-powered audit tool and get a detailed cost breakdown in minutes.


New to Colocation? We Make It Simple

ColoGPU is the specialist colocation broker for AI companies. We match you with verified providers at no cost.
