May 24, 2026

Designing a ten-phone Android farm: the constraints we're optimizing for

We are scaling DeviceRent to ten phones in a single rack. This is the design notebook — every decision, every constraint, and why it had to be a Raspberry Pi 5 controller.

DeviceRent's private beta runs on a small handful of phones. The next chunk of work is a single rack with ten devices: six Pixel 7a, four Galaxy S23, all controlled from one Raspberry Pi 5. This post is the design notebook — the constraints we are working against and the choices we have already made.

The constraints

Before any architecture, the boundary conditions.

Power. Ten phones, charging and discharging continuously, plus a Pi, plus a powered USB hub, plus an active network drop. We need a circuit that does not trip when every phone happens to be at 20% at the same moment.
Thermal. Phones throttle when their SoC hits ~70°C. A phone wedged into a rack with no airflow will hit that within an hour of any sustained workload. We need passive airflow good enough that we never have to think about it.
USB stability. USB is the single largest source of failure in any device farm. Cables go bad, hubs brown out under load, and Android's USB stack drops the connection if voltage sags below ~4.5V. We have decided to over-spec the hub by 2x and to use individually replaceable braided cables, on the theory that $4 per cable is the cheapest fix available.
Network. Ten phones plus the Pi plus an OOB management interface is twelve IPs on one switch port. We are giving them a VLAN of their own, both for ARP hygiene and so that a misbehaving phone cannot poison the rest of the LAN.
Wipe-between-sessions. Every rental ends with a hard wipe. Not a "log out, clear app data" wipe — a factory-reset wipe. The phones are enrolled in Android Enterprise device-owner mode, which gives a DPC app the right to call wipeData programmatically. The wipe takes about 90 seconds. Customers do not see the wipe; we just take the device offline for two minutes between sessions.
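
To make the power and hub constraints concrete, here is a back-of-the-envelope sketch in Go. Every wattage in it is an illustrative assumption rather than a measurement from the rack, apart from the 27 W rating of the official Pi 5 supply; the function names are hypothetical. It only shows the arithmetic: sum the worst case where every phone charges at peak at once, then apply the 2x hub headroom rule.

```go
package main

import "fmt"

// Illustrative assumptions, not measurements from the rack.
const (
	phoneCount       = 10
	phonePeakWatts   = 18.0 // assumed worst-case charging draw per phone
	piPeakWatts      = 27.0 // rating of the official Pi 5 power supply
	hubOverheadWatts = 5.0  // assumed draw of the hub's own electronics
	hubHeadroom      = 2.0  // the "over-spec by 2x" rule
)

// worstCaseWatts returns the total draw if every phone charges at peak
// at the same moment, plus the Pi and the hub overhead.
func worstCaseWatts(phones int, perPhone, pi, hub float64) float64 {
	return float64(phones)*perPhone + pi + hub
}

// requiredHubWatts returns the hub supply rating needed to carry every
// phone at peak, multiplied by the headroom factor.
func requiredHubWatts(phones int, perPhone, headroom float64) float64 {
	return float64(phones) * perPhone * headroom
}

func main() {
	total := worstCaseWatts(phoneCount, phonePeakWatts, piPeakWatts, hubOverheadWatts)
	hub := requiredHubWatts(phoneCount, phonePeakWatts, hubHeadroom)
	fmt.Printf("worst-case rack draw: %.0f W\n", total)
	fmt.Printf("hub supply target:    %.0f W\n", hub)
}
```

Swapping in measured per-phone figures would turn this into a real budget; the structure of the calculation stays the same.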

Why the Pi 5

We looked at three options for the controller:

Option	Pros	Cons
x86 mini-PC	Faster CPU, more RAM, runs anything	$200–400, power-hungry, overkill
Pi 4	Cheap, plenty of GPIO	All USB ports share one controller on a single PCIe lane, a bottleneck
Pi 5	Cheap, USB 3 on its own controller, PCIe lane available	Newer, slightly fewer third-party hats

The Pi 5 won on one specific point: USB bandwidth. On a Pi 4, every USB port shares one controller behind a single PCIe lane, so sustained throughput across multiple devices saturates the shared link and ADB sessions get jittery. On the Pi 5, USB is handled by the separate RP1 I/O chip and both USB 3 ports can run at 5 Gbps simultaneously, so we expect to talk to ten phones at once without the controller becoming the bottleneck.

We also liked that the Pi 5 has a PCIe lane. We are not using it in the first build, but the option to add NVMe later for logging and snapshotting is worth keeping.

What runs on the Pi

The Pi runs three processes:

Tailscale. This is the only thing that exposes the rack to anything outside the local switch. ADB ports are bound to localhost on the Pi and reached through the tailnet only. No phone is ever directly addressable on the public internet.
An agent. A small Go binary that talks to our coordinator over WebSocket. It receives commands ("wipe device 4", "list connected phones", "restart adbd on device 7") and reports heartbeats. The agent is the only thing that can touch the phones, which keeps the trust boundary tiny.
A scrcpy bridge. Optional, lazy-started. When a customer wants screen mirroring in the browser, the agent spawns a scrcpy process that pipes the frames over the WebSocket. Most customers will never use this — they want the shell, not the screen — but having it is part of the product.

That is it. No Docker, no Kubernetes, no service mesh. The Pi is a bastion host with one job.
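
As a rough illustration of how small the agent's command surface can be, here is a sketch of an allowlist dispatcher in Go. The JSON message shape, the action names, the 1 to 10 slot range, and the `Dispatch` function are all assumptions made for this example; the post does not specify a wire format. The WebSocket transport is left out so the sketch stays standard-library only, and the handlers just return a description instead of touching a phone.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Command is a hypothetical wire format for coordinator messages.
type Command struct {
	Action string `json:"action"`           // e.g. "wipe", "list", "restart_adbd"
	Device int    `json:"device,omitempty"` // rack slot, assumed to be 1..10
}

var errUnknownAction = errors.New("unknown action")

// handlers is the allowlist: any action missing from this map is rejected.
// Real handlers would shell out to adb or call the DPC; these only describe.
var handlers = map[string]func(Command) (string, error){
	"list": func(Command) (string, error) {
		return "listing connected phones", nil
	},
	"wipe": func(c Command) (string, error) {
		return fmt.Sprintf("wipe requested for device %d", c.Device), nil
	},
	"restart_adbd": func(c Command) (string, error) {
		return fmt.Sprintf("restarting adbd on device %d", c.Device), nil
	},
}

// Dispatch decodes one raw message and routes it to its handler.
func Dispatch(raw []byte) (string, error) {
	var c Command
	if err := json.Unmarshal(raw, &c); err != nil {
		return "", fmt.Errorf("decode command: %w", err)
	}
	h, ok := handlers[c.Action]
	if !ok {
		return "", fmt.Errorf("%w: %q", errUnknownAction, c.Action)
	}
	// Every action except "list" targets a single slot in the rack.
	if c.Action != "list" && (c.Device < 1 || c.Device > 10) {
		return "", fmt.Errorf("device %d out of range", c.Device)
	}
	return h(c)
}

func main() {
	out, err := Dispatch([]byte(`{"action":"wipe","device":4}`))
	fmt.Println(out, err)
}
```

The point of the map-based allowlist is that adding a capability is an explicit code change, which fits the goal of keeping the trust boundary small.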

The wipe-between-sessions detail

This is the part that took the most thinking. The naive approach is "trigger a factory reset, wait 90 seconds, mark the device available." The problem is that during the 90 seconds the phone is offline, and if anything goes wrong in the reset path — Android's wipe occasionally hangs at 1% for reasons that nobody on the open web has been able to explain — you do not find out until a customer tries to rent the device and gets an error.

So we are building the wipe path as a state machine, not as fire-and-forget:

in_use → wipe_requested → wiping → verifying → available
                              ↓
                          wipe_failed → needs_intervention

The verifying step boots the phone, waits for ADB to come back, and confirms a known marker file is gone. If anything is off, the device goes into needs_intervention and we get paged. Customers never see a half-wiped phone.
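
The diagram and the verifying step can be sketched as a small transition table in Go. This is a minimal illustration, not the production agent: the `Device` type, the `Advance` and `Verify` methods, and the injected `markerGone` check are hypothetical names. It reads the diagram as "wiping can fail into wipe_failed" and the paragraph above as "a failed verification goes straight to needs_intervention". The `available → in_use` edge is an added assumption for the next rental.

```go
package main

import "fmt"

// State names mirror the diagram above.
type State string

const (
	InUse             State = "in_use"
	WipeRequested     State = "wipe_requested"
	Wiping            State = "wiping"
	Verifying         State = "verifying"
	Available         State = "available"
	WipeFailed        State = "wipe_failed"
	NeedsIntervention State = "needs_intervention"
)

// transitions lists the only legal moves; anything else is an error.
var transitions = map[State][]State{
	InUse:         {WipeRequested},
	WipeRequested: {Wiping},
	Wiping:        {Verifying, WipeFailed},
	Verifying:     {Available, NeedsIntervention},
	WipeFailed:    {NeedsIntervention},
	Available:     {InUse}, // assumed: the next rental starts
}

// Device tracks one phone's position in the wipe path.
type Device struct {
	Slot  int
	State State
}

// Advance moves the device to the target state if the table allows it.
func (d *Device) Advance(to State) error {
	for _, next := range transitions[d.State] {
		if next == to {
			d.State = to
			return nil
		}
	}
	return fmt.Errorf("device %d: illegal transition %s -> %s", d.Slot, d.State, to)
}

// Verify runs the post-wipe check. markerGone stands in for "boot, wait
// for ADB, confirm the marker file is absent"; any error or a surviving
// marker sends the device to needs_intervention instead of available.
func (d *Device) Verify(markerGone func() (bool, error)) error {
	gone, err := markerGone()
	if err != nil || !gone {
		return d.Advance(NeedsIntervention)
	}
	return d.Advance(Available)
}

func main() {
	d := &Device{Slot: 4, State: InUse}
	for _, s := range []State{WipeRequested, Wiping, Verifying} {
		if err := d.Advance(s); err != nil {
			fmt.Println(err)
			return
		}
	}
	// Pretend the marker check succeeded; a real check would query the phone.
	if err := d.Verify(func() (bool, error) { return true, nil }); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(d.State)
}
```

Because the device is only marked available from inside `Verify`, there is no code path that skips the check, and a hung wipe surfaces as a device stuck in `wiping` rather than as a customer-facing error.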

What is not in scope yet

We are explicitly not building any of this on the first rack:

iOS support. Different protocol, different cabling, a whole different conversation.
A second rack in a different geography. We are one location until the first rack is fully utilized.
Customer-installed certificates, customer VPN ingress, customer-uploaded AVDs. Each has been asked for at some point. None of them is more important than just having more phones available right now.

What is next

We will write more posts as the rack comes up: the cabling photos, the thermal numbers under sustained load, the failure modes the wipe state machine has caught. Building in public is the cheapest marketing we know how to do, and it forces us to make decisions we can defend.

If you want to be on a device when the rack goes live, the waitlist is open.
