Real-Time Tuning Reference
ServoBox implements a comprehensive real-time optimization stack across both host and guest systems. This page documents every tuning measure applied to achieve deterministic, low-latency performance.
Overview
ServoBox achieves real-time performance through a layered approach:
- Host-level isolation - Dedicate CPU cores and control interrupt routing
- VM configuration - Optimize KVM/QEMU for deterministic execution
- Guest optimization - PREEMPT_RT kernel with minimal jitter sources
- Verification tools - Built-in testing and diagnostics
All optimizations are applied automatically during servobox init and servobox start.
Performance Modes
ServoBox offers three RT performance modes (selected with servobox start):
| Mode | Latency Target | Power Usage | Command |
|---|---|---|---|
| Balanced (default) | avg: ~4μs, max: ~100-120μs | Normal | servobox start |
| Performance | avg: ~3μs, max: ~100μs (fewer spikes) | +20-30W | servobox start --performance |
| Extreme | avg: ~3μs, max: ~100μs (rare spikes) | +40-60W (high) | servobox start --extreme |
Mode details:
- Balanced: Performance CPU governor with dynamic frequency scaling. Recommended for most robotics control loops: excellent latency with normal power consumption. Achieves the VM's fundamental latency floor.
- Performance: Locks CPU frequencies to maximum, eliminating frequency-transition jitter. Does not significantly reduce max latency, but makes 100μs+ spikes less frequent. Use when your application needs tighter timing guarantees (99.99% vs 99.9%).
- Extreme: Additionally disables Turbo Boost for maximum determinism. In testing, it shows no measurable improvement over Performance mode. Use only for experimentation or if you need absolutely predictable behavior.
VM Latency Ceiling
The ~100-120μs max latency represents the fundamental limit of running RT workloads in a VM. These spikes come from:
- Hypervisor overhead (QEMU/KVM context switches, memory management)
- Hardware interrupts (SMIs, IRQs bleeding through isolation)
- Memory subsystem (cache misses, TLB flushes)
Frequency locking (Performance/Extreme modes) can make spikes less frequent but cannot eliminate them. For max latencies below 50μs, bare-metal RT Linux is required.
Switching Modes
You can change modes by stopping and restarting the VM with a different flag. The mode only affects host CPU tuning, not the VM image.
Host System Configuration
CPU Isolation via Kernel Parameters
What it does:
Removes CPU cores from the Linux scheduler and routes all interrupts to CPU 0, creating a "quiet zone" for RT workloads.
Configuration (user-applied):
Edit /etc/default/grub and add to GRUB_CMDLINE_LINUX_DEFAULT:
Then apply with:
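As a sketch, the edited line and the apply step would look like this ("quiet splash" stands in for whatever options your system already has):

```shell
# /etc/default/grub -- append the isolation parameters to the existing line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=managed_irq,domain,1-4 nohz_full=1-4 rcu_nocbs=1-4 irqaffinity=0"
```

```shell
sudo update-grub
sudo reboot
```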
Parameters explained:
- isolcpus=managed_irq,domain,1-4 - Remove CPUs 1-4 from the kernel scheduler
- nohz_full=1-4 - Disable periodic timer ticks on isolated CPUs
- rcu_nocbs=1-4 - Move RCU callback processing off isolated CPUs
- irqaffinity=0 - Route all interrupts to CPU 0 by default
Why it matters:
Prevents kernel scheduler and interrupt activity from preempting RT vCPU threads, eliminating a major source of latency spikes.
Helper Command
ServoBox provides servobox irqbalance-mask to generate the correct IRQBALANCE_BANNED_CPULIST configuration for persistent IRQ isolation across reboots.
Experimental Tuning
ServoBox's balanced mode already achieves the VM latency ceiling (~100-120μs max). For users interested in experimenting with advanced BIOS and host tuning to potentially reduce spike frequency, see:
Community Research
These settings are untested with ServoBox and may not provide significant benefits in VMs. If you experiment with them and have interesting results, please share on GitHub Discussions!
Runtime IRQ Affinity Configuration
What it does:
ServoBox sets IRQ affinity for all interrupt sources to CPU 0 during servobox start.
Implementation:
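ServoBox's actual implementation is not reproduced here; a minimal shell sketch of the equivalent logic (the function name is illustrative, and the directory argument exists only so the logic can be tested outside /proc):

```shell
# Route every routable IRQ to CPU 0. Some IRQs (per-CPU timers, IPIs)
# reject the write, so failures are ignored.
route_irqs_to_cpu0() {
  root="${1:-/proc/irq}"
  for d in "$root"/*/; do
    [ -e "${d}smp_affinity_list" ] || continue
    # Writing the CPU list "0" restricts delivery to CPU 0 only
    echo 0 > "${d}smp_affinity_list" 2>/dev/null || true
  done
}
```

Run as root; writing `0` to smp_affinity_list is equivalent to an affinity mask containing only CPU 0.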
Why it matters:
Ensures hardware interrupts (network, disk, USB) don't disturb isolated RT cores even if new devices are hotplugged.
CPU Frequency Governor
What it does:
Forces CPU frequency governor to performance mode on CPU 0 and all RT cores.
Why it matters:
Prevents dynamic frequency scaling (Intel SpeedStep, AMD Cool'n'Quiet) that can introduce latency spikes of 100+μs during frequency transitions.
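ServoBox applies this automatically; an equivalent hand-rolled sketch (the function name is hypothetical, and the sysfs root is parameterized only for testing) looks like:

```shell
# Set the scaling governor on CPU 0 and the RT cores (CPUs 0-4 here).
set_governor() {
  gov="$1"
  root="${2:-/sys/devices/system/cpu}"
  for f in "$root"/cpu[0-4]/cpufreq/scaling_governor; do
    [ -e "$f" ] || continue
    echo "$gov" > "$f" 2>/dev/null || true
  done
}
```

Invoke as root, e.g. `set_governor performance`.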
VM/Hypervisor Configuration
vCPU to Physical CPU Pinning
What it does:
Statically pins each VM vCPU thread to a specific host CPU core.
Configuration (automatic):
libvirt XML <cputune> section with per-vCPU pinning:
<cputune>
<vcpupin vcpu='0' cpuset='1'/>
<vcpupin vcpu='1' cpuset='2'/>
<vcpupin vcpu='2' cpuset='3'/>
<vcpupin vcpu='3' cpuset='4'/>
<emulatorpin cpuset='0'/>
<iothreadpin iothread='1' cpuset='0'/>
</cputune>
Why it matters:
Eliminates scheduler migration overhead and ensures predictable cache behavior. Guest vCPU N always runs on host CPU N+1.
Real-Time Thread Priorities
What it does:
Assigns SCHED_FIFO priorities to QEMU threads using chrt.
Priority hierarchy:
| Component | Priority | Rationale |
|---|---|---|
| vCPU threads | 80 | Critical path for guest execution |
| vhost-net threads | 75 | Network I/O for robot communication |
| QEMU main process | 70 | Infrastructure/monitoring overhead |
Why it matters:
Ensures QEMU threads preempt host tasks but leave headroom (priority 90-99) for guest RT applications.
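The automatic setup is not shown here; by hand, the same priorities could be applied roughly like this (the process name and thread-naming patterns are assumptions about how QEMU and vhost name their threads):

```shell
# SCHED_FIFO 70 for the QEMU main process
pid=$(pgrep -f qemu-system-x86_64 | head -n 1)
sudo chrt -f -p 70 "$pid"

# SCHED_FIFO 80 for vCPU threads (KVM vCPU threads are named "CPU N/KVM")
for tid in /proc/"$pid"/task/*; do
  case "$(cat "$tid"/comm)" in
    CPU*) sudo chrt -f -p 80 "${tid##*/}" ;;
  esac
done

# SCHED_FIFO 75 for vhost-net kernel threads (named vhost-<qemu pid>)
for t in $(pgrep "vhost-$pid"); do sudo chrt -f -p 75 "$t"; done
```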
Memory Locking and KSM Disable
What it does:
Locks VM memory in physical RAM and disables Kernel Same-page Merging (KSM).
Configuration (automatic):
libvirt XML:
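The relevant elements (the same `<memoryBacking>` settings that appear in the optional hugepages section below, minus the hugepages themselves):

```xml
<memoryBacking>
  <locked/>
  <nosharepages/>
</memoryBacking>
```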
Why it matters:
- locked: Prevents swapping (eliminates ms-scale page fault latency)
- nosharepages: Disables KSM scanning (prevents unpredictable CPU overhead from background merging)
Optional: Static Hugepages
For further tuning of TLB-related jitter, you can configure static host hugepages:
# Reserve 2MB hugepages (e.g., 4096 pages = 8GB for an 8GB VM)
echo 4096 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# Make persistent (add to /etc/sysctl.conf):
vm.nr_hugepages=4096
Then manually add to VM XML:
<memoryBacking>
<hugepages>
<page size="2048" unit="KiB"/>
</hugepages>
<locked/>
<nosharepages/>
</memoryBacking>
Trade-off: Hugepages reduce TLB misses but require dedicated host memory reservation.
CPU Model and Cache Passthrough
What it does:
Uses host-passthrough CPU model with cache passthrough.
Configuration:
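The exact XML is not shown; it presumably resembles the standard libvirt host-passthrough form:

```xml
<cpu mode='host-passthrough' check='none'>
  <cache mode='passthrough'/>
</cpu>
```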
Why it matters:
Exposes host CPU features (SSE, AVX) to guest and minimizes emulation overhead. Cache passthrough reduces memory access latency.
Clock Source Configuration
What it does:
Enables kvmclock and native TSC timer in guest.
Configuration (automatic):
libvirt XML:
<clock offset='utc'>
<timer name='kvmclock' present='yes'/>
<timer name='tsc' present='yes' mode='native'/>
</clock>
Why it matters:
Provides stable, low-latency timekeeping for RT control loops. Native TSC avoids virtualization overhead.
Virtio Network Multiqueue with Halt Polling
What it does:
Enables one virtio-net queue per vCPU and configures KVM halt polling for lower idle wakeup latency.
Configuration:
--network model=virtio,driver.queues=4
# Plus runtime tuning:
echo 50000 > /sys/module/kvm/parameters/halt_poll_ns
Why it matters:
- Multiqueue: Distributes network interrupt processing across vCPUs
- Halt polling (50μs): vCPU busy-waits before sleeping, reducing wakeup latency from ~10-20μs to <5μs for network packets
Disk I/O Configuration
What it does:
Uses cache=none and discard=unmap for VM disks.
Why it matters:
Bypasses host page cache (eliminates cache flush latency) and improves SSD lifespan with TRIM support.
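In libvirt XML, these options sit on the disk's `<driver>` element (the qcow2 format here is an assumption about the image type):

```xml
<driver name='qemu' type='qcow2' cache='none' discard='unmap'/>
```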
Guest System Configuration
PREEMPT_RT Kernel with Optimized Parameters
What it does:
ServoBox ships Ubuntu 22.04 images with linux-image-rt-amd64 (kernel 6.8.0-rt8) and RT-optimized boot parameters.
Kernel parameters (automatic):
- nohpet: Disables the HPET timer (reduces interrupt latency)
- tsc=reliable: Uses the TSC as the primary clocksource (lower overhead than HPET)
Why it matters:
RT patches convert most kernel spinlocks to mutexes, enabling preemption throughout the kernel. Disabling HPET eliminates a source of 10-50μs latency spikes.
Default approach:
No guest-level isolcpus parameters by default - allows multi-threaded applications to use all vCPUs freely.
Advanced Configuration
For applications requiring strict single-core isolation, users can manually add isolcpus=1-3 nohz_full=1-3 rcu_nocbs=1-3 to /etc/default/grub in the guest.
Transparent Hugepages Disabled
What it does:
Disables THP (Transparent Hugepages) in guest for deterministic behavior.
Configuration (automatic):
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Why it matters:
THP background scanning and compaction can cause unpredictable latency spikes. For deterministic 2MB page allocation, use the optional static hugepages configuration (via libvirt XML) instead.
Guest Service Trimming
What it does:
Disables non-essential services that create scheduling noise.
Disabled services:
- snapd - Package management daemon
- ModemManager - Mobile broadband management
- bluetooth - Bluetooth stack
- cups - Print service
- avahi-daemon - mDNS/Zeroconf
Why it matters:
Each disabled service eliminates periodic wake-ups and background CPU activity that can interfere with RT control loops.
Real-Time Process Limits
What it does:
Configures PAM limits for the @realtime group in /etc/security/limits.conf:
@realtime soft rtprio 99
@realtime hard rtprio 99
@realtime soft memlock 102400
@realtime hard memlock 102400
Why it matters:
Allows user processes to request RT priorities and lock memory without CAP_SYS_NICE or CAP_IPC_LOCK capabilities.
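For example, a user in the realtime group can then start a control process at an RT priority without root (the binary name is a placeholder):

```shell
sudo usermod -aG realtime "$USER"   # one-time; log out and back in
chrt -f 90 ./control_loop           # request SCHED_FIFO priority 90, no sudo needed
```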
Fast Boot Path
What it does:
- Cloud-init runs on first boot only, then disables itself
- systemd-networkd-wait-online configured with --any --timeout=10
- Custom servobox-configure-macvtap service for direct NIC setup
Why it matters:
Reduces boot time from 90+ seconds to <30 seconds. Eliminates variability from cloud-init network probing on subsequent boots.
Networking Configuration
NAT Network with DHCP Reservation
What it does:
Creates libvirt DHCP reservation for consistent VM IP (default: 192.168.122.100).
Why it matters:
Stable addressing for SSH access and client connections without manual IP configuration.
Direct NIC Attachment (macvtap)
What it does:
Optionally attaches up to 2 host NICs directly to the VM via macvtap bridge mode.
Configuration:
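ServoBox's own flags are not shown here; the underlying libvirt XML for a macvtap bridge-mode attachment looks roughly like this (the NIC name enp3s0 is a placeholder):

```xml
<interface type='direct'>
  <source dev='enp3s0' mode='bridge'/>
  <model type='virtio'/>
</interface>
```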
Why it matters:
Bypasses host networking stack for lowest-latency robot communication. Essential for dual-arm robot setups.
Guest Firewall and Routing
What it does:
- Disables the ufw firewall
- Flushes iptables rules
- Disables reverse path filtering (rp_filter=0)
Why it matters:
Franka and other robots use UDP broadcast/multicast. Strict firewalls and rp_filter can silently drop packets, breaking robot communication.
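Done by hand, the equivalent guest-side commands would be roughly:

```shell
sudo ufw disable                                 # stop the guest firewall
sudo iptables -F                                 # flush existing rules
sudo sysctl -w net.ipv4.conf.all.rp_filter=0     # accept asymmetrically routed packets
sudo sysctl -w net.ipv4.conf.default.rp_filter=0
```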
Verification and Testing
RT Configuration Verification
Command:
Checks:
- XML configuration (CPU pinning, memory locking, timers)
- Runtime vCPU pinning and affinity
- QEMU thread RT priorities
- CPU frequency governors
- IRQ isolation statistics
- Guest kernel parameters
Latency Testing
Command:
What it does:
Runs cyclictest at 1kHz (1000μs interval) while optionally stressing the host with stress-ng.
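The measurement can be reproduced manually with standard rt-tests flags (this is not ServoBox's exact wrapper invocation):

```shell
# 1 kHz loop (-i 1000 us interval), locked memory (-m), SCHED_FIFO 90 (-p 90)
sudo cyclictest -m -p 90 -i 1000 -l 600000
# optional background load in another terminal:
stress-ng --cpu 4 --vm 2 --timeout 600s
```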
Expected results by mode:
| Mode | Average | Max (typical) | Spike Frequency | Rating |
|---|---|---|---|---|
| Balanced | ~4μs | ~100-120μs | ~1 per 10k cycles | EXCELLENT |
| Performance | ~3μs | ~100μs | ~1 per 50k cycles | EXCELLENT |
| Extreme | ~3μs | ~100μs | ~1 per 100k cycles | EXCELLENT |
Results depend on host hardware and isolation configuration. Balanced mode is recommended for all users - it achieves the VM latency ceiling with normal power consumption. Performance/Extreme modes reduce spike frequency for applications needing 99.99% timing guarantees, but do not significantly reduce maximum latency.
Summary: Complete Optimization Stack
| Layer | Optimizations Applied |
|---|---|
| Host Kernel | CPU isolation (isolcpus, nohz_full, rcu_nocbs), IRQ affinity |
| Host Runtime (Balanced) | IRQ pinning to CPU 0, performance governor (dynamic freq), halt polling (50μs) |
| Host Runtime (Performance) | + Locked CPU frequencies to max (no scaling transitions) |
| Host Runtime (Extreme) | + Turbo Boost disabled for determinism |
| Hypervisor | vCPU pinning, SCHED_FIFO priorities, KSM off (nosharepages) |
| VM Memory | Locked, KSM disabled (nosharepages), THP disabled in guest |
| VM Hardware | host-passthrough CPU, virtio multiqueue, cache=none disks |
| Guest Kernel | PREEMPT_RT, nohpet, tsc=reliable, no guest isolcpus (default) |
| Guest Services | Trimmed services, RT limits, fast boot path |
| Guest Network | Disabled rp_filter, permissive firewall |
| Verification | rt-verify and test commands |
Related Documentation
- Installation Guide - Host GRUB configuration
- FAQ - RT kernel vs VM approach
- Troubleshooting - Performance issues and diagnostics
Additional Resources
For extreme RT requirements (<20μs worst-case at all times), consider:
- BIOS tuning: Disable C-states, P-states, Turbo Boost
- SMI analysis: Use the hwlat tracer to detect firmware interrupts
- Bare metal: Consider Ubuntu Pro RT kernel on dedicated hardware
ServoBox's VM approach achieves excellent RT performance (suitable for 1kHz+ robotics control) while preserving host flexibility for development, perception, and high-level processing.