Runner Resource Optimization

CI/CD runner right-sizing through historical resource utilization analysis and a sustainability dashboard.

Overview

Runner Resource Optimization is a PoC feature delivered in Q1 2026 (IPCEICIS-6887) that analyses historical CPU and memory data from CI/CD pipeline executions to recommend right-sized runner configurations. Alongside it, a runner sustainability dashboard (IPCEICIS-7421) was shipped into the Forgejo runner settings page, giving users visibility into historical runner usage and current runner statuses.

The motivation is straightforward: manually choosing a runner size is guesswork. Developers tend to over-provision to avoid failures, leaving compute unused and energy wasted. By using real utilization data, the system can suggest the smallest runner that still safely completes the job.

Key Features

Historical utilization analysis: Collects CPU and memory metrics at 10-second intervals across pipeline runs and retains 30 days of data
Right-sizing recommendations: Calculates peak and average resource consumption per pipeline/job type and recommends the smallest runner size that accommodates peak usage plus a 20% safety margin
Runner sustainability dashboard: Embedded in the Forgejo runner settings page — shows which runners were used in workflow jobs, historical usage trends, and current runner statuses
Workflow execution metrics collection: Gathers structured per-job metrics to feed the recommendation algorithm (IPCEICIS-7413)

Purpose in EDP

CI/CD runners are the largest variable compute cost in the EDP. Most users default to a fixed runner size regardless of actual job requirements. This feature closes that gap by:

Surfacing utilization data that would otherwise be invisible
Giving teams actionable, evidence-based recommendations without requiring deep infrastructure knowledge
Tracking runner usage per project, supporting sustainability reporting (“which runners powered my workflows?”)

How the Algorithm Works

The recommendation algorithm operates as follows for a given pipeline/job type:

Collect: Retrieve the last n runs’ CPU and memory utilization for the job
Analyse: Calculate peak and average resource consumption across those runs
Recommend: Identify the smallest runner size in the family (small → medium → large → xlarge) whose available resources cover peak usage plus a 20% safety margin
Output: Present current runner size vs. recommended size side-by-side

Example output:
  Job: build-and-test
  Current runner: large  (8 vCPU, 16 GB RAM)
  Peak CPU:  2.4 vCPU   Peak RAM: 5.8 GB
  Recommended: medium    (4 vCPU, 8 GB RAM)  [peak + 20% margin fits]

The recommendation is conservative by design: it does not auto-apply changes. Teams review each recommendation and opt in, which avoids surprise failures.
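
The Collect, Analyse, and Recommend steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the deployed implementation: the names RunnerSize, RunSample, and recommend are hypothetical, and only the medium and large sizes are taken from the example output (the small and xlarge figures are assumed).

```python
from dataclasses import dataclass
from typing import Optional, Sequence

SAFETY_MARGIN = 0.20  # 20% headroom above observed peak usage


@dataclass(frozen=True)
class RunnerSize:
    name: str
    vcpu: float
    ram_gb: float


@dataclass(frozen=True)
class RunSample:
    """Peak usage observed during one historical run (hypothetical shape)."""
    peak_cpu: float      # vCPU
    peak_ram_gb: float   # GB


# Ordered smallest to largest. Only medium and large match the example
# output; the small and xlarge figures are assumptions for illustration.
FAMILY = (
    RunnerSize("small", 2, 4),
    RunnerSize("medium", 4, 8),
    RunnerSize("large", 8, 16),
    RunnerSize("xlarge", 16, 32),
)


def recommend(
    runs: Sequence[RunSample],
    family: Sequence[RunnerSize] = FAMILY,
) -> Optional[RunnerSize]:
    """Return the smallest size that fits peak usage plus the margin.

    Returns None when there is no history or when no size is large enough.
    """
    if not runs:
        return None
    # Analyse: take the highest peak across the last n runs, then add headroom.
    need_cpu = max(r.peak_cpu for r in runs) * (1 + SAFETY_MARGIN)
    need_ram = max(r.peak_ram_gb for r in runs) * (1 + SAFETY_MARGIN)
    # Recommend: walk the family from smallest to largest.
    for size in family:
        if need_cpu <= size.vcpu and need_ram <= size.ram_gb:
            return size
    return None
```

With the peaks from the example output (2.4 vCPU, 5.8 GB), the margin raises the requirement to 2.4 × 1.2 = 2.88 vCPU and 5.8 × 1.2 = 6.96 GB, which fits medium (4 vCPU, 8 GB RAM). Both CPU and memory must fit; a job that is light on CPU but heavy on memory is sized by its memory peak.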

Runner Sustainability Dashboard

The dashboard is accessible from the Forgejo runner settings page (same permission scope as the runner list). It provides:

Panel	Description
Current runner status	Live view of idle, active, and offline runners
Historical usage by job	Which runner handled each workflow job and when
Resource utilization trends	CPU and memory over time per runner
Sustainability tracking	Per-project runner usage for carbon/energy attribution

The data is surfaced without leaving Forgejo — no external dashboarding tool is required for basic usage. For deeper observability, metrics are also exported to the EDP Grafana instance at observability.buildth.ing.
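
The per-project attribution behind the sustainability tracking panel can be pictured as a simple aggregation of runner time by project. The sketch below is an assumption about how such a roll-up could look, not the dashboard's actual code: the record layout (project, runner_id, start, end) and the function name runner_seconds_by_project are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical job record: (project, runner_id, start_unix_s, end_unix_s)
JobRecord = Tuple[str, str, float, float]


def runner_seconds_by_project(
    jobs: Iterable[JobRecord],
) -> Dict[str, Dict[str, float]]:
    """Sum the seconds each runner spent on each project's workflow jobs."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for project, runner_id, start, end in jobs:
        if end < start:
            raise ValueError(f"job on {runner_id} ends before it starts")
        totals[project][runner_id] += end - start
    # Convert nested defaultdicts to plain dicts for a stable return type.
    return {project: dict(runners) for project, runners in totals.items()}
```

Runner-seconds per project are only a proxy; converting them into energy or carbon figures would need per-runner power data that this page does not describe.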

Metrics Collection (IPCEICIS-7413)

Workflow execution metrics are gathered during pipeline runs with less than 5% overhead on pipeline execution time. The collected data includes:

Job start/end timestamps
Runner identity and size
Peak and average CPU utilization (sampled at 10-second intervals)
Peak and average memory utilization
Job exit status (success/failure)

These metrics feed both the recommendation algorithm and the dashboard.
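
To make the collected fields concrete, the sketch below reduces a job's 10-second samples to one per-job record. It is an illustration only: the JobMetrics type, its field names, and the summarise function are hypothetical, and the actual storage schema is not described here.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Sequence

SAMPLE_INTERVAL_S = 10  # sampling interval stated above


@dataclass(frozen=True)
class JobMetrics:
    """One record per job, mirroring the fields listed above (hypothetical names)."""
    job_name: str
    runner_id: str
    runner_size: str
    started_at: float   # Unix seconds
    ended_at: float     # Unix seconds
    peak_cpu: float     # vCPU
    avg_cpu: float      # vCPU
    peak_mem_gb: float
    avg_mem_gb: float
    succeeded: bool


def summarise(
    job_name: str,
    runner_id: str,
    runner_size: str,
    started_at: float,
    ended_at: float,
    cpu_samples: Sequence[float],
    mem_samples_gb: Sequence[float],
    exit_code: int,
) -> JobMetrics:
    """Collapse raw samples into peak and average values for one job."""
    if not cpu_samples or not mem_samples_gb:
        raise ValueError("at least one CPU and one memory sample is required")
    return JobMetrics(
        job_name=job_name,
        runner_id=runner_id,
        runner_size=runner_size,
        started_at=started_at,
        ended_at=ended_at,
        peak_cpu=max(cpu_samples),
        avg_cpu=fmean(cpu_samples),
        peak_mem_gb=max(mem_samples_gb),
        avg_mem_gb=fmean(mem_samples_gb),
        succeeded=(exit_code == 0),
    )
```

Storing both peak and average is what lets the recommendation size against the peak while the dashboard plots typical utilization.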

Status

Maturity: PoC — the recommendation algorithm and dashboard are functional and deployed on edp.buildth.ing. Auto-enforcement (automatic runner resizing) is explicitly out of scope for this iteration.

Additional Resources

Runners overview
GARM runner orchestration
EDP Observability
Jira: IPCEICIS-6887 (Epic), IPCEICIS-7421 (Dashboard), IPCEICIS-7413 (Metrics collection)