ADR-0002: Device ID Discovery Strategy for Device Update Flow

Status: Accepted
Date: 2025-11-16
Authors: Dave Emmanuel Magno
Supersedes: None
Superseded by: None

Context

Devices must call the manager’s update-status endpoint:

  • GET /api/devices/versions/devices/{device_id}/update-status

Each device therefore needs to know its own device_id (DB primary key). We want:

  • No per-device .env edits for device_id.
  • Simple re-imaging / re-registration (device can rediscover its ID).
  • Minimal changes to existing modules (devices, versions, dispatcher).

Today:

  • Devices expose a stable system_name in GET /api/version/.
  • The manager stores system_name and env on each device.
  • Devices authenticate using a shared AGENT-level credential (no per-device identity yet).

Decision

1. Manager-side lookup endpoint

Introduce a lookup API to resolve a device from its system_name:

  • Method: GET
  • URL: /api/devices/lookup
  • Query params:
      • system_name (required)
      • env (optional; development|staging|production|testing)
  • Behavior:
      • 0 matches → HTTP 404 "Device not found".
      • >1 matches and no env → HTTP 409 "Multiple devices matched system_name; specify env".
      • >1 matches even with env → HTTP 409 "Multiple devices matched system_name and env".
      • Exactly 1 match → return DeviceResponse including id.
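
The matching rules above can be sketched as a small, framework-independent function. This is an illustration only, not the manager's actual code: the Device dataclass, the lookup_device name, the "detail" key in error bodies, and the fields returned besides id are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Device:
    # Hypothetical stand-in for the manager's device record.
    id: int
    system_name: str
    env: str


def lookup_device(
    devices: Iterable[Device], system_name: str, env: Optional[str] = None
) -> Tuple[int, dict]:
    """Return (http_status, body) following the 404 / 409 / 200 rules."""
    matches = [
        d
        for d in devices
        if d.system_name == system_name and (env is None or d.env == env)
    ]
    if not matches:
        return 404, {"detail": "Device not found"}
    if len(matches) > 1:
        if env is None:
            return 409, {
                "detail": "Multiple devices matched system_name; specify env"
            }
        return 409, {"detail": "Multiple devices matched system_name and env"}
    device = matches[0]
    # Assumed shape of DeviceResponse; only `id` is confirmed by this ADR.
    return 200, {
        "id": device.id,
        "system_name": device.system_name,
        "env": device.env,
    }
```

Keeping the rules in a pure function like this would let the 404/409 cases be unit-tested without a running HTTP server.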

2. Device-side discovery and caching

Device behavior:

  1. On startup (or first need), call:
      • GET {MANAGER_BASE_URL}/api/devices/lookup?system_name={system_name}&env={env}
  2. On 200:
      • Read id from the response and cache it locally (e.g. in a small file).
  3. On subsequent calls:
      • Use the cached device_id for GET /api/devices/versions/devices/{device_id}/update-status.
  4. On 404/409:
      • Log a clear error for operators (device not registered or ambiguous configuration).
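
A minimal device-side sketch of this flow, using only the Python standard library, might look like the following. The function names, the JSON cache file format, and the injectable fetch parameter are assumptions made for illustration. The shared AGENT credential is deliberately omitted because this ADR does not specify how it is sent.

```python
import json
import logging
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path

log = logging.getLogger("device_id_discovery")


def http_get_json(url):
    """GET url and return (status, parsed JSON body or None on HTTP error)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, None


def discover_device_id(base_url, system_name, env, cache_path, fetch=http_get_json):
    """Return the cached device_id, or look it up once and cache it."""
    cache = Path(cache_path)
    if cache.exists():
        try:
            return json.loads(cache.read_text())["device_id"]
        except (ValueError, KeyError, TypeError):
            log.warning("Ignoring unreadable device_id cache at %s", cache)

    query = urllib.parse.urlencode({"system_name": system_name, "env": env})
    status, body = fetch(f"{base_url}/api/devices/lookup?{query}")
    if status == 200:
        device_id = body["id"]
        cache.write_text(json.dumps({"device_id": device_id}))
        return device_id
    if status == 404:
        log.error("Device %r (env=%r) is not registered", system_name, env)
    elif status == 409:
        log.error("Ambiguous configuration for %r (env=%r)", system_name, env)
    else:
        log.error("Unexpected status %s from device lookup", status)
    return None


def update_status_url(base_url, device_id):
    """Build the update-status URL for a discovered device_id."""
    return f"{base_url}/api/devices/versions/devices/{device_id}/update-status"
```

Passing fetch as a parameter is a design choice for this sketch only: it lets the caching and error paths be exercised with a fake response instead of a live manager.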

3. Security note (known limitation)

  • The lookup and update-status endpoints currently rely on shared AGENT credentials.
  • A compromised device could attempt to look up or fetch status for other devices by guessing system_name / device_id.
  • We accept this as a temporary limitation and track a follow-up to:
      • Introduce per-device identity (token/API key or Tailscale/mTLS).
      • Enforce “device can only access its own record” semantics.
      • Potentially replace /api/devices/lookup for device usage with a /api/devices/me endpoint that derives identity from credentials.
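
To make the intended follow-up concrete, the sketch below shows one possible shape for credential-derived identity and self-only access. Nothing here is decided by this ADR: the token-to-device mapping, the function names, the 401 response, and the "detail" key are all hypothetical. A real design would also need secure token storage and comparison.

```python
def resolve_me(token, token_to_device_id, devices_by_id):
    """Hypothetical /api/devices/me logic: derive the device from its credential."""
    device_id = token_to_device_id.get(token)
    if device_id is None:
        return 401, {"detail": "Invalid device credential"}
    device = devices_by_id.get(device_id)
    if device is None:
        return 404, {"detail": "Device not found"}
    return 200, device


def is_self_access(authenticated_device_id, requested_device_id):
    """Self-only rule: a device may read only its own record."""
    return authenticated_device_id == requested_device_id
```

With identity bound to the credential, a device never supplies system_name or device_id itself, which would remove the guessing attack described above.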

Trade-offs

Pros

  • No manual per-device device_id configuration.
  • Devices can recover their ID after re-imaging or re-registration.
  • Reuses existing system_name and env fields; no schema change.
  • Clear failure semantics (404 = not registered, 409 = ambiguous).

Cons

  • Does not yet provide per-device authorization; still role-based via AGENT credential.
  • Adds one extra HTTP call per device (only on first run or when the cache is invalid).

Consequences

  • Devices discover and cache device_id via /api/devices/lookup instead of reading it from .env.
  • Update-status flow (/api/devices/versions/devices/{device_id}/update-status) remains unchanged.
  • A separate security hardening effort will:
      • Introduce per-device identity.
      • Restrict device-facing endpoints to self-only access.
      • Potentially deprecate device use of /api/devices/lookup in favor of a credential-bound /api/devices/me endpoint.