API Reference

Status: Active | Version: 1.0.0 | Last Updated: 2026-05-23

This document describes every HTTP endpoint exposed by the ML Incident Response Platform's FastAPI service layer. All endpoints require a valid JWT bearer token issued by the /auth/token route unless marked [public]. Tokens expire after 3600 seconds; refresh using /auth/refresh.

Base URL (local dev): http://localhost:8000/api/v1
Base URL (staging): https://staging.mlplatform.internal/api/v1


Authentication

POST /auth/token [public]

Issues a short-lived JWT access token in the response body and sets a long-lived refresh token in an httpOnly cookie.

Request body (JSON):

    {
      "username": "string",
      "password": "string"
    }

Response 200 (JSON):

    {
      "access_token": "eyJ...",
      "token_type": "bearer",
      "expires_in": 3600
    }

Errors:
- 401 Unauthorized: invalid credentials
- 429 Too Many Requests: rate limit exceeded (10 attempts / 60s per IP)
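As an illustration of the request and response shapes above, here is a minimal client-side sketch using only the Python standard library. The helper names `build_token_request` and `bearer_header` are hypothetical, and the `Authorization: Bearer <token>` header form is assumed from the phrase "JWT bearer token" rather than stated explicitly in this document.

```python
import json
import urllib.request

# Local dev base URL as listed in this document.
BASE_URL = "http://localhost:8000/api/v1"


def build_token_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST /auth/token request."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def bearer_header(token_response: dict) -> dict:
    """Turn a 200 response body into the header for authenticated calls.

    Assumption: the service expects the standard `Authorization: Bearer` form.
    """
    return {"Authorization": f"Bearer {token_response['access_token']}"}
```

Passing the request to `urllib.request.urlopen` would send it; a 401 or 429 surfaces there as `urllib.error.HTTPError`.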

POST /auth/refresh

Issues a new access token using the refresh token in the httpOnly cookie. The old refresh token is immediately invalidated and added to the Redis JWT denylist (jwt:denylist:{jti} key with TTL equal to remaining token lifetime).

Response 200: Same shape as /auth/token.

Errors:
- 401 Unauthorized: refresh token absent, expired, or denylisted
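The denylist entry described above can be sketched as a pure function that derives the Redis key and TTL from a token's claims. This is an illustration only, not the service's actual code: the function name `denylist_entry` is hypothetical, and `exp` is assumed to be the standard JWT expiry claim in seconds since the epoch.

```python
import time
from typing import Optional, Tuple


def denylist_entry(jti: str, exp: int, now: Optional[float] = None) -> Tuple[str, int]:
    """Return the Redis key and TTL (in seconds) for a token being revoked.

    The key follows the jwt:denylist:{jti} pattern from this document.
    The TTL is the token's remaining lifetime, never negative.
    """
    current = time.time() if now is None else now
    remaining = max(0, int(exp - current))
    return f"jwt:denylist:{jti}", remaining
```

A TTL of 0 means the token has already expired, so a caller could skip the write entirely in that case.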

POST /auth/logout

Adds the current access token's jti claim to the Redis denylist, effectively invalidating it before its natural expiry. The refresh token cookie is cleared.

Response 204 No Content


Incidents

GET /incidents

Returns a paginated list of all incidents the authenticated user has read access to.

Query parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| page | int | 1 | Page number (1-indexed) |
| page_size | int | 25 | Results per page (max 100) |
| status | enum | all | Filter by open, investigating, resolved, closed |
| severity | enum | all | Filter by P1, P2, P3, P4 |
| model_id | UUID | (none) | Filter to incidents for a specific model |
| since | ISO 8601 | (none) | Return incidents created after this timestamp |

Response 200 (JSON):

    {
      "total": 142,
      "page": 1,
      "page_size": 25,
      "items": [
        {
          "id": "550e8400-e29b-41d4-a716-446655440000",
          "title": "Drift detected: credit-risk-v3 PSI > 0.25",
          "severity": "P2",
          "status": "investigating",
          "model_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
          "created_at": "2026-05-22T14:33:00Z",
          "updated_at": "2026-05-23T09:12:44Z",
          "assigned_to": "mlops-on-call"
        }
      ]
    }
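To show how the query parameters and the `total` field fit together, the sketch below builds a list URL and computes how many pages a client would need to fetch. The helper names `incidents_url` and `total_pages` are hypothetical; the limits mirror the parameter table above.

```python
import math
from urllib.parse import urlencode


def incidents_url(base_url: str, page: int = 1, page_size: int = 25, **filters) -> str:
    """Build a GET /incidents URL, enforcing the documented paging limits."""
    if page < 1:
        raise ValueError("page is 1-indexed")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    params = {"page": page, "page_size": page_size}
    # Omit filters that were not supplied so the server defaults apply.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}/incidents?{urlencode(params)}"


def total_pages(total: int, page_size: int) -> int:
    """Number of pages needed to cover `total` results."""
    return math.ceil(total / page_size)
```

With the example response above (`total` of 142 and `page_size` of 25), `total_pages` returns 6, so a client would request pages 1 through 6.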

POST /incidents

Creates a new incident. This endpoint is also called by the drift detection pipeline's automated alerting hook.

Request body (JSON):

    {
      "title": "string (required)",
      "severity": "P1 | P2 | P3 | P4 (required)",
      "model_id": "UUID (required)",
      "description": "string (optional)",
      "trigger_source": "manual | automated_drift | automated_performance | pagerduty",
      "labels": ["string"]
    }

Response 201 Created: Full incident object.

Errors:
- 422 Unprocessable Entity: missing required fields or invalid severity
- 409 Conflict: an open incident for the same model_id and severity already exists
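A caller such as an alerting hook may want to catch the documented 422 conditions before sending the request. The following sketch checks a payload against the field rules listed above; the function name `incident_payload_problems` and the message wording are hypothetical, and the server's own validation remains authoritative.

```python
import uuid
from typing import List

SEVERITIES = ("P1", "P2", "P3", "P4")
TRIGGER_SOURCES = ("manual", "automated_drift", "automated_performance", "pagerduty")


def incident_payload_problems(payload: dict) -> List[str]:
    """Return a list of problems that would likely cause a 422 response."""
    problems = []
    for field in ("title", "severity", "model_id"):
        if not payload.get(field):
            problems.append(f"{field} is required")
    if payload.get("severity") and payload["severity"] not in SEVERITIES:
        problems.append("severity must be one of P1, P2, P3, P4")
    if payload.get("model_id"):
        try:
            uuid.UUID(str(payload["model_id"]))
        except ValueError:
            problems.append("model_id must be a UUID")
    source = payload.get("trigger_source")
    if source is not None and source not in TRIGGER_SOURCES:
        problems.append("trigger_source is not a documented value")
    return problems
```

An empty list means the payload passes these local checks; a 409 Conflict can still occur, because only the server knows whether a matching open incident exists.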

GET /incidents/{incident_id}

Returns full detail for a single incident including timeline events and linked runbook.

Path parameter: incident_id (UUID)

Response 200: Extended incident object with timeline: [] and runbook_url: string | null.

Errors:
- 404 Not Found: the incident does not exist or the caller lacks read access

PATCH /incidents/{incident_id}

Partially updates an incident. Accepts any subset of mutable fields: status, severity, assigned_to, description, labels.

Status transitions are validated against the state machine defined in governance.md. Invalid transitions (e.g., resolved → investigating) return 409 Conflict.
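The validation idea can be sketched as a lookup in a transition table. The table below is purely hypothetical: the real state machine lives in governance.md and is not reproduced in this document. The only rule taken from the text is that resolved → investigating is rejected.

```python
# HYPOTHETICAL transition table for illustration only.
# The authoritative state machine is defined in governance.md.
ALLOWED_TRANSITIONS = {
    "open": {"investigating"},
    "investigating": {"resolved"},
    "resolved": {"closed"},
    "closed": set(),
}


def is_valid_transition(current: str, target: str) -> bool:
    """Return True if moving from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())
```

A client could run a check like this before issuing the PATCH, though the server's 409 Conflict response remains the source of truth.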

POST /incidents/{incident_id}/timeline

Appends a timestamped event to the incident timeline. Used by automated scripts and on-call engineers alike to maintain a chronological audit trail.

Request body (JSON):

    {
      "event_type": "note | status_change | escalation | runbook_step_completed",
      "body": "string",
      "author": "string (defaults to JWT sub claim)"
    }


Models

GET /models

Returns registered ML models tracked by the platform.

Query parameters: page, page_size, team, stage (staging | production | deprecated)

GET /models/{model_id}/drift

Returns the most recent drift assessment for a model, including PSI scores per feature and the composite KS-test p-value.

Response 200 (JSON):

    {
      "model_id": "UUID",
      "assessed_at": "2026-05-23T08:00:00Z",
      "psi_composite": 0.18,
      "ks_p_value": 0.031,
      "drift_detected": true,
      "feature_scores": {
        "age": 0.04,
        "credit_score": 0.21,
        "income_band": 0.09
      },
      "alert_threshold_psi": 0.15
    }
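When triaging a drift alert, it can help to see which features contribute most. The sketch below lists features whose PSI exceeds `alert_threshold_psi`, highest first. Note the assumption: this document applies the threshold to the composite score, so comparing individual features against it is a triage heuristic, not platform behavior. The name `drifting_features` is hypothetical.

```python
from typing import List


def drifting_features(assessment: dict) -> List[str]:
    """Return feature names whose PSI exceeds the alert threshold, highest first.

    Assumption: reusing the composite threshold per feature is a heuristic.
    """
    threshold = assessment["alert_threshold_psi"]
    scores = assessment["feature_scores"]
    over = [name for name, psi in scores.items() if psi > threshold]
    return sorted(over, key=lambda name: scores[name], reverse=True)
```

For the example response above, only `credit_score` (0.21) exceeds the 0.15 threshold.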


Error Format

All errors follow RFC 7807 Problem Details:

    {
      "type": "https://mlplatform.internal/errors/drift-threshold-exceeded",
      "title": "Drift Threshold Exceeded",
      "status": 422,
      "detail": "PSI composite score 0.28 exceeds configured threshold 0.15 for model credit-risk-v3",
      "instance": "/api/v1/models/3fa8.../drift"
    }
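Because every error shares this shape, a client can map error bodies onto a single exception type. The following is a minimal sketch; the class name `ApiProblem` and the function `parse_problem` are hypothetical. The `about:blank` default for a missing `type` comes from RFC 7807 itself.

```python
import json


class ApiProblem(Exception):
    """Client-side exception carrying RFC 7807 Problem Details fields."""

    def __init__(self, problem: dict):
        # RFC 7807 says an absent "type" is treated as "about:blank".
        self.type = problem.get("type", "about:blank")
        self.title = problem.get("title", "")
        self.status = problem.get("status")
        self.detail = problem.get("detail", "")
        self.instance = problem.get("instance")
        super().__init__(f"{self.status} {self.title}: {self.detail}")


def parse_problem(body: bytes) -> ApiProblem:
    """Decode an error response body into an ApiProblem (not raised here)."""
    return ApiProblem(json.loads(body))
```

A caller would typically `raise parse_problem(response_body)` after seeing a 4xx or 5xx status, then branch on `type` or `status` in the handler.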


Rate Limits

| Endpoint group | Limit | Window |
|----------------|-------|--------|
| /auth/* | 10 requests | 60 seconds per IP |
| /incidents write operations | 30 requests | 60 seconds per user |
| All other endpoints | 200 requests | 60 seconds per user |

Rate limit headers are returned on every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
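A client can use these headers to pause before it hits a 429. The sketch below assumes that `X-RateLimit-Reset` holds a Unix timestamp in seconds; this document does not specify the format, and some services send a seconds-remaining delta instead, so verify against a real response. The function name `seconds_until_retry` is hypothetical.

```python
from typing import Mapping


def seconds_until_retry(headers: Mapping[str, str], now: float) -> float:
    """Return how long to wait before the next request; 0.0 if quota remains.

    Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    # If the header is missing, assume the request is not being limited.
    remaining = int(h.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(h["x-ratelimit-reset"])
    return max(0.0, reset_at - now)
```

Calling it with `time.time()` as `now` and sleeping for the returned value keeps a batch job inside its window.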


Versioning

The API is versioned via URL path prefix (/api/v1). Breaking changes increment the major version. v1 is currently the only supported version. Deprecation notices will appear in response headers (Deprecation: true, Sunset: <date>) at least 90 days before a version is retired.
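Clients can watch for these headers and warn well before retirement. The sketch below assumes the `Sunset` value is an HTTP-date, as RFC 8594 specifies; this document only shows `<date>`, so treat the format as an assumption. The function name `days_until_sunset` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def days_until_sunset(headers: Mapping[str, str], now: datetime) -> Optional[int]:
    """Return whole days until retirement, or None if no deprecation is signaled.

    Assumption: Sunset is an HTTP-date such as "Sat, 31 Oct 2026 00:00:00 GMT".
    `now` must be timezone-aware.
    """
    h = {k.lower(): v for k, v in headers.items()}
    if h.get("deprecation", "").lower() != "true":
        return None
    sunset = h.get("sunset")
    if sunset is None:
        return None
    retire_at = parsedate_to_datetime(sunset)
    return (retire_at - now).days
```

Passing `datetime.now(timezone.utc)` as `now` gives the live countdown; logging a warning when the result is not None is usually enough to surface the notice.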