Real-time events

The TensorCost backend streams live events to the dashboard, partner integrations, and any authenticated client over Socket.IO. Events cover the full lifecycle of the platform — new metrics arriving, instances changing state, alerts firing, ML training, and the runaway-loop detector pausing an agent.

Why a separate WebSocket endpoint

Our REST gateway runs behind API Gateway, which doesn’t support WebSocket upgrades on the REST product. Socket.IO traffic therefore lives on a dedicated subdomain that targets the Application Load Balancer directly.
| Traffic | Hostname | Path | Port |
| --- | --- | --- | --- |
| REST API | api.tensorcost.com | /v1/* | 443 |
| WebSocket | ws.tensorcost.com | /ws/socket.io/* | 443 |
| gRPC (agents) | grpc.<region>.tensorcost.com | | 50051 (TLS) |
All three share an ACM certificate with multiple SANs.

Connecting

import { io } from 'socket.io-client';

const socket = io('https://ws.tensorcost.com', {
  path: '/ws',
  auth: {
    token: 'YOUR_JWT',          // Cognito ID token or backend-issued HS256
    activeTenantId: 'tenant-...' // optional: for users with multi-tenant access
  },
  transports: ['websocket', 'polling'],
});

socket.on('event', (event) => {
  console.log(event.type, event.data);
});

Authentication

  • Cognito (RS256): verified against the user pool’s JWKS. The token is resolved to a user via cognito_sub; tenant membership joins the socket to the right room.
  • Backend-issued (HS256): verified against the shared secret. The tenantId (or tenant_id) claim is embedded directly in the token.
Clients are only subscribed to their own tenant’s room. Cross-tenant events are unreachable by construction.
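
As a rough illustration of the tenant isolation described above, the sketch below resolves the tenant claim under either spelling and derives a room name from it. The helper names and the `tenant:<id>` room format are hypothetical; the backend's actual room naming is not documented here.

```typescript
// Claims this sketch reads; real tokens carry more fields.
interface TenantClaims {
  tenantId?: string;
  tenant_id?: string;
}

// Hypothetical helper: accept either spelling of the tenant claim.
function tenantIdFromClaims(claims: TenantClaims): string | null {
  const id = claims.tenantId || claims.tenant_id;
  return typeof id === 'string' && id.length > 0 ? id : null;
}

// ASSUMPTION: room naming scheme invented for illustration only.
function tenantRoom(tenantId: string): string {
  return `tenant:${tenantId}`;
}

// A socket can only receive an event published to its own tenant's room.
function canReceive(socketClaims: TenantClaims, eventTenantId: string): boolean {
  const own = tenantIdFromClaims(socketClaims);
  return own !== null && tenantRoom(own) === tenantRoom(eventTenantId);
}
```

A token without a tenant claim resolves to no room at all in this sketch, so it receives nothing rather than falling back to a shared room.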

Event envelope

Every message arrives on the event channel with a uniform shape:
{
  "type": "ml.training.completed",
  "data": {
    "model_type": "gpu_anomaly",
    "duration_ms": 7512,
    "model_id": 42
  },
  "timestamp": "2026-04-29T16:41:38.412Z",
  "trace_id": "7a0db7e045..."
}
| Field | Description |
| --- | --- |
| type | Dotted event name; the prefix is the domain (instance.*, recommendation.*, etc.) |
| data | Event-specific payload |
| timestamp | ISO 8601 publish time |
| trace_id | OTel trace ID for cross-system correlation |
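
Clients written in TypeScript may want a type for the envelope and a runtime check before trusting a message. The following is a minimal sketch based only on the four fields listed above; `isEventEnvelope` and `eventDomain` are hypothetical helper names, and treating every field as required is an assumption.

```typescript
// Envelope shape as documented above; `data` is event-specific, so it stays generic.
interface EventEnvelope<T = unknown> {
  type: string;
  data: T;
  timestamp: string; // ISO 8601 publish time
  trace_id: string;  // OTel trace ID
}

// Hypothetical helper: runtime guard for messages received on the `event` channel.
function isEventEnvelope(value: unknown): value is EventEnvelope {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.type === 'string' &&
    'data' in v &&
    typeof v.timestamp === 'string' &&
    !Number.isNaN(Date.parse(v.timestamp)) &&
    typeof v.trace_id === 'string'
  );
}

// Hypothetical helper: the domain is the part of the dotted name before the first dot.
function eventDomain(type: string): string {
  const dot = type.indexOf('.');
  return dot === -1 ? type : type.slice(0, dot);
}
```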

Event catalog

| Prefix | Example events | Typical use |
| --- | --- | --- |
| instance.* | instance.created, instance.stopped, instance.state_changed | Invalidate the instances list, refresh fleet topology |
| metrics.* | metrics.collected, metrics.processed, metrics.anomaly_detected | Refresh sparklines, flash an anomaly badge |
| cost.* | cost.updated, cost.threshold_exceeded, cost.forecast_generated, cost.budget_breach_predicted | Update cost charts and budget gauges |
| alert.* | alert.created, alert.resolved, alert.escalated, alert.suppressed | Inbox UIs, PagerDuty bridges |
| agent.* | agent.connected, agent.disconnected, agent.error, agent.health_check | Agent fleet health |
| recommendation.* | recommendation.created, recommendation.accepted, recommendation.dismissed | Recommendations feed |
| runaway_loop.* | runaway_loop.detected, runaway_loop.paused, runaway_loop.resolved | Agent cost pager |
| inference.* | inference.spike_detected, inference.cache_miss_surge | Bedrock / Azure OpenAI dashboards |
| action.* | action.queued, action.approved, action.executing, action.completed, action.rolled_back | Enforcement queue |
| notification.* | notification.requested, notification.sent, notification.failed | Trace alert delivery |
| incident.*, security_incident.* | incident.created, security_incident.escalated | Incident response timeline |
| spot.* | spot.interruption_detected, spot.fallback_initiated | AWS spot handling |
| scaling.* | scaling.action_triggered, scaling.completed | Auto-scaling surface |
| ml.* | ml.training.started, ml.training.completed, ml.training.failed | Live retraining progress |
| tenant.*, policy.*, system.* | Housekeeping | Audit log enrichment, admin UIs |

Using events in a React app

The shell pairs Socket.IO with TanStack Query (and legacy RTK Query during the migration). When the backend publishes an event, the client invalidates the right query key so any subscribed component refetches:
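// Maps an event-type prefix to the query-key tags that should be invalidated.
const PREFIX_TAG_MAP: Record<string, string[]> = {
  'instance.':       ['instance'],
  'metrics.':        ['metric'],
  'cost.':           ['cost'],
  'alert.':          ['alert'],
  'recommendation.': ['recommendation'],
  'runaway_loop.':   ['agent', 'recommendation'],
  'ml.':             ['ml-model'],
};

// One possible implementation: tags for the first matching prefix, else undefined.
function tagsForEvent(type: string): string[] | undefined {
  const prefix = Object.keys(PREFIX_TAG_MAP).find((p) => type.startsWith(p));
  return prefix ? PREFIX_TAG_MAP[prefix] : undefined;
}

socket.on('event', (e) => {
  const tags = tagsForEvent(e.type);
  // Invalidate each tag separately: a single queryKey of ['agent', 'recommendation']
  // would only match queries whose key starts with both segments in that order.
  tags?.forEach((tag) => queryClient.invalidateQueries({ queryKey: [tag] }));
});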
For more targeted UI (a progress banner, a toast), components can subscribe via a hook like useRunawayLoopEvents() that filters the stream and exposes only the relevant slice.
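
The internals of useRunawayLoopEvents() are not shown here, so the following is only a framework-free sketch of the filtering idea: subscribe to the `event` channel, forward envelopes whose type starts with a given prefix, and return an unsubscribe function. `subscribeToPrefix` and the `EnvelopeSource` interface are hypothetical names; the interface covers just the `on`/`off` methods of a Socket.IO client socket that the sketch needs.

```typescript
interface Envelope {
  type: string;
  data: unknown;
  timestamp: string;
  trace_id: string;
}

// Minimal slice of the socket surface this sketch relies on.
interface EnvelopeSource {
  on(channel: string, listener: (e: Envelope) => void): unknown;
  off(channel: string, listener: (e: Envelope) => void): unknown;
}

// Forwards only envelopes whose type starts with `prefix`.
// Returns a function that removes the listener again.
function subscribeToPrefix(
  source: EnvelopeSource,
  prefix: string,
  handler: (e: Envelope) => void,
): () => void {
  const listener = (e: Envelope): void => {
    if (e.type.startsWith(prefix)) handler(e);
  };
  source.on('event', listener);
  return () => {
    source.off('event', listener);
  };
}
```

Inside a React hook, the returned function is a natural fit for the cleanup callback of useEffect, so the listener is removed when the component unmounts.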

Reliability guarantees

  • In-process priority. Events dispatch to local subscribers synchronously in the publish path with no Redis dependency. A single-instance deployment works end-to-end with no broker.
  • Redis for durability (when configured). Events are also LPUSH’d onto per-type queues for cross-instance replay. A bounded LRU of recently-dispatched event IDs prevents double delivery.
  • Best-effort on Redis failure. If Redis is unreachable, events still reach all in-process subscribers (and thus the Socket.IO bridge). Only cross-instance replay is lost.
  • Per-type ordering. Events within a single type are ordered. There are no ordering guarantees across types.
  • Per-tenant rate limits. The Socket.IO emit rate is capped per tenant to protect against noisy-neighbor scenarios. The default is 100 events/sec/tenant; the limit can be raised on the enterprise tier.
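
The bounded LRU of recently dispatched event IDs mentioned above can be pictured with a small sketch. This is not the backend's implementation; the class name, the capacity, and the idea that each event carries a string ID are assumptions made for illustration.

```typescript
// Remembers at most `capacity` event IDs, evicting the least recently seen one.
// Relies on Set preserving insertion order.
class RecentEventIds {
  private readonly ids = new Set<string>();
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError('capacity must be >= 1');
    this.capacity = capacity;
  }

  // Returns true the first time an ID is seen (deliver it),
  // false if it is still remembered (skip the duplicate).
  markSeen(id: string): boolean {
    if (this.ids.has(id)) {
      // Re-insert to mark the ID as most recently seen.
      this.ids.delete(id);
      this.ids.add(id);
      return false;
    }
    this.ids.add(id);
    if (this.ids.size > this.capacity) {
      const oldest = this.ids.values().next().value as string;
      this.ids.delete(oldest);
    }
    return true;
  }
}
```

Because the memory is bounded, a duplicate that arrives after its ID has been evicted would be delivered again; the capacity has to cover the replay window for deduplication to hold.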

Webhooks

Every event in the catalog can also be delivered as a webhook. Each delivery is HMAC-signed and carries the header X-TensorCost-Signature: t=<unix>,v1=<sig>. Configure webhooks under Integrations → Webhooks. Webhooks complement Socket.IO for systems that prefer push-once-and-acknowledge delivery (PagerDuty, ServiceNow, internal eventing pipelines).
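
A receiver should verify the signature before acting on a webhook. The sketch below shows one way this might look in Node.js. Only the header format (t=<unix>,v1=<sig>) comes from this page; the use of HMAC-SHA256, hex encoding, a signed string of the form `<t>.<raw body>`, and the 5-minute tolerance are all assumptions that should be checked against the webhook settings before relying on them.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Parses "t=<unix>,v1=<sig>" into its parts; returns null if malformed.
function parseSignatureHeader(header: string): { t: number; v1: string } | null {
  const parts = new Map<string, string>();
  for (const piece of header.split(',')) {
    const eq = piece.indexOf('=');
    if (eq > 0) parts.set(piece.slice(0, eq).trim(), piece.slice(eq + 1).trim());
  }
  const t = Number(parts.get('t'));
  const v1 = parts.get('v1');
  if (!Number.isInteger(t) || !v1) return null;
  return { t, v1 };
}

// Hypothetical verifier. `rawBody` must be the exact bytes received, not re-serialized JSON.
function verifyWebhook(
  header: string,
  rawBody: string,
  secret: string,
  nowSec: number,
  toleranceSec = 300, // ASSUMPTION: replay window, not a documented value
): boolean {
  const parsed = parseSignatureHeader(header);
  if (!parsed) return false;
  if (Math.abs(nowSec - parsed.t) > toleranceSec) return false;
  // ASSUMPTION: signed string is `${t}.${rawBody}`, HMAC-SHA256, hex-encoded.
  const expected = createHmac('sha256', secret)
    .update(`${parsed.t}.${rawBody}`)
    .digest();
  const received = Buffer.from(parsed.v1, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The comparison uses timingSafeEqual rather than string equality so that the check does not leak how many leading characters of the signature matched.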

Debugging connection issues

The dashboard browser console prints [SocketService] connection error when the handshake fails. Common causes:
  • Wrong hostname. If the socket points at api.tensorcost.com, the upgrade is rejected. Verify VITE_WS_URL (or your env-equivalent) targets ws.tensorcost.com.
  • Certificate SAN mismatch. A new WS subdomain must be in the ALB certificate’s SAN list. Reissue the ACM certificate if you added one.
  • Expired JWT. Sockets don’t auto-refresh tokens. After ~1 hour, reconnect with a fresh token. The dashboard handles this automatically on route navigation.
  • CORS origin. The backend’s CORS_ORIGIN must include the frontend origin.
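
For the expired-JWT case, a client outside the dashboard can check the token's `exp` claim before (re)connecting. The sketch below decodes the payload without verifying the signature, which is enough for a client-side freshness check but never for authorization. The helper names and the 60-second safety margin are assumptions.

```typescript
// Decodes the JWT payload segment WITHOUT verifying the signature.
// Sufficient for reading `exp`; non-ASCII claim values may be garbled by atob.
function decodeJwtPayload(token: string): Record<string, unknown> | null {
  const segments = token.split('.');
  if (segments.length !== 3) return null;
  try {
    // base64url -> base64, then restore padding.
    const b64 = segments[1].replace(/-/g, '+').replace(/_/g, '/');
    const padded = b64 + '='.repeat((4 - (b64.length % 4)) % 4);
    const parsed: unknown = JSON.parse(atob(padded));
    return typeof parsed === 'object' && parsed !== null
      ? (parsed as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

// True if the token is expired, about to expire, or unreadable.
function shouldRefreshToken(token: string, nowMs: number, skewSec = 60): boolean {
  const payload = decodeJwtPayload(token);
  const exp = payload?.exp; // seconds since the Unix epoch
  if (typeof exp !== 'number') return true;
  return nowMs / 1000 >= exp - skewSec;
}
```

With socket.io-client, the `auth` option may also be given as a function that receives a callback, which lets the client supply a fresh token on each connection attempt instead of a fixed string.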

MCP

TensorCost ships a built-in MCP server so Claude Desktop, internal LLM agents, and partner integrations can query the platform programmatically.

Tool surface

| Tool | Scope | Returns |
| --- | --- | --- |
| cost.summary | cost:read | Tenant-wide totals, by-source breakdown, by-tag rollup |
| fleet.list_instances | gpu:read | GPU fleet inventory |
| workload.cost | workload:read | Per-application / per-team / per-environment cost |
| inference.recommendations | ai:read | Active managed-inference recommendations |
| agent.cost | agent:read | Per-agent and per-workflow attribution |
| alerts.recent | alert:read | Recent alerts, filterable by severity |
| connections.health | integration:read | Sync status per cloud / inference provider |
| recommendations.accept | recommendation:write | Accept a recommendation |
| policies.simulate | enforcement:read | Dry-run an enforcement policy |
| policies.create | enforcement:write | Create / update a policy |

Scope-guarded by RBAC

Every tool call is scope-checked via @tensorcost/rbac against the calling identity. Read tools are available on all plan tiers; write tools require mcp-write-tools-enabled and an explicit tenant grant.
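
To make the rule concrete, here is a sketch of the decision it describes. It does not use or reproduce the @tensorcost/rbac API; `canCallTool`, the `Caller` shape, and modelling the tenant grant as the presence of the write scope are assumptions. The tool-to-scope pairs are taken from the table above.

```typescript
// Tool -> required scope, copied from the tool surface table.
const TOOL_SCOPES = new Map<string, string>([
  ['cost.summary', 'cost:read'],
  ['fleet.list_instances', 'gpu:read'],
  ['workload.cost', 'workload:read'],
  ['inference.recommendations', 'ai:read'],
  ['agent.cost', 'agent:read'],
  ['alerts.recent', 'alert:read'],
  ['connections.health', 'integration:read'],
  ['recommendations.accept', 'recommendation:write'],
  ['policies.simulate', 'enforcement:read'],
  ['policies.create', 'enforcement:write'],
]);

// Hypothetical caller shape.
interface Caller {
  scopes: string[];           // scopes granted to the calling identity
  writeToolsEnabled: boolean; // stands in for mcp-write-tools-enabled
}

// Unknown tools are denied; write tools need both the scope and the flag.
function canCallTool(tool: string, caller: Caller): boolean {
  const required = TOOL_SCOPES.get(tool);
  if (required === undefined) return false;
  if (!caller.scopes.includes(required)) return false;
  const isWrite = required.endsWith(':write');
  return !isWrite || caller.writeToolsEnabled;
}
```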

Authenticating

MCP clients authenticate the same way as REST API clients: with a Bearer JWT in the connection handshake. Service-to-service callers (e.g. an internal agent) use a backend-issued HS256 token with the embedded tenantId.

Use cases

  • CFO / finance — point Claude Desktop at TensorCost MCP and ask “what drove last week’s Bedrock bill?” The model uses cost.summary + inference.recommendations to compose the answer.
  • Platform engineer — wire your internal agent to MCP to assert budgets and pause runaway agents during incidents.
  • Partner integration — build a Datadog / Grafana panel that consumes cost.summary and connections.health for embedded TensorCost views.
The MCP server is configurable per tenant under Settings → MCP. The same RBAC rules apply across REST, gRPC, and MCP.