Configuration

This page covers the tenant-level settings that an admin or owner manages — what they configure, where, and how the configuration interacts with the rest of TensorCost.

RBAC and members

TensorCost ships three built-in roles. Custom roles are on the roadmap; today the three cover most needs.
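| Role | Read | Write | Connect accounts | Manage RBAC | Billing |
| --- | --- | --- | --- | --- | --- |
| member | All tenant data | Accept / dismiss recommendations | | | |
| admin | All tenant data | All write actions | Yes | | |
| owner | All tenant data | All write actions | Yes | Yes | Yes |
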
Manage members under Settings → Members. SSO (SAML / OIDC) is configured per tenant under Settings → SSO and short-circuits the password flow.

Role gating in the UI and API

Both the shell and the gateway check the same @tensorcost/rbac primitive. A member who calls an admin-only REST route receives a 403, and the same user’s micro-frontends (MFs) hide the navigation entries they cannot use. RBAC is also enforced in the MCP server: every tool call is scope-checked at dispatch.
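The role table can be read as a permission matrix checked at every entry point. The sketch below is a minimal illustration, not the @tensorcost/rbac API: the permission names and the authorize helper are hypothetical, and it assumes the admin role's single "Yes" refers to connecting accounts.

```python
# Hypothetical permission names; the real @tensorcost/rbac primitive may differ.
PERMISSIONS = {
    "member": {"read", "recommendations:decide"},
    "admin": {"read", "recommendations:decide", "write", "accounts:connect"},
    "owner": {
        "read", "recommendations:decide", "write",
        "accounts:connect", "rbac:manage", "billing:manage",
    },
}


def authorize(role: str, permission: str) -> int:
    """Return an HTTP-style status: 200 if the role holds the permission, else 403."""
    return 200 if permission in PERMISSIONS.get(role, set()) else 403


def visible_nav_entries(role: str, entries: dict[str, str]) -> list[str]:
    """Keep only the navigation entries whose required permission the role holds."""
    return [name for name, needed in entries.items() if authorize(role, needed) == 200]
```

Using one function for both the route check and the navigation filter mirrors the idea that the UI and the API share a single source of truth.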

Tag mapping

TensorCost attribution depends on mapping your existing AWS / Azure / GCP cost-allocation tags to four canonical dimensions:
| Canonical dimension | Typical source tag |
| --- | --- |
| application | app, service, Application |
| team | team, Team, Owner |
| environment | env, environment, Environment |
| owner | owner, cost_center, business_unit |

Configure under Settings → Tag mapping, using the drag-and-drop UI. The mapping applies to GPU agent metrics, Bedrock CUR rows, Azure OpenAI billing exports, Vertex billing, OpenAI / Anthropic API metadata, and any future managed-inference adapters. If a tag is missing on a row, the row rolls up under untagged. The savings methodology PDF (linked from the dashboard) explains how untagged spend is allocated when you set a “default owner” rule.
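The lookup can be sketched as follows. This is an illustration only: it assumes the first matching source tag wins and that tag keys are case-sensitive, and map_tags and CANONICAL_SOURCES are hypothetical names rather than TensorCost APIs.

```python
# Source-tag candidates per canonical dimension, taken from the table in this section.
CANONICAL_SOURCES = {
    "application": ["app", "service", "Application"],
    "team": ["team", "Team", "Owner"],
    "environment": ["env", "environment", "Environment"],
    "owner": ["owner", "cost_center", "business_unit"],
}
UNTAGGED = "untagged"


def map_tags(resource_tags: dict[str, str]) -> dict[str, str]:
    """Map raw cloud tags to the four canonical dimensions.

    Assumption: candidates are tried in listed order and the first non-empty
    value wins; a dimension with no match falls back to "untagged".
    """
    mapped = {}
    for dimension, source_keys in CANONICAL_SOURCES.items():
        mapped[dimension] = next(
            (resource_tags[key] for key in source_keys if resource_tags.get(key)),
            UNTAGGED,
        )
    return mapped
```

For example, a row tagged only with app and Team resolves application and team and leaves environment and owner as untagged.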

Budgets and burn-rate alerts

Budgets are hierarchical:
Tenant
└── Application
    └── Environment
Each level has independent monthly, quarterly, and annual targets. Burn-rate alerts fire at 50% / 80% / 100% of the period budget, projected against the day-of-month. Alerts route through the notification channels below. Burn-rate alerting on managed inference is still rolling out; track it via the burn-rate-alerts-enabled flag.
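One way to read "projected against the day-of-month" is a linear projection of month-to-date spend to month end. The sketch below assumes that interpretation; the function names are hypothetical and the actual projection TensorCost uses may differ.

```python
import calendar
from datetime import date

THRESHOLDS = (0.5, 0.8, 1.0)  # the 50% / 80% / 100% alert levels


def projected_month_spend(spend_to_date: float, today: date) -> float:
    """Linearly extrapolate month-to-date spend to the end of the month (assumed model)."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def crossed_thresholds(spend_to_date: float, monthly_budget: float, today: date) -> list[float]:
    """Return every threshold the projected spend has reached."""
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    utilization = projected_month_spend(spend_to_date, today) / monthly_budget
    return [level for level in THRESHOLDS if utilization >= level]
```

With 3,000 spent by April 10 against a 10,000 budget, the projection is 9,000, so the 50% and 80% levels are crossed but 100% is not.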

Notification channels

Alerts and policy events deliver through pluggable channels. Configure under Settings → Notification channels.
| Type | Config | Notes |
| --- | --- | --- |
| Slack | Incoming webhook URL | Severity-colored blocks; threaded acknowledgments |
| Microsoft Teams | Incoming webhook URL | MessageCard format with theme colors |
| PagerDuty | Events API v2 routing key | Severity mapping: critical→P1, high→P2, medium→P3 |
| Email | Recipient list | HTML; supports digest mode |
| Custom webhook | URL + HTTP method + headers | JSON body; HMAC-signed via X-TensorCost-Signature |

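A receiver for the custom webhook should verify X-TensorCost-Signature before trusting the body. The sketch below assumes HMAC-SHA256 over the raw request body with a hex-encoded digest; this page does not state the algorithm or encoding, so confirm both before relying on it. The function names are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the signature a sender would attach (assumed: HMAC-SHA256, hex)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check the X-TensorCost-Signature header value against the raw body.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of the signature matched.
    """
    expected = sign(secret, body).encode("utf-8")
    return hmac.compare_digest(expected, header_value.encode("utf-8"))
```

Verify against the raw bytes as received, before any JSON parsing, since re-serializing the payload can change whitespace and key order.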

Per-channel filtering

| Filter | Effect |
| --- | --- |
| Alert types | Allow-list of cost_threshold, idle_gpu, runaway_loop, security_incident, etc. Empty = all. |
| Minimum priority | Drops anything below the selected level: low, medium, high, or critical. |
| Digest mode | instant or batched. Batched flushes every digest_interval_minutes (default 30). |

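The first two filters combine as a simple predicate. This is a sketch with hypothetical names (ChannelFilter, should_deliver), assuming priorities are ordered low < medium < high < critical.

```python
from dataclasses import dataclass, field

PRIORITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class ChannelFilter:
    alert_types: list[str] = field(default_factory=list)  # empty list means "all types"
    minimum_priority: str = "low"


def should_deliver(channel_filter: ChannelFilter, alert_type: str, priority: str) -> bool:
    """Apply the allow-list and the minimum-priority cut-off to one alert."""
    type_ok = not channel_filter.alert_types or alert_type in channel_filter.alert_types
    priority_ok = (
        PRIORITY_ORDER.index(priority)
        >= PRIORITY_ORDER.index(channel_filter.minimum_priority)
    )
    return type_ok and priority_ok
```

Digest mode is left out here because it affects when a notification is sent, not whether it passes the filter.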

Test button

Every channel has a Test button that delivers a sample notification through the same code path as a real alert, including the signature header for webhooks.

Alert rules

Define monitoring thresholds under Alerts → Rules. Field reference:
| Field | Description |
| --- | --- |
| metric | gpu_utilization, cpu_utilization, memory_utilization, temperature, daily_cost, hourly_cost, inference_cost_per_request, cache_hit_rate, agent_call_count, error_rate |
| operator | gt, lt, gte, lte, eq, not_eq |
| threshold | Numeric compare value |
| duration_minutes | Sustain duration before firing (0 = immediate) |
| severity | low, medium, high, critical |
| scope | all, tagged (with scope_filter), or specific_instance |
| notification_channel_ids | Where to deliver |
| cooldown_minutes | Prevents alert storms |

{
  "rule_name": "Bedrock cache-hit rate dropped",
  "metric": "cache_hit_rate",
  "operator": "lt",
  "threshold": 60,
  "duration_minutes": 30,
  "severity": "medium",
  "scope": "tagged",
  "scope_filter": { "tag_key": "application", "tag_value": "document-analyzer" },
  "notification_channel_ids": [3],
  "cooldown_minutes": 60
}
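To make the operator and duration_minutes semantics concrete, here is a minimal evaluation sketch. It assumes one metric sample per minute and that a rule fires only when every sample in the sustain window satisfies the comparison; rule_fires is a hypothetical helper, not part of the TensorCost API, and cooldown handling is omitted.

```python
import operator

# Map the documented operator names onto Python comparisons.
OPERATORS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    "eq": operator.eq,
    "not_eq": operator.ne,
}


def rule_fires(rule: dict, samples: list[float]) -> bool:
    """Evaluate a rule against recent samples (assumed: one per minute, oldest first)."""
    compare = OPERATORS[rule["operator"]]
    # duration_minutes == 0 means "immediate": only the latest sample is checked.
    window = max(rule["duration_minutes"], 1)
    if len(samples) < window:
        return False  # not enough history to call the breach sustained
    return all(compare(value, rule["threshold"]) for value in samples[-window:])
```

A single sample back above the threshold inside the window resets the sustain check, which is what keeps short dips from paging anyone.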

Enforcement policies

Automated remediation rules, with three execution modes. Start in notify_only and graduate to the other modes once a policy is tested.
| Mode | Behavior |
| --- | --- |
| notify_only | Fires the alert, takes no action. |
| approval_required | Queues actions for admin approval. |
| auto | Executes immediately. Use only for tested policies. |


Templates

Pre-built policies you can clone:
| Template | What it does |
| --- | --- |
| Idle GPU auto-stop | Stops instances idle ≥15 minutes |
| Weekend cost saver | Scales non-prod 75% on Sat / Sun |
| Dev/test auto-shutdown | Stops dev instances at 7pm local |
| Training-job cost guard | Aborts training runs that exceed a cost threshold |
| Inference right-size | Suggests downsizing for underutilized inference endpoints |
| Spot fallback | Switches to on-demand on spot interruption |
| Runaway-loop circuit-breaker | Pauses an agent that crosses an invocation/cost spike threshold |


Composite conditions

AND
├── gpu_utilization < 10 (for 30 min)
└── OR
    ├── daily_cost > 500
    └── monthly_budget_utilization > 80
Schedule constraints (active_hours, active_days, timezone) and maintenance-window suppression also apply to composite conditions.
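The AND/OR tree can be evaluated recursively. The sketch below uses a hypothetical nested-dict representation ("and" / "or" nodes with comparison leaves); the stored policy schema is not shown on this page. It also ignores the "for 30 min" sustain requirement and treats each metric value as already sustained.

```python
import operator

COMPARATORS = {"<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge}

# The composite condition from this section, in a hypothetical dict form.
CONDITION = {
    "and": [
        {"metric": "gpu_utilization", "op": "<", "value": 10},
        {
            "or": [
                {"metric": "daily_cost", "op": ">", "value": 500},
                {"metric": "monthly_budget_utilization", "op": ">", "value": 80},
            ]
        },
    ]
}


def evaluate(node: dict, metrics: dict[str, float]) -> bool:
    """Recursively evaluate an and/or tree of metric comparisons."""
    if "and" in node:
        return all(evaluate(child, metrics) for child in node["and"])
    if "or" in node:
        return any(evaluate(child, metrics) for child in node["or"])
    return COMPARATORS[node["op"]](metrics[node["metric"]], node["value"])
```

An idle GPU alone does not trigger the policy; it also needs either high daily cost or high budget utilization.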

Maintenance windows

Schedule periods where alerts and enforcement are suppressed. Useful for deploys, reboots, and known-noisy events.
{
  "name": "Saturday deploy",
  "start_time": "2026-04-04T02:00:00Z",
  "end_time": "2026-04-04T04:00:00Z",
  "suppress_alerts": true,
  "suppress_enforcement": true,
  "scope": { "tags": { "environment": "production" } }
}
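A suppression check against such a window might look like the sketch below. It assumes the end time is exclusive and that every tag in scope.tags must match the resource; is_suppressed and parse_utc are hypothetical helpers.

```python
from datetime import datetime, timezone

WINDOW = {
    "name": "Saturday deploy",
    "start_time": "2026-04-04T02:00:00Z",
    "end_time": "2026-04-04T04:00:00Z",
    "suppress_alerts": True,
    "suppress_enforcement": True,
    "scope": {"tags": {"environment": "production"}},
}


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def is_suppressed(window: dict, at: datetime, resource_tags: dict[str, str],
                  kind: str = "alerts") -> bool:
    """Return True if `kind` ("alerts" or "enforcement") is suppressed for this resource."""
    in_window = parse_utc(window["start_time"]) <= at < parse_utc(window["end_time"])
    scope_tags = window.get("scope", {}).get("tags", {})
    in_scope = all(resource_tags.get(key) == value for key, value in scope_tags.items())
    return in_window and in_scope and bool(window.get(f"suppress_{kind}", False))
```

Passing timezone-aware datetimes keeps the comparison valid; mixing naive and aware values raises a TypeError.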

Branding and custom domain

Settings → Customization → Branding.
| Setting | Detail |
| --- | --- |
| Logo | 200×50 PNG/SVG, sidebar-rendered |
| Favicon | 16×16 / 32×32 |
| Primary / secondary colors | Override the MUI theme |
| Custom CSS | Advanced; applies after theme |
| Theme | Light / dark / auto |
| App name | Browser tab title |
| Disclaimer / footer | Legal disclaimer text |

Custom domain: set under Customization → Domain. Provide your CNAME target, point DNS at it, and wait for propagation; ACM provisions the SSL certificate automatically.

Data retention

| Setting | Default | Range |
| --- | --- | --- |
| Raw metric retention | 90 days | 30–365 days |
| Recommendation history | 365 days | 90–730 days |
| Audit trail | 7 years | Fixed (compliance) |

Data older than the configured window is archived and deleted nightly. Tenant offboarding follows a 30-day soft-delete window before hard delete; audit-trail rows are preserved per the SOC 2 readiness guide.

Cloud-account configuration

AWS

We use STS AssumeRole with an external ID. The CloudFormation (CFN) onboarding stack creates the role; you paste the ARN back into the wizard. Temporary credentials last 15 minutes and are never persisted. In Organization mode, the same role assumes OrganizationAccountAccessRole (or AWSControlTowerExecution if that’s what your Landing Zone provisioned) into each member account on demand. The setup is SCP-aware: the role-path prefix is configurable for OUs that restrict IAM create operations.
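For reference, an AssumeRole call with an external ID and a 15-minute session looks roughly like this with boto3. It is a sketch of the general pattern, not TensorCost's internal code; the session name is a hypothetical placeholder, and boto3 is a third-party dependency imported only inside the function that needs it.

```python
def assume_role_params(role_arn: str, external_id: str) -> dict:
    """Build the arguments for sts.assume_role."""
    return {
        "RoleArn": role_arn,
        "RoleSessionName": "tensorcost-scan",  # hypothetical session name
        "ExternalId": external_id,
        "DurationSeconds": 900,  # 15 minutes, the shortest session STS allows
    }


def assume_role(role_arn: str, external_id: str) -> dict:
    """Exchange the role ARN and external ID for short-lived credentials."""
    import boto3  # third-party; requires AWS credentials in the environment

    sts = boto3.client("sts")
    response = sts.assume_role(**assume_role_params(role_arn, external_id))
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

The external ID guards against the confused-deputy problem: the role's trust policy only accepts calls that present the ID issued during onboarding.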

Azure

Service principal with Reader + Cost Management Reader. Use DefaultAzureCredential for local testing; in production, prefer Managed Identity on the agent host.

GCP

Service account with Compute Viewer + BigQuery Data Viewer (for billing export). Application Default Credentials are honored.

Managed-inference providers

| Provider | Auth | Notes |
| --- | --- | --- |
| Amazon Bedrock | IAM role + external ID (same as AWS account) | See bedrock integration |
| Azure OpenAI | App registration + cost API permission | |
| Vertex AI | GCP service account with Vertex + billing export read | |
| OpenAI API | Org-scoped API key with read-only billing scope | |
| Anthropic API | Org-scoped API key | |

All provider credentials are stored encrypted with a per-tenant KMS-derived key in integration.connection_secret (RLS-enforced). Rotation is supported via POST /v1/integration/connections/:id/rotate-secret.
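A rotation call can be scripted with the standard library. The sketch below only builds the request; the base URL, the bearer-token authentication, and the empty request body are assumptions, since this page documents only the route.

```python
import urllib.request


def build_rotate_request(base_url: str, connection_id: int, token: str) -> urllib.request.Request:
    """Build a POST to /v1/integration/connections/:id/rotate-secret.

    Assumptions: bearer-token auth and no request body; check the API
    reference for the actual authentication scheme and payload.
    """
    url = f"{base_url.rstrip('/')}/v1/integration/connections/{connection_id}/rotate-secret"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Pass the result to urllib.request.urlopen to send it; keeping construction separate makes the request easy to inspect or log before it goes out.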

Feature flags surface

Tenant-visible feature flags appear under Settings → Feature flags for admins. The full pattern — LaunchDarkly + useFeature() + the quarterly stale-flag cleanup ritual — is documented in feature flags.

Audit trail

Every config change writes to the cross-tenant audit ledger:
  • Who (user ID + email)
  • What (resource + before/after diff)
  • When (UTC timestamp)
  • Where (IP, user agent)
  • Why (free-text reason for high-severity changes)
Audit rows are immutable and exportable via GET /v1/identity/audit?format=csv.