Configuration
This page covers the tenant-level settings that an admin or owner manages — what they configure, where, and how the configuration interacts with the rest of TensorCost.
RBAC and members
TensorCost ships three built-in roles. Custom roles are on the roadmap; today the three cover most needs.
| Role | Read | Write | Connect accounts | Manage RBAC | Billing |
|---|---|---|---|---|---|
| member | All tenant data | Accept / dismiss recommendations | No | No | No |
| admin | All tenant data | All write actions | Yes | No | No |
| owner | All tenant data | All write actions | Yes | Yes | Yes |
Manage members under Settings → Members. SSO (SAML / OIDC) is configured per tenant under Settings → SSO and short-circuits the password flow.
Role gating in the UI and API
Both the shell and the gateway check the same @tensorcost/rbac primitive. A member who hits an admin-only REST route gets a 403, and the same user's micro-frontends (MFs) hide the navigation entries they cannot use. RBAC is also enforced in the MCP server: every tool call is scope-checked at dispatch.
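The role table above can be read as a capability matrix. The following is a minimal Python sketch of that idea, not the @tensorcost/rbac API (which this page does not show); the capability names and the `is_allowed` / `status_for` helpers are hypothetical.

```python
from enum import Enum


class Capability(Enum):
    # Hypothetical capability names, one per column of the role table.
    READ = "read"
    RECOMMENDATION_ACTIONS = "recommendation_actions"  # accept / dismiss
    WRITE_ALL = "write_all"
    CONNECT_ACCOUNTS = "connect_accounts"
    MANAGE_RBAC = "manage_rbac"
    BILLING = "billing"


# Assumed encoding of the three built-in roles.
ROLE_CAPABILITIES = {
    "member": {Capability.READ, Capability.RECOMMENDATION_ACTIONS},
    "admin": {
        Capability.READ,
        Capability.RECOMMENDATION_ACTIONS,
        Capability.WRITE_ALL,
        Capability.CONNECT_ACCOUNTS,
    },
    "owner": set(Capability),
}


def is_allowed(role: str, capability: Capability) -> bool:
    """Return True if the role grants the capability; unknown roles get nothing."""
    return capability in ROLE_CAPABILITIES.get(role, set())


def status_for(role: str, capability: Capability) -> int:
    """Map the check to an HTTP status, mirroring the 403 a gated route returns."""
    return 200 if is_allowed(role, capability) else 403
```

Keeping one table and one check function is what lets a UI and an API agree on the answer, which is the property the shared primitive is described as providing.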
Tag mapping
TensorCost attribution depends on mapping your existing AWS / Azure / GCP cost-allocation tags to four canonical dimensions:
| Canonical dimension | Typical source tag |
|---|---|
| application | app, service, Application |
| team | team, Team, Owner |
| environment | env, environment, Environment |
| owner | owner, cost_center, business_unit |
Configure under Settings → Tag mapping using the drag-and-drop UI. The mapping applies to GPU agent metrics, Bedrock CUR rows, Azure OpenAI billing exports, Vertex billing, OpenAI / Anthropic API metadata, and any future managed-inference adapters.
If a tag is missing on a row, the row rolls up under untagged. The savings methodology PDF (linked from the dashboard) explains how untagged is allocated when you set a “default owner” rule.
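To make the mapping and the untagged fallback concrete, here is a small Python sketch. It assumes source tag keys are matched case-sensitively and that the first listed key with a value wins; neither rule is stated on this page, and `canonicalize` is a hypothetical helper.

```python
# Source tag keys per canonical dimension, taken from the table above.
CANONICAL_SOURCES = {
    "application": ["app", "service", "Application"],
    "team": ["team", "Team", "Owner"],
    "environment": ["env", "environment", "Environment"],
    "owner": ["owner", "cost_center", "business_unit"],
}

UNTAGGED = "untagged"


def canonicalize(tags: dict[str, str], mapping=CANONICAL_SOURCES) -> dict[str, str]:
    """Resolve raw cloud tags to the four canonical dimensions.

    Assumption: keys are case-sensitive and the first source key that
    carries a non-empty value wins.
    """
    result = {}
    for dimension, source_keys in mapping.items():
        value = UNTAGGED
        for key in source_keys:
            if tags.get(key):
                value = tags[key]
                break
        result[dimension] = value
    return result
```

For example, a row tagged only with `app` and `env` resolves its application and environment and rolls up under untagged for team and owner.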
Budgets and burn-rate alerts
Budgets are hierarchical:
Tenant
└── Application
    └── Environment
Each level has independent monthly, quarterly, and annual targets. Burn-rate alerts fire at 50% / 80% / 100% of the period budget, projected against the day-of-month. Alerts route through the notification channels below.
Burn-rate alerting on managed inference is still rolling out; track its status via the burn-rate-alerts-enabled flag.
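One possible reading of "projected against the day-of-month" is a linear extrapolation of month-to-date spend. The sketch below works under that assumption only; the actual projection model is not documented here, and the function names are hypothetical.

```python
import calendar
from datetime import date

# Alert thresholds as fractions of the period budget.
THRESHOLDS = (0.5, 0.8, 1.0)


def projected_month_spend(spend_to_date: float, today: date) -> float:
    """Linearly extrapolate month-to-date spend to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return spend_to_date / today.day * days_in_month


def crossed_thresholds(spend_to_date: float, budget: float, today: date) -> list[float]:
    """Return every threshold the projected spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = projected_month_spend(spend_to_date, today) / budget
    return [t for t in THRESHOLDS if ratio >= t]
```

With a monthly budget of 1000 and 300 spent by April 10, the linear projection is 900, so the 50% and 80% thresholds are crossed but the 100% one is not.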
Notification channels
Alerts and policy events deliver through pluggable channels. Configure under Settings → Notification channels.
| Type | Config | Notes |
|---|---|---|
| Slack | Incoming webhook URL | Severity-colored blocks; threaded acknowledgments |
| Microsoft Teams | Incoming webhook URL | MessageCard format with theme colors |
| PagerDuty | Events API v2 routing key | Severity mapping: critical→P1, high→P2, medium→P3 |
| Email | Recipient list | HTML; supports digest mode |
| Custom webhook | URL + HTTP method + headers | JSON body; HMAC-signed via X-TensorCost-Signature |
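A receiver of the custom webhook will normally want to verify the X-TensorCost-Signature header before trusting the body. This page does not specify the algorithm or encoding, so the sketch below assumes HMAC-SHA256 over the raw request body with a hex-encoded digest; check the actual signing scheme before relying on it.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the assumed signature: hex HMAC-SHA256 of the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Compare the received header against the expected signature.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters matched.
    """
    expected = sign(secret, body)
    return hmac.compare_digest(expected, header_value)
```

Verify against the raw bytes as received, before any JSON parsing, since re-serializing the payload can change whitespace and key order and so change the digest.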
Per-channel filtering
| Filter | Effect |
|---|---|
| Alert types | Allow-list of cost_threshold, idle_gpu, runaway_loop, security_incident, etc. Empty = all. |
| Minimum priority | Drops anything below low / medium / high / critical. |
| Digest mode | instant or batched. Batched flushes every digest_interval_minutes (default 30). |
Every channel has a Test action that delivers a sample notification through the same code path as a real alert, including the signature header for webhooks.
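The allow-list and minimum-priority filters above combine as a simple predicate. A minimal sketch, assuming alerts are dicts with hypothetical `type` and `priority` keys:

```python
# Priorities from lowest to highest, as listed in the filter table.
PRIORITY_ORDER = ["low", "medium", "high", "critical"]


def channel_accepts(alert: dict, allowed_types: list[str], minimum_priority: str) -> bool:
    """Return True if the channel should deliver this alert.

    An empty allow-list means every alert type is accepted.
    """
    if allowed_types and alert["type"] not in allowed_types:
        return False
    rank = PRIORITY_ORDER.index(alert["priority"])
    return rank >= PRIORITY_ORDER.index(minimum_priority)
```

For instance, a channel with an empty allow-list and a minimum priority of high accepts a critical runaway_loop alert and drops a medium cost_threshold alert.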
Alert rules
Define monitoring thresholds under Alerts → Rules. Field reference:
| Field | Description |
|---|---|
| metric | gpu_utilization, cpu_utilization, memory_utilization, temperature, daily_cost, hourly_cost, inference_cost_per_request, cache_hit_rate, agent_call_count, error_rate |
| operator | gt, lt, gte, lte, eq, not_eq |
| threshold | Numeric compare value |
| duration_minutes | Sustain duration before firing (0 = immediate) |
| severity | low, medium, high, critical |
| scope | all, tagged (with scope_filter), or specific_instance |
| notification_channel_ids | Where to deliver |
| cooldown_minutes | Prevents alert storms |
{
  "rule_name": "Bedrock cache-hit rate dropped",
  "metric": "cache_hit_rate",
  "operator": "lt",
  "threshold": 60,
  "duration_minutes": 30,
  "severity": "medium",
  "scope": "tagged",
  "scope_filter": { "tag_key": "application", "tag_value": "document-analyzer" },
  "notification_channel_ids": [3],
  "cooldown_minutes": 60
}
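To illustrate how operator, threshold, and duration_minutes interact, here is a hedged Python sketch. It assumes one metric sample per minute, oldest first; the real sampling cadence and evaluation engine are not described on this page, and `breaches` / `should_fire` are hypothetical names.

```python
import operator

# Rule operators from the field reference, mapped to Python comparisons.
OPERATORS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    "eq": operator.eq,
    "not_eq": operator.ne,
}


def breaches(rule: dict, value: float) -> bool:
    """Return True if a single sample violates the rule's threshold."""
    return OPERATORS[rule["operator"]](value, rule["threshold"])


def should_fire(rule: dict, samples: list[float]) -> bool:
    """Decide whether the rule fires for a series of per-minute samples.

    With duration_minutes == 0 the latest sample alone decides; otherwise
    every sample in the trailing window must breach.
    """
    window = rule["duration_minutes"]
    if window == 0:
        return bool(samples) and breaches(rule, samples[-1])
    if len(samples) < window:
        return False
    return all(breaches(rule, v) for v in samples[-window:])
```

The sketch leaves out cooldown_minutes and scope matching, which would sit around this check rather than inside it.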
Enforcement policies
Enforcement policies are automated remediation rules with three execution modes. Start in notify_only and move to the stricter modes once a policy has proven itself.
| Mode | Behavior |
|---|---|
| notify_only | Fires the alert, takes no action. |
| approval_required | Queues actions for admin approval. |
| auto | Executes immediately. Use only for tested policies. |
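The three modes amount to a dispatch on what happens to a matched action. A minimal sketch, with a hypothetical in-memory `PolicyEngine` standing in for whatever queue and executor the product actually uses:

```python
from dataclasses import dataclass, field


@dataclass
class PolicyEngine:
    """Toy dispatcher; the lists stand in for a real approval queue and executor."""

    pending_approval: list = field(default_factory=list)
    executed: list = field(default_factory=list)

    def handle(self, mode: str, action: str) -> str:
        if mode == "notify_only":
            # The alert fires elsewhere; nothing is done to the resource.
            return "notified"
        if mode == "approval_required":
            self.pending_approval.append(action)
            return "queued"
        if mode == "auto":
            self.executed.append(action)
            return "executed"
        raise ValueError(f"unknown mode: {mode}")
```

Because the action is identical in all three branches, promoting a policy from one mode to the next changes only who decides, not what is done.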
Templates
Pre-built policies you can clone:
| Template | What it does |
|---|---|
| Idle GPU auto-stop | Stops instances idle ≥15 minutes |
| Weekend cost saver | Scales non-prod 75% on Sat / Sun |
| Dev/test auto-shutdown | Stops dev instances at 7pm local |
| Training-job cost guard | Aborts training runs that exceed a cost threshold |
| Inference right-size | Suggests downsizing for underutilized inference endpoints |
| Spot fallback | Switches to on-demand on spot interruption |
| Runaway-loop circuit-breaker | Pauses an agent that crosses an invocation/cost spike threshold |
Composite conditions
AND
├── gpu_utilization < 10 (for 30 min)
└── OR
    ├── daily_cost > 500
    └── monthly_budget_utilization > 80
Schedule constraints (active_hours, active_days, timezone) and maintenance-window suppression apply.
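The AND / OR tree above can be evaluated recursively. The sketch below uses a hypothetical dict encoding (`all` / `any` nodes with leaf comparisons), since this page shows the tree only as a diagram; it also ignores the "for 30 min" sustain clause and the schedule constraints.

```python
import operator

OPS = {"<": operator.lt, ">": operator.gt}

# Hypothetical encoding of the tree shown above.
CONDITION = {
    "all": [
        {"metric": "gpu_utilization", "op": "<", "value": 10},
        {
            "any": [
                {"metric": "daily_cost", "op": ">", "value": 500},
                {"metric": "monthly_budget_utilization", "op": ">", "value": 80},
            ]
        },
    ]
}


def evaluate(node: dict, metrics: dict) -> bool:
    """Recursively evaluate an AND ("all") / OR ("any") condition tree."""
    if "all" in node:
        return all(evaluate(child, metrics) for child in node["all"])
    if "any" in node:
        return any(evaluate(child, metrics) for child in node["any"])
    # Leaf: compare the current metric value against the threshold.
    return OPS[node["op"]](metrics[node["metric"]], node["value"])
```

An idle GPU alone does not satisfy this condition; one of the two cost branches must also be true.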
Maintenance windows
Schedule periods where alerts and enforcement are suppressed. Useful for deploys, reboots, and known-noisy events.
{
  "name": "Saturday deploy",
  "start_time": "2026-04-04T02:00:00Z",
  "end_time": "2026-04-04T04:00:00Z",
  "suppress_alerts": true,
  "suppress_enforcement": true,
  "scope": { "tags": { "environment": "production" } }
}
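A suppression check against a window like the one above might look like the following sketch. It assumes a half-open interval (start inclusive, end exclusive) and that every scope tag must match the resource exactly; both are assumptions, and `is_suppressed` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def is_suppressed(window: dict, resource_tags: dict, now: datetime, kind: str = "alerts") -> bool:
    """Return True if the window suppresses `kind` ("alerts" or "enforcement")."""
    if now.tzinfo is None:
        # Treat naive datetimes as UTC so comparisons do not raise.
        now = now.replace(tzinfo=timezone.utc)
    start = parse_utc(window["start_time"])
    end = parse_utc(window["end_time"])
    if not (start <= now < end):
        return False
    scope_tags = window.get("scope", {}).get("tags", {})
    if any(resource_tags.get(key) != value for key, value in scope_tags.items()):
        return False
    return bool(window.get(f"suppress_{kind}", False))
```

With the example window, a production resource is suppressed at 03:00 UTC on 2026-04-04, while a staging resource at the same time is not.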
Branding and custom domain
Configure under Settings → Customization → Branding.
| Setting | Detail |
|---|---|
| Logo | 200×50 PNG/SVG, sidebar-rendered |
| Favicon | 16×16 / 32×32 |
| Primary / secondary colors | Override the MUI theme |
| Custom CSS | Advanced — applies after theme |
| Theme | Light / dark / auto |
| App name | Browser tab title |
| Disclaimer / footer | Legal disclaimer text |
Custom domain: set under Customization → Domain. Provide your CNAME target, point DNS at it, and wait for propagation; ACM then provisions the SSL/TLS certificate automatically.
Data retention
| Setting | Default | Range |
|---|---|---|
| Raw metric retention | 90 days | 30–365 days |
| Recommendation history | 365 days | 90–730 days |
| Audit trail | 7 years | Fixed (compliance) |
Data older than the configured window is archived and deleted nightly. Tenant offboarding follows a 30-day soft-delete window before hard delete; audit-trail rows are preserved per the SOC 2 readiness guide.
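The ranges in the retention table translate into a simple bounds check. A minimal sketch; the setting keys are hypothetical, since this page lists only display names.

```python
# Allowed ranges in days, from the table above. Keys are hypothetical.
RETENTION_RANGES_DAYS = {
    "raw_metric_retention": (30, 365),
    "recommendation_history": (90, 730),
}


def validate_retention(setting: str, days: int) -> int:
    """Return `days` if it is inside the allowed range, else raise ValueError."""
    if setting == "audit_trail":
        # The audit trail is fixed at 7 years and cannot be changed.
        raise ValueError("audit trail retention is fixed")
    low, high = RETENTION_RANGES_DAYS[setting]
    if not low <= days <= high:
        raise ValueError(f"{setting} must be between {low} and {high} days, got {days}")
    return days
```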
Cloud-account configuration
AWS
We use STS AssumeRole with an external ID. The CloudFormation (CFN) onboarding stack creates the role; you paste the role ARN back into the wizard. Temporary credentials last 15 minutes and are never persisted.
In Organization mode, the same role assumes OrganizationAccountAccessRole (or AWSControlTowerExecution if that’s what your Landing Zone provisioned) into each member account on demand. The setup is SCP-aware: you can configure a role-path prefix for OUs that restrict IAM resource creation.
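For reference, the AssumeRole call described above looks roughly like this from Python. The sketch takes any client exposing boto3's `assume_role` method (for example `boto3.client("sts")`), so it stays testable without AWS access; the session name is a hypothetical placeholder, and this is not TensorCost's own code.

```python
def assume_onboarding_role(sts_client, role_arn: str, external_id: str) -> dict:
    """Request short-lived credentials for the onboarding role.

    `sts_client` is expected to behave like boto3.client("sts").
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="tensorcost-session",  # hypothetical session name
        ExternalId=external_id,
        DurationSeconds=900,  # 15 minutes, the shortest duration STS allows
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]
```

The external ID guards against the confused-deputy problem: the role's trust policy can require it, so a third party who learns the role ARN still cannot assume the role.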
Azure
Service principal with Reader + Cost Management Reader. Use DefaultAzureCredential for local testing; in production, prefer Managed Identity on the agent host.
GCP
Service account with Compute Viewer + BigQuery Data Viewer (for billing export). Application Default Credentials are honored.
Managed-inference providers
| Provider | Auth | Notes |
|---|---|---|
| Amazon Bedrock | IAM role + external ID (same as AWS account) | See bedrock integration |
| Azure OpenAI | App registration + cost API permission | |
| Vertex AI | GCP service account with Vertex + billing export read | |
| OpenAI API | Org-scoped API key with read-only billing scope | |
| Anthropic API | Org-scoped API key | |
All provider credentials are stored encrypted with a per-tenant KMS-derived key in integration.connection_secret (RLS-enforced). Rotation is supported via POST /v1/integration/connections/:id/rotate-secret.
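A rotation call against the endpoint above could be assembled as follows. Only the path and method come from this page; the base URL, the bearer-token header, and the absence of a request body are assumptions.

```python
import urllib.request


def build_rotate_request(base_url: str, connection_id: int, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that rotates a connection secret."""
    url = f"{base_url.rstrip('/')}/v1/integration/connections/{connection_id}/rotate-secret"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; building and sending are kept separate here so the request can be inspected first.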
Feature flags surface
Tenant-visible feature flags appear under Settings → Feature flags for admins. The full pattern — LaunchDarkly + useFeature() + the quarterly stale-flag cleanup ritual — is documented in feature flags.
Audit trail
Every config change writes to the cross-tenant audit ledger:
- Who (user ID + email)
- What (resource + before/after diff)
- When (UTC timestamp)
- Where (IP, user agent)
- Why (free-text reason for high-severity changes)
Audit rows are immutable and exportable via GET /v1/identity/audit?format=csv.