Get started

This guide takes you from zero to your first verified recommendation in three steps. The activation event, your first accepted recommendation, is typically reached within 48 hours of connecting your first account.
TensorCost is built around three workload classes — GPU fleets, managed inference (Bedrock and friends), and agent workloads. You don’t have to connect all three to get value; most customers start with whichever one is dominating their bill.

Prerequisites

  • A TensorCost workspace. Sign up at tensorcost.com or accept your design-partner invite.
  • One of:
    • An AWS account with Bedrock usage (or CUR 2.0 enabled), or
    • An Azure subscription with Azure OpenAI usage, or
    • A GCP project with Vertex AI usage, or
    • GPU instances on AWS / Azure / GCP / Kubernetes / bare-metal that you can install the unified agent on.
  • An admin (or someone who can deploy a CloudFormation stack / Terraform module / Helm chart in your environment).

Step 1 — Create your tenant and invite your team

1. Sign up

Visit tensorcost.com or follow the design-partner invite email. Sign-up is backed by Amazon Cognito; your tenant admin can enable SSO once the tenant is provisioned.

2. Invite teammates

From the shell sidebar, open Settings → Members. Three roles ship by default:
Role     What they can do
member   Read dashboards, accept/dismiss recommendations, see their own team’s spend.
admin    All of member plus connect cloud accounts, manage agents, configure alert routes, set budgets.
owner    All of admin plus billing, tenant deletion, RBAC changes.

3. Map your tags (optional but recommended)

Open Settings → Tag mapping and bind your existing AWS/Azure/GCP cost-allocation tags to the TensorCost dimensions: application, team, environment, owner. This is what powers attribution; without it, everything rolls up under “untagged.”
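
To make the idea of binding tags to dimensions concrete, here is a minimal sketch of the lookup a tag mapping expresses. The tag keys on the left (`app`, `squad`, `stage`, and so on) are hypothetical examples rather than names TensorCost requires, and the function is an illustration, not the product's implementation.

```shell
# Map a cloud cost-allocation tag key to one of the four TensorCost dimensions.
# The tag keys matched here are made-up examples; substitute your own.
dimension_for_tag() {
  case "$1" in
    app|Application|service) echo application ;;
    Team|squad)              echo team ;;
    env|Environment|stage)   echo environment ;;
    Owner|owner-email)       echo owner ;;
    # No binding: spend that carries only unmapped keys is what
    # ends up under "untagged" in the dashboard.
    *)                       echo unmapped ;;
  esac
}

for key in Application squad env CostCenter; do
  printf '%s -> %s\n' "$key" "$(dimension_for_tag "$key")"
done
```

Running the loop over your real tag keys before you open Settings → Tag mapping is a quick way to spot keys you have not yet decided how to bind.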

Step 2 — Connect your first source

Pick the path that matches what’s burning the most money first. You can layer in the others later.

Path A — Amazon Bedrock (lead managed-inference path)

This is the fastest path to a first recommendation because it requires no agent install.

1. Open the Bedrock wizard

Integrations → Add AWS Bedrock. The wizard auto-suggests an ExternalId — accept it.

2. Choose onboarding mode

SingleAccount (default — one AWS account) or Organization (consolidated billing with payer + member-account jump roles). Most early customers run SingleAccount. See bedrock integration for the multi-account variant.

3. Enable Bedrock model-invocation logging

AWS console → Bedrock → Settings → Model invocation logging → CloudWatch destination. Note the log-group ARN.
The log group must be in the same region as your InvokeModel calls. Logging in us-west-2 while your traffic runs in us-east-1 is the most common day-1 silent failure.

4. Deploy the CloudFormation stack

Click the one-click CFN link in the wizard. The stack creates exactly one IAM role (TensorCost-BedrockReader-<ExternalId>) with read-only Bedrock + CloudWatch permissions and an external-ID-bound trust policy. Nothing else.

5. Validate the connection

Paste the role ARN back into the wizard and click Validate. The wizard retries an STS AssumeRole call and a sample CloudWatch read for up to two minutes.
Your dashboard backfills 90 days of CUR + CloudWatch data within 30–60 minutes. Within 48 hours, the four MVP recommenders surface routing, prompt-cache, provisioned-throughput, and runaway-loop findings with $-impact estimates.
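
Because a region mismatch between the log group and your InvokeModel traffic fails silently, it can help to check the log-group ARN you noted before you deploy the stack. The sketch below relies only on the standard ARN layout, where the region is the fourth colon-separated field; the account ID and log-group name in the example call are placeholders.

```shell
# Extract the region (4th colon-separated field) from an ARN.
arn_region() {
  printf '%s\n' "$1" | cut -d: -f4
}

# Succeed only if the log group lives in the region that serves InvokeModel traffic.
check_log_region() {
  local log_group_arn="$1" invoke_region="$2"
  local log_region
  log_region="$(arn_region "$log_group_arn")"
  if [ "$log_region" = "$invoke_region" ]; then
    echo "ok: log group and traffic are both in $invoke_region"
  else
    echo "mismatch: log group is in $log_region, traffic is in $invoke_region" >&2
    return 1
  fi
}

# Placeholder ARN: the account ID and log-group name are made up.
check_log_region "arn:aws:logs:us-east-1:123456789012:log-group:/bedrock/invocations" "us-east-1"
```

If your traffic spans several regions, run the check once per region against the log group configured there.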

Path B — Install the unified GPU agent

For GPU fleets running on EC2, EKS/GKE/AKS, on-prem Slurm, or Ray. Full guide in agent installation.
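# aws cloudformation deploy only accepts a local --template-file, so download the template first.
curl -fsSLO https://downloads.tensorcost.com/cfn/agent-stack.yml

# TENANT_ID and EXTERNAL_ID must already be set in your shell.
aws cloudformation deploy \
  --template-file agent-stack.yml \
  --stack-name tensorcost-agent \
  --parameter-overrides \
      "TenantId=$TENANT_ID" \
      "ExternalId=$EXTERNAL_ID" \
  --capabilities CAPABILITY_NAMED_IAM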
The agent auto-detects EC2 metadata via IMDSv2, signs the gRPC handshake with HMAC-SHA256, and connects to the regional NLB on TCP/50051. Metrics start flowing within five minutes of a successful handshake.
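
Since the agent depends on IMDSv2 to detect EC2 metadata, a quick preflight on the instance can rule out a blocked metadata endpoint when metrics do not appear. This is a troubleshooting sketch that uses the standard IMDSv2 token flow documented by AWS; it is not part of the agent, and the function name is made up for illustration.

```shell
# Fetch a short-lived IMDSv2 session token, then read the instance ID with it.
# Returns non-zero when the metadata endpoint cannot be reached.
imds_instance_id() {
  local token
  token="$(curl -sf --max-time 2 -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60")" || return 1
  curl -sf --max-time 2 -H "X-aws-ec2-metadata-token: $token" \
    "http://169.254.169.254/latest/meta-data/instance-id"
}

if id="$(imds_instance_id)"; then
  echo "IMDSv2 reachable, instance ID: $id"
else
  echo "IMDSv2 not reachable (not on EC2, or the endpoint is blocked)" >&2
fi
```

On a container running on EC2, a failure here often points at the instance's metadata hop limit rather than at the agent itself.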

Path C — Azure OpenAI / Vertex / OpenAI API / Anthropic API

Same pattern as Bedrock, with provider-specific credentials. Integrations → Add provider → pick the source. Each adapter ingests:
  • Per-request: model, input tokens, output tokens, latency, cache-hit rate
  • Daily billing: cost normalized to ai_spend_events
  • Tags / metadata: mapped to your application / team / environment / owner
Raw prompts and responses are never stored. Hashes only. See SOC 2 readiness for the redaction guarantee.

Step 3 — See your first recommendation

Within 48 hours of connecting your first source, the Recommendations feed populates. Each entry includes:
  • A specific, dollar-quantified change (“route 14% of customer-support-agent traffic from Claude Opus 4.6 to Haiku 4.5 — $4,200/month”).
  • The evidence (sample request IDs, cost breakdown, A/B plan).
  • Accept / dismiss-with-reason / snooze actions.
Acceptance is the activation event. Once you accept, the savings ledger starts tracking realized savings against the baseline. Verified savings populate after a 30-day window.

What to do next

Set up alert routes

Slack, PagerDuty, email, Microsoft Teams, custom webhook.

Define budget hierarchies

Tenant → team → application. Burn-rate alerts at 50%, 80%, 100%.

Connect your second source

Coverage compounds. Customers with all three workload classes see 2× the recommendations.

Wire up MCP

Query TensorCost from Claude Desktop or your own agents.
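
As an illustration of the 50%, 80%, and 100% alert levels mentioned under budget hierarchies, here is a minimal sketch of a threshold check. It assumes alerts are keyed on spend as a percentage of the budget; how TensorCost actually evaluates burn rate is not described in this guide, and the function name and amounts are hypothetical.

```shell
# Print the highest alert threshold (50, 80, or 100) that spend has reached,
# or "none" if spend is below 50% of budget. Amounts are whole cents so the
# math stays in integer arithmetic.
threshold_crossed() {
  local spent_cents="$1" budget_cents="$2"
  if [ "$budget_cents" -le 0 ]; then
    echo "budget must be positive" >&2
    return 1
  fi
  local pct=$(( spent_cents * 100 / budget_cents ))
  if   [ "$pct" -ge 100 ]; then echo 100
  elif [ "$pct" -ge 80 ];  then echo 80
  elif [ "$pct" -ge 50 ];  then echo 50
  else echo none
  fi
}

# Hypothetical team budget of $10,000 with $8,250 spent so far.
threshold_crossed 825000 1000000   # prints 80
```

The same check applies at each level of the hierarchy: run it once for the tenant, once per team, and once per application with the matching budget.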

When you get stuck

  • Check the Sync history drawer on the connection — every error from STS, CloudWatch, or the IAM trust policy surfaces here with a remediation link.
  • Common day-1 failures (and remediations) are catalogued in our customer onboarding runbook.
  • Email support@tensorcost.com — design partners get a shared Slack Connect channel.