We Launched a Chat Interface for Your AWS Costs — No More Cost Explorer

Isometric flat illustration of a friendly engineer chatting with an AI FinOps assistant on a laptop, a circular DEVOPS ARG gorilla badge poster on the back wall, navy and gold palette with scattered region tiles connected by gold data flows

FinOps is broken. Not because AWS doesn't give you the data — Cost Explorer, Compute Optimizer, Trusted Advisor, Cost Optimization Hub — the data is there. The problem is that nobody actually opens it.

We've audited enough engineering teams to know the pattern. The FinOps review happens once a month, usually after the bill arrives, usually driven by a VP asking uncomfortable questions. By then, whatever caused the spike is three weeks cold. Engineers don't go into Cost Explorer proactively because the interface is built for finance teams, not for the person who deployed the service.

The gap between having the information and acting on it — that's the real FinOps problem. So we built something to close it.

What We Launched: A Chat Interface for Your AWS Account

The FinOps Intelligence Platform is a Claude-powered agent with read-only access to your AWS account. You open a browser, type a question in plain English, and get an answer with real numbers from your real account — in about ten seconds.

Questions like:

  • "What's my biggest cost this week and why?"
  • "How much am I paying for NAT Gateway across all regions?"
  • "Which EC2 instances are oversized and what would I save by right-sizing them?"
  • "What anomalies showed up in the last 30 days?"
  • "Which team is spending the most, broken down by environment tag?"

The agent connects to Cost Explorer, Compute Optimizer, Cost Optimization Hub, and the EC2/RDS APIs. It correlates data across services and responds with concrete numbers and actionable recommendations — not high-level summaries, not percentage ranges, actual dollar amounts tied to specific resources.

The code is open source: github.com/devops-arg/finops-agent — MIT license, Python backend, vanilla JS frontend, docker-compose to run it.

A Quick Product Tour

Six screens do the whole job. All screenshots below use the mock fintech "Ribbon" profile that ships with the demo — no real AWS data, safe to show in any context.

1. Ask in plain English

Type a question, get a structured answer with real numbers in ~10 seconds. The right-hand panel streams the reasoning trace live — you see every tool call the agent makes before it synthesizes the final answer.

Chat interface answering 'what is our biggest cost driver this week' with Amazon EC2 at $2,240, RDS at $900, ElastiCache at $620, and the reasoning trace panel on the right showing get_current_date then query_aws_costs tool calls

2. Go deep with layered questions

Layered questions trigger multiple tool rounds. Ask "how much traffic is flowing cross-AZ through NAT Gateway instead of staying within AZ?" and the agent quantifies the impact, calculates the monthly savings from the concrete fix (PrivateLink endpoints), and shows you the work — all without you jumping between five Cost Explorer tabs.

Chat showing cross-AZ traffic and NAT Gateway cost analysis — $672 per week in NAT data transfer charges, $2,893 per month optimization opportunity, traffic pattern table and recommendation to use AWS PrivateLink for $142 weekly savings

3. Cost Overview dashboard

The landing page for the dashboard tab: last-7-days spend, monthly projection, savings-identified total, active-services count, a 4-week trend chart, service-breakdown donut, and the latest cost anomalies detected by AWS.

Cost Overview dashboard: 6500 last 7 days, 29.5k monthly projection, 4470 savings identified, 13 active services, weekly trend chart, by-service donut, Amazon GuardDuty anomaly detected

4. Services breakdown

Every active AWS service with last-week / this-week / monthly projection, plus a 4-week sparkline. Click a service to drill into a filtered chat session about just that line item.

Services breakdown grid showing EC2, RDS, ElastiCache, OpenSearch, S3, AWS Data Transfer, EKS, CloudWatch, MSK Kafka, Lambda, Route 53, Secrets Manager, WAF with weekly costs and trend sparklines

5. Live infrastructure health

EC2, RDS, EKS, ElastiCache, OpenSearch, and S3 cards with monthly cost and health indicator. Warnings are specific: "Primary at 80% CPU, downsize candidate", "6 buckets missing lifecycle policies — Intelligent-Tiering auto-moves cold data". Switch between a specific region and all regions in parallel from a dropdown.

Infrastructure Health view with EC2 at 9790 per month, RDS Databases at 5118 per month flagged for 80% CPU, EKS Clusters at 780, ElastiCache Redis at 2480, OpenSearch at 2680, S3 Storage at 1790 with bucket list and lifecycle warnings

6. Optimization recommendations

Cost Optimization Hub results ranked by monthly savings. Each card has the change, the estimated savings, a difficulty tag, and an Ask AI → button that opens the chat pre-loaded with that recommendation as context — so you can ask "why this specific instance class?" before executing.

Cost Optimizer view showing savings score 62 out of 100 and 4470 per month identified across recommendations — Downsize RDS primary db.r5.2xlarge minus 520, Add VPC endpoints for S3 ECR STS CloudWatch minus 450, 1 Year Compute Savings Plan for EKS baseline minus 870, Migrate c6i r6 workloads to Graviton minus 320

What the Agent Actually Knows How to Answer

The sidebar ships with 27 pre-built queries organized by category. Here's a sample from each:

Networking

NAT Gateway is the most common hidden cost we find in AWS audits. Teams pay the data processing fee ($0.045/GB) on every byte their pods send through NAT to reach S3, ECR, STS, or CloudWatch — and they have no idea. The agent knows how to pull this apart:

Query: "How much cross-AZ traffic is going through NAT Gateway, and which VPC endpoints am I missing?"

The agent queries Cost Explorer for NAT Gateway line items by VPC, correlates against VPC endpoint coverage, and returns something like:

Issue Monthly Cost Fix Estimated Savings
ECR image pulls via NAT $420/mo ECR VPC endpoint $400/mo
S3 transfers via NAT $310/mo S3 gateway endpoint (free) $310/mo
CloudWatch via NAT $180/mo CloudWatch interface endpoint $170/mo
Cross-AZ traffic (us-east-1a → 1b) $290/mo Topology-aware routing $200/mo

Compute

Query: "Which EC2 instances can I migrate to Graviton, and what would I save per month?"

The agent reads from Compute Optimizer and Cost Optimization Hub — the AWS service that unifies recommendations from Compute Optimizer, Trusted Advisor, and the RI/SP engine. Each recommendation comes with a concrete savings amount AWS has already computed:

Graviton migration opportunities:

  m5.2xlarge (api-server, us-east-1)   → m7g.2xlarge  saves $87/mo
  c5.xlarge  (worker-pool, eu-west-1)  → c7g.xlarge   saves $54/mo
  r5.large   (analytics, us-east-1)    → r7g.large    saves $41/mo

  Total addressable savings: $182/mo ($2,184/yr)
  Migration risk: LOW (stateless workloads, ARM64 images available)

Commitments

Query: "How much would I save with a 1-year Savings Plan based on my current usage?"

The agent pulls your Savings Plans coverage from Cost Explorer and runs the recommendation against your actual On-Demand spend pattern. It breaks down the commitment amount, the coverage percentage, and the expected net savings — not as a range, as an actual number from your account.

Anomaly Detection

Query: "What cost anomalies happened in the last 30 days and what caused them?"

Cost Explorer's anomaly detection engine flags spend events, but it doesn't explain them. The agent reads the anomaly records, correlates the timestamp against deployment history (via CloudTrail), and proposes a likely cause:

Anomaly detected: EKS compute +$1,840 on March 22-24

  Timeline:
  - 14:32 UTC: New node group added (production-ml-workloads)
  - Node group max capacity set to 20 nodes (was 5)
  - Auto-scaling triggered on March 22 at 14:45 UTC
  - 47 m5.4xlarge nodes ran for ~48h

  Likely cause: Max capacity change without corresponding HPA limits.
  Recommendation: Set explicit maxReplicas on ML workload deployments,
  review node group scaling policies.

Tag Attribution

Query: "Which team is spending the most this month, broken down by environment?"

This one only works if your tagging is consistent, but when it is, it's the question that changes behavior the fastest. When the payment-service team sees their name on the most expensive bar in the chart, they start caring about costs in a way no policy document ever produced.

Why It's Safe: Read-Only IAM by Design

The first thing you run when setting up the platform is create-read-only.sh. It takes about 30 seconds and does three things:

# What create-read-only.sh does (simplified)

# 1. Create IAM user with AWS-managed ReadOnlyAccess
aws iam create-user --user-name finops-agent-readonly --profile $ADMIN_PROFILE
aws iam attach-user-policy \
  --user-name finops-agent-readonly \
  --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess \
  --profile $ADMIN_PROFILE

# 2. Generate and store access keys
KEYS=$(aws iam create-access-key \
  --user-name finops-agent-readonly \
  --profile $ADMIN_PROFILE)
echo "AWS_ACCESS_KEY_ID=$(echo $KEYS | jq -r .AccessKey.AccessKeyId)" >> .env
echo "AWS_SECRET_ACCESS_KEY=$(echo $KEYS | jq -r .AccessKey.SecretAccessKey)" >> .env

# 3. Verify the agent CANNOT write — expects 403
RESULT=$(aws s3 mb s3://finops-readonly-verify-$(date +%s) \
  --profile finops-agent-readonly 2>&1)
if echo "$RESULT" | grep -q "AccessDenied"; then
  echo "✓ Read-only verified. Agent cannot write to your account."
else
  echo "✗ Setup failed. Expected 403, got: $RESULT"
  exit 1
fi

The IAM user gets ReadOnlyAccess — an AWS-managed policy that grants Describe*, List*, Get*, and the read methods for billing APIs. It explicitly denies every mutating action. The verification step is not optional. If the s3 mb command doesn't return a 403, the script exits with an error and does not write credentials to .env.

The agent never sees admin credentials. It connects to your account only through the keys created by this script.

Multi-Region: Because Your Costs Don't Live in One Place

A common mistake in DIY FinOps tools is scraping only us-east-1. Cost Explorer gives you global totals, but if you want to know which EC2 instances in eu-west-2 are oversized, you need to call the EC2 API in that region.

The platform scans all 18 active AWS regions in parallel when answering compute and database questions:

const ACTIVE_REGIONS = [
  'us-east-1', 'us-east-2', 'us-west-1', 'us-west-2',
  'eu-west-1', 'eu-west-2', 'eu-west-3', 'eu-central-1',
  'eu-north-1', 'ap-southeast-1', 'ap-southeast-2',
  'ap-northeast-1', 'ap-northeast-2', 'ap-northeast-3',
  'ap-south-1', 'ca-central-1', 'sa-east-1', 'me-south-1'
];

// All regions queried in parallel — total scan time typically under 8 seconds
const regionalData = await Promise.all(
  ACTIVE_REGIONS.map(region => scanRegion(region, credentials))
);

This matters most for teams that expanded regions over time and never went back to check. We regularly find forgotten EC2 instances, idle RDS snapshots, and orphaned EBS volumes in regions that were used for a project eighteen months ago.

Demo Mode: Safe for Screencasts and Pitches

Not every conversation about costs should happen against a live account. For demos, onboarding calls, and internal pitches, USE_MOCK_DATA=true in .env loads a fictional AWS account called Ribbon — a fintech with realistic spending patterns across EC2, RDS, networking, and managed services.

All 27 queries work identically in demo mode. The numbers are representative of a real mid-stage startup. No real AWS credentials involved, no risk of exposing your account structure during a screen share.

Under the Hood: Multi-Round Reasoning

Most "chat with your data" demos use a single LLM call and hope for the best. FinOps questions are layered, so we wired a multi-round agentic loop with reflection. The trace panel on the right of the chat makes it visible — you see exactly what the agent is thinking.

Reasoning trace panel: Building context, Reasoning with 14 tools available, Round 1 of 4 get_current_date, Round 2 query_aws_costs with start date 2026-04-06 end date 2026-04-14 group by service, Round 3 in progress

The engine:

  1. Gets the user query + 14 tool definitions
  2. Round 1: LLM calls tools (typically get_current_datequery_aws_costs)
  3. Reflection: engine asks "do you have enough data? If not, what next?"
  4. Rounds 2-4: additional tool calls (comparison periods, drill-downs, cross-references)
  5. Final synthesis as structured markdown

Hard cap at 4 rounds to control cost. If a tool returns nothing, the agent says so explicitly instead of hallucinating — this is the piece most LLM demos get wrong.

Getting Started: Four Commands

git clone https://github.com/devops-arg/finops-agent
cd finops-agent
cp .env.example .env
# Set ANTHROPIC_API_KEY in .env, then:
./create-read-only.sh your-admin-profile   # ~30 seconds
docker compose up --build
# Open http://localhost:3000

The platform runs entirely on your machine. Your AWS credentials and your cost data never leave your environment — the agent runs locally, calls the AWS APIs and the Anthropic API directly.

Cost Optimization Hub needs to be enabled in your AWS console before the recommendations queries will return data (Billing → Cost Optimization Hub → Enable). It's free, takes about 30 seconds to enable, and takes 24-48 hours to populate recommendations for the first time.

What We'd Do Differently

Ship the natural language layer first, not the pre-built queries. We built 27 specific queries before we finished the free-form chat, and we shipped them as the primary interface. In practice, users immediately started asking questions we hadn't anticipated. The free-form chat is where the value actually is — the 27 queries are training wheels.

Build the anomaly timeline view earlier. The anomaly detection query is the most surprising feature for users — they never expected the agent to correlate a spend spike with a deployment event. We should have prioritized this higher in the roadmap. More of the follow-up questions ("why did my bill jump on March 22") come from anomaly events than from any other trigger.

Add caching for Cost Explorer API calls. Cost Explorer has a rate limit and is not free to call at high frequency ($0.01 per API request, with the first 100K free per account per month). During development we burned through the free tier quickly. The fix is straightforward — cache Cost Explorer responses for 4-6 hours since the data doesn't change faster than that — but we should have built it from the start.


If you want to connect the FinOps Intelligence Platform to your account, or if you need a custom version integrated with your internal cost tagging model, get in touch — we'll get it running in an afternoon.

Frequently Asked Questions

Can the agent actually execute anything in my AWS account?

No — by design. The IAM user created by create-read-only.sh gets AWS-managed ReadOnlyAccess policy only. The script verifies this by attempting aws s3 mb s3://verify-readonly-test with the new credentials and expects a 403. If it gets anything other than an access denied, setup fails. The agent can describe, list, and analyze — it cannot create, modify, or delete anything.

What AWS services does it cover?

EC2, EKS, ECS, RDS, ElastiCache, Lambda, S3, CloudFront, NAT Gateway, VPC, Savings Plans, Reserved Instances, Spot Fleet, Cost Explorer, Cost Optimization Hub, Compute Optimizer, and Trusted Advisor. The multi-region scanner covers all 18 active AWS regions in parallel for compute and database resources.

Do I need to be on a specific AWS support tier for Cost Optimization Hub?

Cost Optimization Hub is available on all support tiers including the free tier. You need to enable it in the AWS console first (Billing → Cost Optimization Hub → Enable), but there's no additional cost for the service itself. The recommendations it surfaces (from Compute Optimizer, Trusted Advisor, and the RI/SP engine) are what the agent reads and explains.

How is this different from just opening Cost Explorer myself?

Three differences: (1) Speed. A Cost Explorer session to answer 'what's my NAT Gateway cost breakdown by VPC and why is it high' takes 15-20 minutes of clicking and pivoting. The agent answers in 10 seconds. (2) Correlation. The agent can combine data from Cost Explorer, Compute Optimizer, and the EC2 API in a single answer — Cost Explorer can't do cross-service correlation. (3) Recommendations with prices. Cost Optimization Hub gives the agent concrete 'migrate this instance to Graviton and save $340/month' recommendations, not just raw usage data.

What's the demo mode and when would I use it?

Demo mode loads a fictional AWS account ('Ribbon' — a fintech with realistic spending patterns across EC2, RDS, and networking). Use it for screencasts, customer demos, and internal pitches where you can't or don't want to connect a real account. All queries work exactly the same, the numbers are realistic, and no real AWS credentials are involved. Set USE_MOCK_DATA=true in your .env to activate it.