Billing & Cost Analysis

Track spending, set budgets, and optimize your Hyperfold costs.

Overview

The billing dashboard provides complete visibility into your Hyperfold spending. Track costs by component, set budget alerts, and get AI-powered optimization recommendations.

Cost Breakdown

# View current billing summary
$ hyperfold billing summary
BILLING SUMMARY: January 2025
PERIOD: Jan 1 - Jan 20, 2025
TOTAL SPEND                              $4,892.50
├─ LLM Inference                         $2,450.00  (50%)
├─ Agent Compute                         $1,200.00  (25%)
├─ Storage & Database                    $480.00    (10%)
├─ Network & Bandwidth                   $320.00    (7%)
├─ Integrations                          $242.50    (5%)
└─ Support & Services                    $200.00    (4%)
PROJECTED MONTH-END                      $7,645.00
BUDGET                                   $8,000.00
STATUS                                   On track

# Detailed breakdown
$ hyperfold billing breakdown --period=mtd
COST BREAKDOWN (Month to Date)
LLM INFERENCE                            $2,450.00
  OpenAI GPT-4-Turbo        4.2M tokens  $1,680.00
  OpenAI GPT-4o             1.8M tokens  $540.00
  Embeddings                12M tokens   $230.00
AGENT COMPUTE                            $1,200.00
  sales-negotiator          142 hrs      $710.00
  fulfillment-agent         68 hrs       $340.00
  recommender-agent         30 hrs       $150.00
STORAGE                                  $480.00
  Vector Database           45 GB        $225.00
  Document Storage          120 GB       $180.00
  Logs & Analytics          50 GB        $75.00
INTEGRATIONS                             $242.50
  Shopify API calls         45,000       $90.00
  Stripe API calls          12,000       $72.00
  ShipStation               8,000        $80.50
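
The shares and the month-end projection in the summary can be sanity-checked by hand. The sketch below is plain Python, not part of the Hyperfold CLI. It recomputes each component's share from the sample figures and extrapolates the month-to-date total at a flat daily run rate. The flat-rate method is an assumption: for the sample data it gives roughly $7,583, slightly below the $7,645 shown, so the CLI evidently projects differently.

```python
import calendar
from datetime import date

# Month-to-date figures copied from the sample `billing summary` output.
COMPONENTS = {
    "LLM Inference": 2450.00,
    "Agent Compute": 1200.00,
    "Storage & Database": 480.00,
    "Network & Bandwidth": 320.00,
    "Integrations": 242.50,
    "Support & Services": 200.00,
}


def component_shares(components: dict[str, float]) -> dict[str, float]:
    """Return each component's share of total spend as a percentage."""
    total = sum(components.values())
    return {name: 100.0 * cost / total for name, cost in components.items()}


def linear_projection(spend_to_date: float, as_of: date) -> float:
    """Extend the average daily spend so far to the end of the month.

    Assumption: a flat daily run rate. The CLI may use another method.
    """
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    return spend_to_date / as_of.day * days_in_month


total = sum(COMPONENTS.values())
shares = component_shares(COMPONENTS)
projected = linear_projection(total, date(2025, 1, 20))

print(f"total=${total:,.2f}  projected=${projected:,.2f}")
for name, pct in shares.items():
    print(f"{name:<22}{pct:5.1f}%")
```

Comparing a simple run-rate figure with the dashboard's projection is a quick way to notice when recent usage is accelerating.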

Cost Components

Component        What's Included
LLM Inference    GPT-4 tokens, embeddings, reasoning
Agent Compute    Container runtime, CPU, memory
Storage          Vector DB, documents, logs
Integrations     External API calls, webhooks

LLM Costs

LLM inference is typically the largest cost component. Analyze token usage to optimize spending:

# Detailed LLM usage analysis
$ hyperfold billing llm --since=7d
LLM USAGE (7 days)
TOTAL TOKENS: 8.4M                       COST: $1,120.00
BY MODEL
  GPT-4-Turbo               5.2M tokens  $832.00   (74%)
  GPT-4o                    2.4M tokens  $216.00   (19%)
  text-embedding-3-large    0.8M tokens  $72.00    (6%)
BY AGENT
  sales-negotiator          6.1M tokens  $890.00   (79%)
  recommender-agent         1.8M tokens  $180.00   (16%)
  fulfillment-agent         0.5M tokens  $50.00    (4%)
BY OPERATION
  Negotiation reasoning     4.2M tokens  $672.00
  Product search            1.5M tokens  $135.00
  Quote generation          1.2M tokens  $168.00
EFFICIENCY METRICS
  Avg tokens/session:       847
  Avg tokens/conversion:    2,541
  Cost/conversion:          $0.34
  Sessions/dollar:          3.8

# Per-session LLM costs
$ hyperfold billing llm --session=sess_abc123
SESSION: sess_abc123
  Duration:       32.5s
  Tokens:         1,247
  Cost:           $0.20
  Outcome:        conversion ($155.00)
  ROI:            775x
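
Some of the efficiency figures above follow from simple arithmetic on the totals. The sketch below is illustrative Python, not Hyperfold code. It derives a blended cost per token from the 7-day totals, applies it to the average tokens per conversion, and computes the per-session ROI multiple as revenue divided by LLM cost. Using one blended rate across all models is an assumption; the CLI may price each model separately.

```python
def blended_cost_per_token(total_cost: float, total_tokens: float) -> float:
    """Average cost of one token across all models in the period."""
    return total_cost / total_tokens


def roi_multiple(revenue: float, llm_cost: float) -> float:
    """Revenue earned per dollar of LLM spend for a single session."""
    return revenue / llm_cost


# Figures copied from the sample `billing llm` output.
rate = blended_cost_per_token(total_cost=1120.00, total_tokens=8_400_000)
cost_per_conversion = 2541 * rate          # avg tokens/conversion x blended rate
session_roi = roi_multiple(revenue=155.00, llm_cost=0.20)

print(f"blended rate: ${rate * 1_000_000:.2f} per 1M tokens")
print(f"cost/conversion: ${cost_per_conversion:.2f}")
print(f"session ROI: {session_roi:.0f}x")
```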

Budget Alerts

# Configure budget alerts
$ hyperfold billing budget set \
  --monthly=8000 \
  --alert-threshold=80
Budget configured:
  Monthly limit:    $8,000
  Alert at:         80% ($6,400)
  Current spend:    $4,892.50 (61%)

# Set component-specific budgets
$ hyperfold billing budget set \
  --component=llm \
  --monthly=3000 \
  --alert-threshold=90

# View budget status
$ hyperfold billing budget status
BUDGET STATUS
COMPONENT       BUDGET      SPENT       REMAINING   STATUS
Overall         $8,000      $4,893      $3,107      61%
LLM             $3,000      $2,450      $550        82%
Compute         $2,000      $1,200      $800        60%

# Budget alert notification settings
$ hyperfold billing budget alerts \
  --channels="slack:#finance,email:billing@company.com" \
  --frequency=daily
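
The budget status columns reduce to two numbers per budget: what remains and what percentage is used. The Python sketch below is an illustration, not the CLI's implementation. It reproduces those values for the sample budgets and flags an alert once usage reaches the threshold. Treating the threshold as inclusive (`>=`) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    remaining: float   # dollars left in the budget
    used_pct: float    # share of the budget already spent
    alert: bool        # True once used_pct reaches the threshold


def budget_status(limit: float, spent: float, alert_threshold_pct: float) -> BudgetStatus:
    """Compute remaining budget, percent used, and whether to alert."""
    used_pct = 100.0 * spent / limit
    return BudgetStatus(
        remaining=limit - spent,
        used_pct=used_pct,
        alert=used_pct >= alert_threshold_pct,  # assumption: inclusive threshold
    )


overall = budget_status(limit=8000, spent=4892.50, alert_threshold_pct=80)
llm = budget_status(limit=3000, spent=2450.00, alert_threshold_pct=90)

for name, status in (("Overall", overall), ("LLM", llm)):
    print(f"{name:<8} remaining=${status.remaining:,.2f} "
          f"used={status.used_pct:.0f}% alert={status.alert}")
```

With the sample figures neither budget has crossed its threshold yet, although the LLM budget is close to its 90% alert level.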

Cost Optimization

# Get cost optimization recommendations
$ hyperfold billing optimize
COST OPTIMIZATION RECOMMENDATIONS
1. SWITCH TO GPT-4O-MINI FOR SIMPLE TASKS
   Estimated savings: $420/month (17%)
2. ENABLE RESPONSE CACHING
   Estimated savings: $180/month (7%)
3. OPTIMIZE AGENT SCALING
   Estimated savings: $150/month (6%)
4. REDUCE EMBEDDING DIMENSIONS
   Estimated savings: $60/month (2%)
TOTAL POTENTIAL SAVINGS: $810/month (33%)

# Compare costs across periods
$ hyperfold billing compare --period1=dec --period2=jan
COST COMPARISON: December vs January
COMPONENT       DECEMBER    JANUARY     CHANGE
LLM             $2,100      $2,450      +$350 (+17%)
Compute         $980        $1,200      +$220 (+22%)
Total           $3,500      $4,130      +$630 (+18%)
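
The CHANGE column is the absolute difference followed by the difference relative to the earlier period. A minimal Python sketch of that calculation, using the sample figures (the function name is illustrative, not a Hyperfold API):

```python
def period_change(previous: float, current: float) -> tuple[float, float]:
    """Return (absolute change, percent change relative to the previous period)."""
    delta = current - previous
    return delta, 100.0 * delta / previous


# (December, January) pairs copied from the sample comparison output.
rows = {
    "LLM": (2100, 2450),
    "Compute": (980, 1200),
    "Total": (3500, 4130),
}

for name, (december, january) in rows.items():
    delta, pct = period_change(december, january)
    print(f"{name:<8} {delta:+,.0f} ({pct:+.0f}%)")
```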

Optimization Strategies

Model Selection — Use smaller, faster models for simple tasks. Route complex reasoning to GPT-4 only when needed.
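
One way to picture model selection is a small routing function placed in front of the LLM call. The sketch below is a hypothetical illustration in Python, not a Hyperfold feature: the task categories, the token cutoff, and the function name are all assumptions chosen for the example.

```python
# Hypothetical task types that a smaller model is assumed to handle well.
SIMPLE_TASKS = {"product_search", "order_status", "faq"}


def pick_model(task_type: str, prompt_tokens: int,
               max_small_prompt_tokens: int = 2000) -> str:
    """Route short, simple tasks to a cheaper model and everything else
    to the larger one. The 2000-token cutoff is an arbitrary example value."""
    if task_type in SIMPLE_TASKS and prompt_tokens <= max_small_prompt_tokens:
        return "gpt-4o-mini"
    return "gpt-4-turbo"


print(pick_model("product_search", prompt_tokens=500))
print(pick_model("negotiation", prompt_tokens=500))
```

A static allowlist is the simplest possible policy. In practice the routing rule would be tuned against quality metrics for each task type.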

Response Caching — Cache responses for semantically similar queries. Reduces token usage without affecting quality.
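
To make the idea concrete, here is a toy similarity cache in Python. It is a sketch under a loud assumption: it scores similarity with Jaccard overlap of lowercase word sets as a stand-in for embedding similarity, which a production semantic cache would normally use. The class, the 0.8 threshold, and the sample strings are illustrative only.

```python
import re
from typing import Optional


class SimilarityCache:
    """Return a stored response when a new query is close enough to an old one."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self._entries: list[tuple[frozenset[str], str]] = []

    @staticmethod
    def _tokens(text: str) -> frozenset[str]:
        # Lowercase word set; punctuation and word order are ignored.
        return frozenset(re.findall(r"[a-z0-9]+", text.lower()))

    def put(self, query: str, response: str) -> None:
        self._entries.append((self._tokens(query), response))

    def get(self, query: str) -> Optional[str]:
        q = self._tokens(query)
        best_score, best_response = 0.0, None
        for tokens, response in self._entries:
            union = q | tokens
            score = len(q & tokens) / len(union) if union else 0.0  # Jaccard
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SimilarityCache()
cache.put("What is the return policy for shoes?", "cached answer")
hit = cache.get("what is the return policy for shoes")   # same words, no LLM call
miss = cache.get("Do you ship to Canada?")               # unrelated, falls through
print(hit, miss)
```

The threshold controls the trade-off: a lower value saves more tokens but raises the chance of serving an answer to a question that only looks similar.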

Prompt Optimization — Shorter, more focused prompts use fewer tokens. Review verbose system prompts for trimming opportunities.
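
The payoff from trimming a prompt scales with how often that prompt is sent. The Python sketch below estimates the monthly saving. All inputs in the example call are made-up illustrative values, not Hyperfold or provider pricing.

```python
def monthly_prompt_savings(tokens_trimmed: int, calls_per_month: int,
                           price_per_million_tokens: float) -> float:
    """Dollars saved per month by removing `tokens_trimmed` tokens
    from a prompt that is sent on every call."""
    tokens_saved = tokens_trimmed * calls_per_month
    return tokens_saved / 1_000_000 * price_per_million_tokens


# Illustrative inputs only: 300 tokens trimmed, 100,000 calls, $10 per 1M tokens.
savings = monthly_prompt_savings(300, 100_000, 10.0)
print(f"estimated saving: ${savings:,.2f}/month")
```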

Smart Scaling — Reduce minimum instances during off-peak hours. Use scheduled scaling for predictable traffic patterns.
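
Scheduled scaling can be pictured as a function from the time of day to a minimum instance count. The following Python sketch is a hypothetical illustration: the peak window and the instance counts are assumptions for the example, and it does not use any Hyperfold scaling API.

```python
from datetime import time


def min_instances(now: time,
                  peak_start: time = time(8, 0),
                  peak_end: time = time(20, 0),
                  peak_min: int = 3,
                  off_peak_min: int = 1) -> int:
    """Return the minimum instance count for the given time of day.

    Assumes a single daily peak window that does not cross midnight.
    """
    return peak_min if peak_start <= now < peak_end else off_peak_min


print(min_instances(time(9, 30)))   # inside the assumed peak window
print(min_instances(time(2, 0)))    # outside it
```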