Use Cases

Real-World Use Cases

See how YAVIQ delivers measurable token savings across different application types. Real numbers from real tests.

Verified Savings

Real Token Savings by Feature

All numbers are from production test results

  • RAG Compression: 78.6% (document-heavy applications)
  • Large JSON: 42.7% (structured data optimization)
  • Chat History: 52.3% (long conversations)
  • Small JSON: 25.9% (compact structured data)
  • Prompt Optimization: 8-20% (plain text prompts)

Note: Savings depend on input size and structure. Metrics shown are real test results. Your actual savings may vary.
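To translate token savings into dollars, multiply the tokens you no longer send by your model's input price. A minimal sketch, assuming a 10,000-token RAG payload, the 78.6% figure above, and an illustrative input price of $3 per million tokens (the payload size and price are assumptions for the example, not YAVIQ numbers):

// Rough cost estimate: tokens saved × input price per token.
// The payload size and price below are illustrative assumptions.
const originalTokens = 10_000;        // tokens in the uncompressed RAG payload
const savingsRate = 0.786;            // 78.6% savings from the list above
const pricePerMillionTokens = 3.0;    // assumed input price in USD

const tokensSaved = originalTokens * savingsRate; // 7,860 tokens
const costSavedPerRequest = (tokensSaved / 1_000_000) * pricePerMillionTokens;

console.log(`~${tokensSaved} tokens saved, ~$${costSavedPerRequest.toFixed(4)} per request`);

Multiply the per-request figure by your request volume to estimate monthly savings for your own workload.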

Use Cases

Where YAVIQ Delivers Real Value

Real-world scenarios with measurable savings

RAG-heavy SaaS

Smaller document payloads, faster responses, predictable costs

RAG compression reduces token usage while preserving semantic meaning. Perfect for document-heavy applications.

Token Savings

up to 78.6%

Key Features

  • Automatic chunking and keyword extraction
  • Parallel processing for large documents
  • Semantic compression without quality loss
  • Real-time metrics and savings tracking

Code Example

// Optimize RAG documents
const result = await client.optimizeRAG(documents, {
  mode: "balanced",
  rag_chunk_limit: 10
});

console.log(`Saved ${result.savings}% tokens`);
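The code examples on this page assume an already-initialized YAVIQ client. A minimal setup sketch; the package name, import, and constructor options shown here are assumptions, so check the SDK documentation for the exact names:

// Hypothetical setup sketch: package name and constructor options are assumptions.
import { YaviqClient } from "yaviq"; // assumed Node.js SDK package name

const client = new YaviqClient({
  apiKey: process.env.YAVIQ_API_KEY, // API key from your dashboard
});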

AI Agents

Smaller memory footprint, controlled context growth

Compress chat history and agent memory without losing critical context. Perfect for long-running agent workflows.

Token Savings

up to 52.3%

Key Features

  • Chat history compression with intent preservation
  • Agent message normalization
  • Context window management
  • Instruction-aware pruning

Code Example

// Compress chat history
const result = await client.optimizeChatHistory(messages, {
  mode: "balanced"
});

console.log(`Compressed ${messages.length} messages`);
console.log(`Savings: ${result.savings}%`);
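One way to apply this for context window management is to re-compress the history whenever it grows past a threshold. A rough sketch building only on optimizeChatHistory as shown above; the threshold, wrapper function, and the returned messages field are illustrative assumptions, not documented SDK behavior:

// Illustrative wrapper: compress history once it exceeds a message threshold.
// Only client.optimizeChatHistory (shown above) is assumed to exist.
const MAX_MESSAGES = 40; // arbitrary threshold for this sketch

async function keepHistoryCompact(messages) {
  if (messages.length <= MAX_MESSAGES) {
    return messages; // small enough, send as-is
  }
  const result = await client.optimizeChatHistory(messages, {
    mode: "balanced",
  });
  return result.messages; // assumed field name for the compressed message list
}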

Internal AI Tools

Predictable LLM bills, metrics & dashboards

Optimize structured data (JSON, YAML, CSV) for internal tools and APIs. Real savings, real metrics.

Token Savings

up to 42.7%

Key Features

  • Structured data compression (JSON/YAML/CSV)
  • Auto-detection of input format
  • TOON conversion (internal only)
  • Comprehensive metrics and dashboards

Code Example

// Optimize structured data
const result = await client.optimizeStructured(jsonData, {
  format: "json",
  mode: "balanced"
});

console.log(`Original: ${result.original_tokens} tokens`);
console.log(`Optimized: ${result.optimized_tokens} tokens`);
console.log(`Savings: ${result.reduction_percent}%`);

Multi-Agent Systems

Normalize agent-to-agent communication, reduce token bloat

Optimize agent context, memory, and inter-agent messages. Control token growth in complex multi-agent workflows.

Token Savings

up to 50%

Key Features

  • Agent context optimization
  • Agent message compression
  • Token budgeting across agents
  • Memory compression

Code Example

// Optimize agent context
const result = await client.optimizeAgentContext({
  messages: agentMessages,
  maxMessages: 12
});

console.log(`Context compressed: ${result.reductionPercent}%`);
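Token budgeting across agents can be approximated by splitting a shared message budget among agents and passing each agent's share as maxMessages. A rough sketch; the allocation logic and agent shape are illustrative and only client.optimizeAgentContext as shown above is assumed:

// Illustrative budgeting: split a shared message budget evenly across agents,
// then compress each agent's context down to its share via maxMessages.
const TOTAL_MESSAGE_BUDGET = 36; // arbitrary budget for this sketch

async function compressAllAgents(agents) {
  const perAgentBudget = Math.max(1, Math.floor(TOTAL_MESSAGE_BUDGET / agents.length));

  return Promise.all(
    agents.map((agent) =>
      client.optimizeAgentContext({
        messages: agent.messages, // assumes each agent exposes its message list
        maxMessages: perAgentBudget,
      })
    )
  );
}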

Why YAVIQ

Why Choose YAVIQ Over DIY?

DIY TOON Library

  • Only handles JSON → TOON conversion (5-18% savings)
  • No RAG compression
  • No chat history compression
  • No metrics or dashboards
  • No LLM integration

YAVIQ Platform

  • RAG compression: up to 78.6% savings
  • Chat history: up to 52.3% savings
  • Structured data: up to 42.7% savings
  • Full observability: metrics, dashboards, audit logs
  • SDKs & integration: Node.js, Python, CLI, REST API

Ready to see real savings?

Get your API key and start optimizing LLM costs in minutes. Real metrics, real savings, production-ready.