Your Test Output Is Burning Tokens: Taming Verbose Reporters for AI Agents

5 min read

Test runners like Jest and Vitest ship with reporters designed for humans watching terminals. Every file gets a line:

PASS src/components/Button.test.tsx
PASS src/components/Card.test.tsx
PASS src/components/Dialog.test.tsx
PASS src/components/Dropdown.test.tsx
PASS src/utils/format.test.ts
PASS src/utils/date.test.ts
... (200 more lines)
FAIL src/components/Nav.test.tsx
● Nav > renders active state
Expected: "active"
Received: "inactive"

For a developer at their terminal, scrolling green text reassures. But tests now run in two other contexts where that output costs more than it helps:

  • CI — nobody reads the log unless something fails. A red build forces you to scroll past hundreds of "PASS" lines to find the failure.
  • AI agents read every line of that output. Each "PASS" line consumes tokens, filling the context window with successful test output instead of the actual problem.

We hit this at Buffer. A 215-suite test run produced ~3,500 tokens of output, almost all "PASS" lines. Our AI agent spent more tokens reading test results than writing code. We tried adding --reporter=dot to our CLAUDE.md instructions, but the agent didn't always use it. The flag was a suggestion; we needed a guarantee.

Detect the Environment, Choose the Reporter

The fix: detect the environment in your test config and switch reporters automatically. No agent instructions required, no flags to remember.

Claude Code sets CLAUDECODE=1 in every shell it spawns. CI providers — GitHub Actions, GitLab CI, CircleCI, Travis CI, and Jenkins — all set CI=true. Your config reads these variables and picks the right reporter — deterministic regardless of how the agent invokes the test command.

Here's what we shipped at Buffer. CI and Claude Code each get their own reporter configuration; local development keeps the default.

For Jest, add the logic to jest.config.ts. The summary reporter prints a final count plus full details for any failures, with no per-file output. Its summaryThreshold option defaults to 20, meaning failure details only appear when a run spans more than 20 test suites; set it to 0 so every failure prints in full, even on small runs. In CI, you can pair it with a custom reporter that collects failures for GitHub PR comments:

const isCI = process.env.CI === "true";
const isClaude = process.env.CLAUDECODE === "1";

function getReporters() {
  if (isCI) {
    return [["summary", { summaryThreshold: 0 }], "jest-ci-reporter"];
  }
  if (isClaude) {
    return [["summary", { summaryThreshold: 0 }]];
  }
  return ["default"];
}

export default {
  reporters: getReporters(),
  // ... rest of your config
};
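
The "jest-ci-reporter" entry above points at whatever custom reporter module you use. As a simplified sketch rather than our full implementation (the file name, output path, and markdown format here are assumptions), such a reporter can stay quiet during the run and collect failures into a file that a later CI step posts as a PR comment:

import { writeFileSync } from "node:fs";
import type { AggregatedResult } from "@jest/test-result";

// The config above loads this module via its "jest-ci-reporter" entry
// (a package name or path you provide).
export default class CiReporter {
  // Jest calls onRunComplete once, after every suite has finished.
  onRunComplete(_contexts: unknown, results: AggregatedResult): void {
    const failures = results.testResults.flatMap((suite) =>
      suite.testResults
        .filter((test) => test.status === "failed")
        .map((test) => `- ${test.fullName} (${suite.testFilePath})`),
    );
    if (failures.length === 0) {
      return;
    }
    // A later CI step reads this file and posts its contents as a PR comment.
    writeFileSync("test-failures.md", ["### Failing tests", "", ...failures].join("\n"));
  }
}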

For Vitest, add it to vitest.config.ts. The dot reporter compresses each test to a single character, so passing tests collapse into a row of dots while failures still print their details in full:

import { defineConfig } from "vitest/config";

const isCI = process.env.CI === "true";
const isClaude = process.env.CLAUDECODE === "1";

function getReporters() {
  if (isCI) {
    return ["dot", "ci-reporter"];
  }
  if (isClaude) {
    return ["dot"];
  }
  return ["default"];
}

export default defineConfig({
  test: {
    reporters: getReporters(),
    // ... rest of your config
  },
});

Both frameworks also accept reporter flags on the command line (--reporters for Jest, --reporter for Vitest). But relying on an AI agent to pass the right flag is probabilistic — the agent may forget or run a different test script that omits it. Environment variables make it deterministic.

The resulting matrix:

Environment      Jest Reporter              Vitest Reporter
Local dev        default                    default
CI               summary + CI reporter      dot + CI reporter
Claude Code      summary                    dot

What the Agent Sees

Before (default reporter, ~250 lines):

PASS src/components/Button.test.tsx (3 suites, 12 tests)
PASS src/components/Card.test.tsx (2 suites, 8 tests)
... (200+ more PASS lines)
FAIL src/components/Nav.test.tsx
● Nav > renders active state
expect(received).toBe(expected)
Expected: "active"
Received: "inactive"
Test Suites: 1 failed, 214 passed, 215 total
Tests: 1 failed, 847 passed, 848 total

After (summary reporter, ~10 lines):

FAIL src/components/Nav.test.tsx
● Nav > renders active state
expect(received).toBe(expected)
Expected: "active"
Received: "inactive"
Test Suites: 1 failed, 214 passed, 215 total
Tests: 1 failed, 847 passed, 848 total

Same failure details, 96% less output.

When all tests pass, the gap widens further. The default reporter prints every file name — 215 lines. The summary reporter prints two:

Test Suites: 215 passed, 215 total
Tests: 848 passed, 848 total

Trade-offs

You lose progress feedback. The summary reporter stays silent until the suite finishes. For long-running suites, the agent sees nothing until completion. In practice this has not mattered — AI agents do not need reassurance that the process is running.

Debugging intermittent failures gets harder. The default reporter's per-file timing helps identify slow or flaky tests. Switch back to the default or verbose reporter locally when investigating flakiness.

The Same Fix Applies to Linters and Build Logs

Test reporters are one interface between your tools and whatever reads the output. Linters and type checkers have the same problem. Anywhere a tool produces verbose output that an AI agent consumes, you can detect the environment and switch to a compact format. Check your test config — if process.env.CI is set and you're still using the default reporter, you're paying for output nobody reads.
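
The same switch works for a type check. As a sketch (assuming a TypeScript project with tsc and npx available; the script itself is hypothetical), a small wrapper can drop tsc's colored, code-frame output whenever CI or an agent is running it:

import { spawnSync } from "node:child_process";

const compact = process.env.CI === "true" || process.env.CLAUDECODE === "1";
// --pretty false turns off colors and code frames, leaving one plain
// diagnostic line per error instead of several lines of source context.
const args = ["tsc", "--noEmit", ...(compact ? ["--pretty", "false"] : [])];
const result = spawnSync("npx", args, { stdio: "inherit" });
process.exit(result.status ?? 1);

However the check is invoked, the environment, not the caller, decides how verbose the output is.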