Claude API cost per blog post: the real token math

Claude API cost per blog post, broken down into research, draft, review, and iteration tokens, at 2026 Anthropic pricing, to budget the BYOK model.

By Mitrasish, Co-founder · Jul 3, 2026 · 11 min read

A blog post written through a research-draft-review-iteration pipeline on Claude Sonnet 5 costs somewhere between $0.60 and $1.10 in raw API tokens at current Anthropic pricing, before prompt caching brings it down further. That number will not show up in any comparison of Jasper, Surfer, or Byword's subscription tiers, because none of those tools bill by the token. They bill by the seat or the document quota. If you bring your own Anthropic key, the bill you actually see is metered, itemized, and nothing like a flat $49-$999/mo line item. This is the math behind that bill, stage by stage.

What a blog post actually costs in Claude API tokens

A single post is not one API call. It is a pipeline of stages, each one a separate conversation with the model, each one spending input and output tokens at different rates depending on what it is doing. Understanding the cost means breaking the post apart the way the pipeline actually runs it, not treating it as one lump "write me a blog post" prompt.

The four stages that spend tokens: research, draft, review, iteration

Research. The model searches for the topic, reads competing pages, and pulls source material into context. Web search is billed at $10 per 1,000 queries on top of standard token pricing, and every fetched page adds input tokens: Anthropic's own numbers put an average 10 kB web page at about 2,500 tokens and a 100 kB documentation page at about 25,000 tokens. Six or seven fetched sources for one post add up fast, and none of it produces output text yet.

Draft. The model reads the target repo's house style, its existing posts (to match voice), and the research notes, then writes the post. This is where output tokens dominate: the 2026 industry-average blog post runs about 1,427 words, up from 1,236 in 2023, and the SEO-recommended range sits at 1,500-2,000 words, in line with Backlinko's finding that top-ranking pages average 1,447 words. That word count becomes tokens, plus markdown formatting, frontmatter, and any tool calls used to write the file.

Review. A second, independent pass reads the draft, re-fetches every external link and every cited source to confirm it says what the draft claims, and checks pricing or statistics against a live page if one is configured. This stage looks a lot like research in token shape (fetch-heavy input, light output) but starts from a bigger base, because it has to load the whole draft first. We cover the mechanics of this pass in how AI content fact-checking actually works.

Iteration. The draft goes back to the writer with specific line comments to fix. Each round reloads the draft, the repo context, and the review notes, then writes a smaller, targeted output. This is the stage with the most variance, because how many rounds a post needs is the one thing in this list that is not fixed by post length or source count.
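The research-stage figures above reduce to a rule of thumb of about 250 tokens per kB of fetched page, plus a flat fee per search. A minimal sketch of that estimate follows; the function names and the example page sizes are hypothetical, and real pages will tokenize differently depending on their markup.

```python
# Rule of thumb from the figures above: ~2,500 tokens per 10 kB of fetched page.
TOKENS_PER_KB = 250
SEARCH_FEE_USD = 10 / 1000  # $10 per 1,000 web searches


def fetched_tokens(page_sizes_kb):
    """Estimate input tokens added by fetched pages, given their sizes in kB."""
    return sum(round(size * TOKENS_PER_KB) for size in page_sizes_kb)


def search_fees(num_searches):
    """Per-query search fee, billed on top of token pricing."""
    return num_searches * SEARCH_FEE_USD


# Hypothetical research run: five ordinary pages plus one large docs page.
pages_kb = [10, 10, 10, 10, 10, 100]
print(fetched_tokens(pages_kb))   # 37500
print(f"${search_fees(6):.2f}")   # $0.06
```

This counts each page once. In a multi-turn run the same fetched text is resent on later turns, so the billed input is higher than this single-pass figure.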

Why the number on your Anthropic invoice isn't just "input plus output"

The naive way to estimate cost is (input tokens x input price) plus (output tokens x output price). That undercounts what actually lands on the invoice in three specific ways.

First, an agentic pipeline is multi-turn, not one call. Each turn in a tool-use loop resends the accumulated conversation, so a stage that runs eight or ten turns is not paying for the final input once, it is paying for a growing context on every turn until that stage ends. This is the single biggest reason a "four-stage" post looks cheap on paper and costs more in practice: it is really dozens of API calls, not four.
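To see how much the resend effect matters, here is a rough sketch comparing what a single-call estimate counts against what a multi-turn stage is billed for without caching. The starting context, tokens added per turn, and turn count are hypothetical, and the model assumes context grows by a fixed amount each turn, which real runs only approximate.

```python
def naive_input_tokens(base_context, added_per_turn, turns):
    """What a single-call estimate counts: the final context, paid once."""
    return base_context + added_per_turn * (turns - 1)


def billed_input_tokens(base_context, added_per_turn, turns):
    """Each turn resends everything accumulated so far (no caching)."""
    return sum(base_context + added_per_turn * t for t in range(turns))


# Hypothetical stage: 4,000 tokens of starting context,
# 1,500 tokens of tool results and replies added per turn, 10 turns.
naive = naive_input_tokens(4_000, 1_500, 10)
billed = billed_input_tokens(4_000, 1_500, 10)
print(naive)                      # 17500
print(billed)                     # 107500
print(round(billed / naive, 1))   # 6.1
```

Under these assumptions the stage bills roughly six times the input a one-shot estimate would predict, which is why the per-stage totals in a real run are aggregates across every turn.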

Second, tools carry their own token overhead before they do anything. A tool-use system prompt adds roughly 350-475 tokens per Sonnet 5 request depending on tool choice mode, the bash tool adds 245 input tokens per call, and the text editor tool adds 700. None of that is "the post," and all of it bills at standard input rates.
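The overhead figures above can be turned into a per-stage estimate. This is a sketch under stated assumptions: the request and call counts are hypothetical, and it applies the bash and text editor figures once per call, following the wording above rather than any confirmed billing rule.

```python
INPUT_PER_MTOK = 2.00                    # Sonnet 5 input rate quoted in this post (USD)
TOOL_SYSTEM_PROMPT_TOKENS = (350, 475)   # low/high per request, by tool choice mode
BASH_TOKENS_PER_CALL = 245
TEXT_EDITOR_TOKENS_PER_CALL = 700


def tool_overhead_tokens(requests, bash_calls, editor_calls, system_prompt_tokens):
    """Input tokens spent on tool plumbing rather than on the post itself."""
    return (requests * system_prompt_tokens
            + bash_calls * BASH_TOKENS_PER_CALL
            + editor_calls * TEXT_EDITOR_TOKENS_PER_CALL)


# Hypothetical stage: 10 requests, 3 bash calls, 2 text editor calls.
for label, prompt_tokens in zip(("low", "high"), TOOL_SYSTEM_PROMPT_TOKENS):
    tokens = tool_overhead_tokens(10, 3, 2, prompt_tokens)
    print(label, tokens, f"${tokens * INPUT_PER_MTOK / 1e6:.4f}")
# low 5635 $0.0113
# high 6885 $0.0138
```

About a cent per stage is small next to fetched pages, but it is paid on every request and never shows up in a word-count-based estimate.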

Third, the newer tokenizer used by Sonnet 5 and Opus 4.7 and later produces about 30% more tokens for the same text than the tokenizer used by Sonnet 4.6 and earlier models, according to Anthropic's pricing documentation. A lower per-token rate does not automatically mean a lower per-post cost if the same English text now counts as more tokens.
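One way to read the tokenizer change is as a multiplier on the sticker price. The sketch below converts the Sonnet 5 rates quoted in this post into a price per million old-tokenizer tokens' worth of text, treating the roughly 30% figure as exact. That is an approximation, since the real ratio varies with the content.

```python
TOKEN_INFLATION = 1.30  # ~30% more tokens for the same text (approximate)


def effective_price(new_price_per_mtok):
    """Price per million old-tokenizer tokens' worth of the same text."""
    return new_price_per_mtok * TOKEN_INFLATION


# A newer model is only cheaper for the same text if its per-token price
# is below this fraction of the older model's price.
BREAK_EVEN_RATIO = 1 / TOKEN_INFLATION

print(f"${effective_price(2.00):.2f}")    # $2.60 (input)
print(f"${effective_price(10.00):.2f}")   # $13.00 (output)
print(f"{BREAK_EVEN_RATIO:.3f}")          # 0.769
```

Put differently, the newer per-token rate has to be at least about 23% lower than the older one before the same English text actually costs less.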

A worked example at current Sonnet 5 pricing

Here is an illustrative run for one roughly 1,700-word post through Lyra's own four-stage pipeline, using Sonnet 5's current published rates: $2 per million input tokens, $10 per million output tokens, and $10 per 1,000 web searches. These are representative totals aggregated across every turn in each stage, not a single call, since that is what a real agentic run looks like.

| Stage | Approx. input tokens | Approx. output tokens | Tool fees | Stage cost |
| --- | --- | --- | --- | --- |
| Research | 60,000 | 4,000 | ~6 searches ($0.06) | ~$0.22 |
| Draft | 90,000 | 8,000 | none | ~$0.26 |
| Review | 70,000 | 3,000 | none (web fetch has no fee) | ~$0.17 |
| Iteration (1 round) | 50,000 | 5,000 | none | ~$0.15 |
| Total | 270,000 | 20,000 | $0.06 | ~$0.80 |

Each additional iteration round adds roughly another $0.15, so two rounds instead of one pushes the total to about $0.95. Three rounds, which is closer to what a first draft on a hard topic can need, comes to about $1.10. None of these numbers include prompt caching, which is the first lever worth pulling, covered below. For scale, this sits well inside what Anthropic itself reports for agentic Claude usage: across enterprise Claude Code deployments the average cost is about $13 per developer per active day and $150-250 per month, with 90% of users spending under $30 on an active day. A single blog post pipeline run is a small fraction of what one working session already costs.
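The worked example can be reproduced in a few lines. This sketch uses the token counts and rates quoted above and assumes every extra iteration round costs the same as the first, which is a simplification. The dictionary layout and function names are illustrative only.

```python
INPUT_PER_MTOK = 2.00     # USD per million input tokens
OUTPUT_PER_MTOK = 10.00   # USD per million output tokens
SEARCH_FEE = 10 / 1000    # USD per web search

# (input tokens, output tokens, web searches) per stage, from the worked example
STAGES = {
    "research": (60_000, 4_000, 6),
    "draft": (90_000, 8_000, 0),
    "review": (70_000, 3_000, 0),
    "iteration": (50_000, 5_000, 0),
}


def stage_cost(input_tokens, output_tokens, searches=0):
    """Token cost plus search fees for one stage, in USD."""
    return (input_tokens * INPUT_PER_MTOK / 1e6
            + output_tokens * OUTPUT_PER_MTOK / 1e6
            + searches * SEARCH_FEE)


def post_cost(iteration_rounds=1):
    """Total for one post; assumes each iteration round costs the same."""
    fixed = sum(stage_cost(*STAGES[s]) for s in ("research", "draft", "review"))
    return fixed + iteration_rounds * stage_cost(*STAGES["iteration"])


for rounds in (1, 2, 3):
    print(rounds, f"${post_cost(rounds):.2f}")
# 1 $0.80
# 2 $0.95
# 3 $1.10
```

Swapping in your own token counts per stage is the quickest way to sanity-check an invoice against this model.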

Why bring-your-own-key pricing behaves differently than a flat SaaS subscription

BYOK scales with usage on your own Anthropic invoice, not a $49-$999/mo tier

A subscription content tool sells you a tier: a fixed number of documents or seats per month for a fixed price, regardless of how many tokens the underlying model actually burns to produce them. A BYOK tool does the opposite. You connect your own Anthropic key, the tool never marks up the usage, and Anthropic bills you directly at the published per-token rate, itemized in your own console. Publish four posts in a month and you pay for four posts' worth of tokens. Publish none and you pay nothing. There is no quota to burn through or waste, because there is no quota, just usage. We laid out the reasoning behind building an AI blog writer for developers around this model rather than a bundled one.

What Byword, Jasper, and Surfer charge instead, and what's bundled into that price

None of the bulk or optimization tools price this way, because none of them are BYOK. Byword's plans start at $99/month, scaling up through higher tiers as article volume increases; we break the full tier table down, sourced, in our Byword vs Jasper comparison. Jasper's Pro plan runs $69/month monthly or $59/month billed annually for a single seat, with a custom-priced Business tier above it. Surfer SEO spans five tiers from Discovery at $49/month up to Enterprise at $999/month, priced by document and workspace limits rather than tokens; we cover that full range in Surfer SEO alternatives for teams whose blog lives in Git. (Figures are current as of this post's date and can change; check each vendor's pricing page before you buy.)

That flat price bundles more than model inference. It covers the SaaS product itself: the editor, the CMS integrations, keyword and SERP tooling, support, and the vendor's own margin on top of whatever the model actually costs them to run. "When that $15 drops to $2, the same $24 looks less like a product price and more like a markup," Ritesh Shrivastav wrote about AI products more broadly, describing exactly this dynamic: as underlying inference costs fall, a flat subscription price that was set when tokens were expensive increasingly looks like a wrapper tax rather than a fair reflection of what the product costs to run today.

The trade-off: a metered bill you can audit vs. a flat fee you can't itemize

Neither model is free of trade-offs. A metered BYOK bill is auditable down to the token: you can open your Anthropic console and see exactly what a given post cost, broken out by input, output, and cache. A flat subscription is predictable and simple to budget, one number, once a month, but you cannot separate what portion of that $99 or $299 is model spend versus product margin, and the price does not shrink when the underlying model gets cheaper. Predictability is worth something. So is knowing exactly what you are paying for.

The levers that change your cost per post

Prompt caching: cache reads cost a tenth of a fresh read

A blog pipeline resends a lot of the same context on every turn: the house-style guide, the repo's CLAUDE.md, the tool definitions, the growing draft. Prompt caching is built for exactly that pattern. Anthropic's pricing sets a 5-minute cache write at 1.25x the base input price and a 1-hour cache write at 2x, while a cache read (a hit) costs 0.1x base input whichever duration the write used. That means a 5-minute cache pays for itself after a single subsequent read, and a 1-hour cache pays for itself after two reads. A four-stage pipeline that touches the same house-style and repo context repeatedly across dozens of turns is close to the ideal case for caching, because most of those repeated reads move from full input price down to a tenth of it.
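The break-even claims can be checked directly from the multipliers. The sketch below works in units of one uncached read of the same prefix. The 20,000-token prefix and 30-turn run at the end are hypothetical, and the dollar figure assumes every read lands inside the cache window.

```python
READ_MULTIPLIER = 0.1  # a cache hit costs a tenth of base input


def cached_cost(write_multiplier, reads):
    """One cache write plus `reads` cache hits, in units of one uncached read."""
    return write_multiplier + reads * READ_MULTIPLIER


def uncached_cost(reads):
    """The first request and every later one pay full input price."""
    return 1 + reads


def break_even_reads(write_multiplier):
    """Smallest number of later reads at which caching is strictly cheaper."""
    reads = 0
    while cached_cost(write_multiplier, reads) >= uncached_cost(reads):
        reads += 1
    return reads


print(break_even_reads(1.25))  # 1  (5-minute cache)
print(break_even_reads(2.0))   # 2  (1-hour cache)

# Hypothetical: a 20,000-token shared prefix sent on 30 turns at $2 per million.
prefix_usd = 20_000 * 2.00 / 1e6
print(f"${uncached_cost(29) * prefix_usd:.3f}")      # $1.200
print(f"${cached_cost(1.25, 29) * prefix_usd:.3f}")  # $0.166
```

If turns are spaced further apart than the cache duration, some reads turn back into writes and the saving shrinks accordingly.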

Why the Batch API's 50% discount doesn't fit a live review-and-merge pipeline

The Batch API cuts both input and output token prices in half for asynchronous processing, which sounds like the obvious lever until you look at what "asynchronous" means here: submit a batch of requests, get results back later, no live back-and-forth. A blog pipeline that opens a pull request, waits for a human to look at it, and iterates on live review comments is the opposite of that shape. It needs a response now, not queued behind a batch job that might not clear for hours. The discount is real, and it fits bulk offline jobs well. It does not fit a pipeline whose whole point is a tight, interactive loop between the model and a human reviewer.

Iteration count is the biggest variable you actually control

Research, draft, and review scale with post length and source count, which do not move much post to post. Iteration count is different: it depends entirely on how clean the first draft is and how much the review pass finds wrong with it. One clean round of fixes barely moves the total. Three or four rounds, chasing down claims that keep failing verification or links that keep resolving to the wrong page, can double or triple the iteration stage's share of the bill on its own. The lever here is not a pricing setting, it is draft quality: a writer that gets voice, sourcing, and linking right the first time spends less on the rounds that clean up after it.
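Using the stage costs from the worked example, the iteration stage's share of the bill can be sketched as a function of round count. This assumes a constant $0.15 per round, which is a simplification: later rounds are often smaller, but they can also be larger if the draft has grown.

```python
FIXED_STAGES_USD = 0.22 + 0.26 + 0.17   # research + draft + review
ITERATION_ROUND_USD = 0.15              # assumed constant per round


def iteration_share(rounds):
    """Fraction of the per-post bill spent on iteration."""
    iteration = rounds * ITERATION_ROUND_USD
    return iteration / (FIXED_STAGES_USD + iteration)


for rounds in range(1, 5):
    print(rounds, f"{iteration_share(rounds):.0%}")
# 1 19%
# 2 32%
# 3 41%
# 4 48%
```

At four rounds, iteration alone approaches half the bill, while the other three stages have not moved at all.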

What this means for your monthly bill

Multiply the worked example above by a realistic monthly cadence and the gap to a flat subscription becomes obvious fast. Eight posts a month at roughly $0.80-$1.10 each lands in the $6-$9 range in raw Claude spend, before caching brings it down further, against Byword's $99/month floor, Jasper's $69/month per seat, or Surfer's $49/month starting tier. That is not a claim that BYOK tools always come out cheaper. It is a claim that the two bills measure completely different things: one is metered model usage you can read off an invoice, the other is a flat price for a bundled product where the model is one line item among many you cannot see. If you already pay for an Anthropic key, that distinction is the one worth doing the math on before comparing sticker prices.
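The monthly figure is simple multiplication, sketched here so the cadence can be swapped for your own. The per-post range comes from the worked example, and the function name is illustrative. It covers raw Claude token spend only, not any other cost of running a pipeline.

```python
COST_PER_POST_RANGE = (0.80, 1.10)  # USD, one to three iteration rounds


def monthly_token_spend(posts_per_month, cost_per_post):
    """Raw Claude API spend for a month of posts, before caching."""
    return posts_per_month * cost_per_post


low, high = (monthly_token_spend(8, cost) for cost in COST_PER_POST_RANGE)
print(f"${low:.2f}-${high:.2f}")  # $6.40-$8.80
```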

Lyra runs on your own Anthropic key, encrypted at rest and never marked up, so the token math above is the actual bill, not a bundled subscription tier.

FAQ

How much does it cost to write a blog post with the Claude API?

For a roughly 1,700-word post run through a four-stage pipeline (research, draft, review, and one round of iteration) on Claude Sonnet 5 at current pricing, the raw token cost lands around $0.80 in our worked example, and typically in the $0.60-$1.10 range before prompt caching. It scales with post length, how many sources get fetched, and how many review-and-fix rounds the draft needs, so a post that needs three iteration rounds instead of one can roughly double the iteration stage's share of the bill.

Is Claude Sonnet 5 cheaper than Claude Opus for writing blog posts?

Yes. Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens through August 31, 2026, versus $5 input and $25 output for Opus 4.8. For a drafting and review workload, Sonnet 5 costs two-fifths (40%) of Opus per token on both sides of the ledger, which is why it is the default model for a blog-writing pipeline rather than a coding-heavy agent workload.

Does prompt caching actually lower the cost of an AI writing pipeline?

Meaningfully, if the pipeline reuses the same system prompt, house-style guide, and repo context across turns. A cache read costs a tenth of the base input price, so a 5-minute cache write (1.25x base) pays for itself after a single read, and a 1-hour cache write (2x base) pays for itself after two reads. A blog pipeline that resends the same CLAUDE.md and voice guide on every turn of every stage is exactly the repeated-context pattern caching is built for.

Why is bring-your-own-key pricing different from a flat SaaS subscription like Jasper or Surfer?

A BYOK tool bills you at Anthropic's per-token rate with no markup, so your cost moves with your usage and shows up itemized on your Anthropic invoice. A subscription tool charges a flat monthly tier, such as Surfer's $49-$999/mo or Byword's $99-$999/mo, that bundles the model cost with the SaaS product (CMS integrations, keyword tooling, support) into one number you cannot separate. BYOK is metered and auditable; a subscription is fixed and opaque about what share of it is model spend.
