Recently, I read several articles about the new updates in Claude 3.7 Sonnet. They covered improvements in output quality, reasoning, and more.
However, one truly groundbreaking update was overlooked: the increased OUTPUT token length.
Here is the official announcement:
128K output tokens (beta): Claude 3.7 Sonnet supports up to 128K output tokens—over 15x longer than before. This is particularly valuable for rich code generation and detailed content creation. Support for 128K output tokens is generally available with batch processing and in beta with the Messages API.
This is huge.
In Simple Terms
When you interact with any large language model, you may notice that the longer the expected output, the "lazier" the model tends to get. For example, if you ask a model to summarize a PDF book (yes, nowadays they can read entire books) into at least, say, 10,000 words, it will provide a summary, but the output usually won't come anywhere near that length. The simple reason is that these models aren't trained to generate such long outputs. Some models can do this after fine-tuning, but foundation models like GPT or Gemini always hit a wall at maybe 2,000 words or so.
Claude has now changed this game.
Despite the impressive applications of this update, always keep in mind that output tokens (for commercial models) cost significantly more than input tokens (the prompt). For Claude Sonnet, we're talking about a factor of 5: roughly $3 per million input tokens versus $15 per million output tokens if you use the API.
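To make that factor of 5 concrete, here is a quick back-of-the-envelope calculation. The prices are the published per-million-token rates mentioned above; treat them as assumptions that may change over time.

```python
# Rough cost estimate for a long-output request at Claude Sonnet's
# API pricing (assumed: $3 per million input tokens, $15 per million
# output tokens -- check the current pricing page before relying on this).

INPUT_PRICE_PER_MTOK = 3.00    # USD per 1M input tokens (assumed)
OUTPUT_PRICE_PER_MTOK = 15.00  # USD per 1M output tokens (assumed)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single API request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MTOK

# Example: a short prompt that generates the full 128K output tokens.
print(round(request_cost(input_tokens=500, output_tokens=128_000), 4))  # → 1.9215
```

In other words, a single maxed-out 128K-token response costs on the order of two dollars, almost all of it on the output side.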
Getting to the Fun Part: A Practical Application
One exciting application that might resonate with many of us is creating an entire book using AI—with just a single prompt.
No special tools, no subscriptions, just pure chatting with Claude AI.
I simply asked Claude to write a 70,000-word book, without any fancy prompting.
The results were stunning. It produced far more output in one go than ever before, reaching a token count equivalent to more than 10,000 words.
If you are curious about the contents of this book, you can check it out here for free.
However, the book wasn't finished when the output stopped. I asked Claude to continue, and it picked up where the previous response had left off.
After a while, I stopped the process myself to avoid overusing my subscription and hitting a limit—I need Claude for other projects too!
I haven’t tried it with the API yet, but I expect that the API limits will be much closer to the promised number of tokens announced by Anthropic. I’ll keep you updated as soon as I have more news.
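For anyone who wants to try the API route, here is a minimal sketch of what a long-output request looks like. The model id and the beta flag name (`output-128k-2025-02-19`) are assumptions based on Anthropic's announcement, so verify them against the current documentation; no request is actually sent here, we only build the payload to show where the parameters go.

```python
# Sketch of a Messages API request asking for 128K output tokens.
# Nothing is sent over the network; this only assembles the payload
# and headers you would POST to https://api.anthropic.com/v1/messages.

payload = {
    "model": "claude-3-7-sonnet-20250219",  # assumed model id
    "max_tokens": 128_000,                  # the new, much larger output cap
    "messages": [
        {"role": "user", "content": "Write a 70,000-word book about ..."}
    ],
}

headers = {
    "x-api-key": "YOUR_API_KEY",            # placeholder, not a real key
    "anthropic-version": "2023-06-01",
    # Beta flag said to unlock 128K output tokens (assumed name):
    "anthropic-beta": "output-128k-2025-02-19",
}
```

The key points are simply a `max_tokens` value far beyond the old ceiling and the extra beta header; everything else is a standard Messages API call.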
Now forget about those fancy ideas of selling your own ebook. Imagine having AI write a whole book about something you really like, something you could listen to on a road trip or read on the beach. Wouldn't that be nice?
The Implications
There are many applications now made easier by this update—in content generation, coding, and even creating synthetic data.
One fundamental change I must mention, however, is the threat this poses to services built around the old output-token limits, which worked by stitching multiple shorter outputs together into one large piece of content. As with previous feature releases, many of these companies may unfortunately have to pivot soon.