Pearl × Together AI: Gemma-4-31B-it-Pearl ships the first 2-for-1 inference endpoint

2026-05-20 · by Lord Of Pearls

On May 15, 2026, Together AI — one of the top three serverless AI inference providers in the world — launched Gemma-4-31B-it-Pearl, an instruction-tuned checkpoint of Google's Gemma 4 31B that runs on the PEARL Proof-of-Useful-Work protocol. Five days later, Pearl Research announced the partnership publicly. This is, by some distance, the most consequential thing that has happened to PoUW since the genesis block.

If you've been watching PEARL from the outside, the headline is this: a real, paying, production hyperscaler is now selling AI inference that simultaneously mines PEARL. The thesis went from whitepaper to revenue line on a Together AI invoice in 18 months. Below is what shipped, why it's bigger than it looks, and what it means for the people in this market — miners, holders, and AI builders.

What actually shipped

Three things, in one bundle:

A new model: Gemma-4-31B-it-Pearl, an instruction-tuned checkpoint of Gemma 4 31B prepared by Pearl Research Labs. The "it" stands for instruction-tuned — meaning it's been fine-tuned to follow chat-style prompts, the way ChatGPT or Claude does, rather than just predicting the next token raw.
A production endpoint: the model is live as a serverless inference endpoint on Together AI's platform. Any developer with a Together AI account can hit it like any other LLM — same API shape, same dashboard, same billing.
A novel economic claim: Pearl calls this "2-for-1 inference." Every token the model produces both (a) does useful work for the paying customer and (b) generates PEARL coins as a byproduct, which can later be redeemed for discounted future compute. The customer pays once and receives two assets.

Pearl's own framing is worth quoting in full:

"Inference is becoming the largest compute market and energy consumer in AI. Pearl turns inference CapEx of hyperscalers into a profit center: every LLM token produced by GPUs can simultaneously generate ¶PRL in parallel. That means users get 2-for-1 economics: useful inference today, and Pearl coins that discount future compute."

— @prlnet, May 20, 2026

Why Together AI specifically is a big deal

If you're not deep in the AI infrastructure world, "Together AI" might land as just another vendor. It isn't.

Together AI is a Series B-funded inference-and-fine-tuning platform that is, alongside Fireworks and Anyscale, one of the three companies the open-source AI ecosystem actually runs on. They host every major open-weights model — Llama, DeepSeek, Mixtral, Qwen, Gemma, every variant of every variant. Major AI products you've used route through their inference endpoints under the hood. They serve billions of tokens a day.

When Together AI hosts a model, it shows up next to OpenAI, Anthropic, and Google in the dropdown menus of every AI tool in the open-source ecosystem. It gets discovered by developers building serious products, not just crypto enthusiasts.

For PEARL, this matters in three concrete ways:

Distribution: Pearl's PoUW model is now visible to every Together AI customer — that's tens of thousands of developers and AI startups who never typed "PoUW" into a search bar.
Credibility: Together AI does not host vapor. Their reputation depends on the models they list working as advertised. The fact that they integrated PEARL's protocol as the substrate for an inference endpoint means their engineering team kicked the tires and the tires are not flat.
Validation: The single biggest open question for PoUW since the chain launched has been "will any real AI buyer ever route real traffic through this?" The answer, as of this week, is yes — at hyperscaler scale.

What "2-for-1 inference" actually means economically

The phrase "2-for-1" is doing a lot of work. Let me unpack it from the buyer's perspective, because the buyer's perspective is the only one that determines whether this is a real product or a marketing slogan.

When a developer pays Together AI for one million tokens from a normal LLM (say, Llama-3.1-70B), the developer gets one thing: one million tokens. The economics are simple — pay $X, receive Y tokens.

When a developer pays Together AI for one million tokens from Gemma-4-31B-it-Pearl, the developer gets two things:

One million useful tokens — same as any other model. The output is real, usable LLM completion that the developer can plug into their product.
A claim on future discounted compute — in the form of PEARL coins generated as a byproduct of the inference. Those coins, per Pearl's design, can be spent later to obtain more inference at a discount.

If you're an AI startup buying inference today and you expect to keep buying inference forever, this is a structurally better deal than buying inference from any other vendor: you pay once, you get the tokens you need now, and you accumulate a hedge against the price of future tokens. The compute you'll need in six months is partially pre-paid by the compute you needed today.

That's the bull case. It's a real bull case. The economics resemble buying electricity and getting a meter that runs backwards in your future favor.

The honest qualifier: the value of the PEARL coins depends on PEARL's network being there in six months at a price that makes the implied discount real. That is not zero risk. But for a startup whose compute spend is already six or seven figures, even a partial hedge is meaningful.

Why this is the PoUW thesis moment

I've written before that PoUW's bear case is "what if no one buys the inference?" The chain technically still functions as PoW even if no buyer ever materializes — but the "useful" claim becomes academic, and PEARL just operates as a niche PoW chain with idiosyncratic hardware requirements.

Up until this week, that bear case was alive. Pearl had a working chain, a working mining ecosystem, and a working PoUW protocol — but no production AI buyer. The gateway architecture supported a marketplace; the marketplace itself didn't exist.

The Together AI partnership kills the bear case. The marketplace now exists. It exists at the company that already sells more LLM inference than most cloud-native AI startups combined. Every token they sell from this model is a token that funded PoUW mining as a paying customer transaction, not as a speculative reward.

This is the difference between "PoUW is possible in theory" and "PoUW is shipping in production." There is no longer a theoretical part.

What this means if you mine PEARL

Three things, ranked by what you should actually do:

1. The economic floor under PEARL just got harder

Before this week, PEARL's price was almost entirely speculative — backed by belief in the PoUW thesis and the eventual emergence of a marketplace. As of this week, there is a paying revenue line being routed through the protocol. That doesn't mean the price won't fluctuate; it means the floor is no longer hypothetical. Real economic activity is flowing through PoUW for the first time.

If you have been considering whether to keep mining through PEARL's volatility, this is the strongest "the thesis is intact" data point we've gotten.

2. Hashrate competition will increase

News like this attracts capital. Expect new pool operators, new GPU rental customers, and new dedicated mining facilities to spin up over the coming weeks. PEARL's pool landscape is currently small enough that the marginal new pool meaningfully changes the distribution; that won't last. If you've been on the fence about scaling up your rig count, the window where current hashrate levels persist is narrower than it was last week.

3. Hardware repurposability matters more than ever

The Together AI partnership is on Gemma 4 31B specifically — a 31-billion-parameter model that fits comfortably in an H100 or H200's VRAM. If you're mining on smaller hardware, your relative competitiveness is unchanged for the current network. If you're choosing between an H100 and an H200, the H200's extra VRAM headroom matters more in a world where future Pearl-flavored checkpoints may target larger models. Plan for that headroom even if you don't need it today.

What this means if you hold PEARL

The thesis-validation argument cuts both ways: it raises the credible upside and it sharpens the downside.

The credible upside is now anchored to "AI inference is a real growing market and PEARL has a foothold in it." That's a market measured in tens of billions of dollars annually and growing 50%+ year over year. PEARL doesn't need to capture even 1% of that market for the protocol's emission to be backed by real economic activity — and the Together AI partnership is the first measurable step toward that capture.

The sharpened downside is around execution. Now that there is a real product, there are real product risks: the model has to be competitive in benchmarks, the latency has to be acceptable, the price has to be lower than the all-in cost of identical inference from non-PoUW vendors. If Together AI's customers try Gemma-4-31B-it-Pearl and the experience is materially worse than its peers, the partnership becomes a footnote rather than a flywheel. The next ninety days of usage data will matter more than the announcement.

What this means if you build with AI

If you're a developer or AI startup founder, you now have a new option in your inference dropdown. Worth asking yourself:

Does your workload work on a 31B-parameter instruction-tuned model? Many do — Gemma 4's quality is competitive with the larger open-weights models for most chat, summarization, and code-assist workloads.
Do you expect to keep buying inference long-term? The "discount on future compute" component of the 2-for-1 only matters if you're a repeat buyer. One-off projects get less benefit.
Are you comfortable accumulating a crypto-denominated asset as a side effect of normal API usage? Some companies will be; some won't. The PEARL coins are real on-chain assets — they show up in a wallet, they can be sold, they have a price. That's a feature for some teams and a compliance question for others.

The honest summary: if you'd already be buying Gemma-class inference from someone, you should price-shop Gemma-4-31B-it-Pearl against the alternatives. If it's competitive on raw token price, the PEARL accumulation is upside. If it's not, no harm done — the standard Gemma checkpoint is one dropdown click away.

What hasn't been answered yet

Three real open questions remain. Honest follow-up requires saying so out loud:

What is the price? Pearl's announcement says "discounted price" but doesn't quote a number. Until the per-token price is public, the size of the 2-for-1 discount versus competing Gemma endpoints is unknown.
How does redemption work? The framing "Pearl coins that discount future compute" implies a redemption mechanism — buyers spend their accumulated PEARL to obtain inference at a lower rate. The mechanics of that mechanism haven't been documented publicly yet.
Is this exclusive? Will Gemma-4-31B-it-Pearl only be available via Together AI, or are additional inference platforms in the pipeline? The "first 2-for-1 inference endpoint" phrasing implies more are coming, but no timeline has been given.

I expect answers within weeks rather than months. The economics of a partnership like this don't work if the basic commercial terms stay opaque.

The bigger picture

Here is the part that I think gets underappreciated in the immediate news coverage of an announcement like this.

The AI infrastructure industry is on track to spend something like $400 billion on data centers in 2026 alone. The single largest cost line for almost every AI company on Earth — startup, hyperscaler, lab — is GPU compute. That cost is the thing that determines whether the unit economics of an AI product close.

For sixteen years, crypto's pitch for a piece of that economic activity has been "blockchain can mediate it" — a pitch that has, candidly, mostly failed. The crypto industry consumed a lot of compute and produced almost no AI value.

PoUW, executed correctly, inverts the pitch. Crypto stops asking for a slice of AI's spending and starts being part of AI's production. Every dollar of AI inference is also a dollar that produces blockchain security as a byproduct. The two industries become co-located on the same hardware doing the same operations.

The Together AI partnership is the first time that inversion has been demonstrated in production at scale. It may turn out to be a footnote. It may turn out to be the start of an industry restructuring. We won't know which for a year or more.

But the experiment is no longer theoretical. The data is being generated, on real GPUs, for real customers, paying real money. Anyone curious about whether PoUW is real or vaporware now has somewhere to look that isn't a whitepaper.

That's worth paying attention to.

FAQ

Can I use Gemma-4-31B-it-Pearl right now?

Yes — it's live on Together AI as a serverless inference endpoint. Any developer with a Together AI account can call it via their standard API. You don't need to know anything about PEARL or PoUW to use the model; the protocol activity happens server-side.

Do I get PEARL coins automatically when I use the endpoint?

That depends on the exact terms of the partnership, which haven't been fully published yet. Pearl's framing is that "users get 2-for-1 economics," which implies the end buyer of inference receives PEARL coins. Watch Pearl Research's channels and Together AI's pricing page for the precise mechanism.

Does this make PEARL a better investment?

It strengthens the fundamental case — real economic activity is now routing through the protocol. Whether the market has priced that in already is a separate question, and not one I'm qualified to answer. Read the broader investment analysis for the full risk picture.

Is Together AI the only inference platform hosting PoUW models?

As of May 2026, yes — they're the first. Pearl's wording ("the first 2-for-1 inference endpoint") implies others will follow. Worth watching Fireworks and Anyscale specifically; if PoUW becomes a category, those two will not stay on the sidelines.

Does this change anything for solo miners?

Not the mechanics of mining itself — block rewards are still earned the same way. What changes is the macro environment: with a hyperscaler now routing customer-paid inference through the protocol, the long-term durability of PEARL's economy is strengthened. Solo miners benefit indirectly from that durability. The mining guide remains the practical starting point.

Where can I follow the story?

@prlnet on X for Pearl Research updates. @togethercompute for Together AI's product announcements. This site (lordofpearls.xyz) for independent analysis and live network data.