I highly suggest trying GLM-5.2: it's pretty much on par with Opus, and some argue it's even better. Atlas Knowledge's stock has 15x'd YTD. The question is how far that rubber band can stretch... If frontier intelligence can increasingly be developed in the East at a fraction of the cost incurred in the West (GLM-5.2 was trained entirely on 100,000 Ascend 910B processors with zero Nvidia silicon), then the largest capital allocators are also the ones most exposed to over-investment risk.
The breaking point was always likely to come when one of the major spenders concludes that shareholder returns are better served by spending slightly less. The problem is that “slightly less” is not embedded in anyone’s assumptions. The entire AI complex is priced for ever-rising capex as inference demand grows.
This setup is becoming increasingly precarious. The louder this message gets, the more the momo bottleneck bros inch toward the exit, which creates negative momentum in its own right. Now, we might be inclined to say who cares about GLM because it's distilled from Claude & GPT, but distillation is, was, and always will be only one part of the equation. It's at best a bootstrapping method; the training, harness and weights built on top of that distillation are still differentiated and great. If distillation were all it took to make a model on par with the frontier labs, I don't think GLM would be the only option. But thus far, it's the only one that has come close.
Then there's the seemingly endless debate on whether open source (OS) is good for hyperscalers. First, the premise is incontrovertible: Jevons on volume, i.e. free-ish weights collapse the price of intelligence + current elasticity = massively expanded token consumption. And in parallel, margins migrate somewhat away from frontier labs as "good-enough" OS eats the middle of the task distribution (albeit pricing will be Pareto-distributed anyway, so I wouldn't overstate this margin point). The labs currently act like demand aggregators, and this proliferates outward. OK, no debate thus far.
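To put rough numbers on the Jevons point, here's a minimal sketch assuming constant-elasticity demand. That functional form is my simplification, and the function names, the 90% price cut and the elasticity values are all made up for illustration, not estimates of the actual token market. The only thing it shows is the mechanism: total spend on tokens grows when price falls only if elasticity is above 1.

```python
def token_demand(price, base_price, base_volume, elasticity):
    """Constant-elasticity demand (an assumed functional form):
    volume scales as (price / base_price) ** -elasticity."""
    return base_volume * (price / base_price) ** (-elasticity)


def total_spend(price, base_price, base_volume, elasticity):
    """Total spend on tokens = price per token * tokens consumed."""
    return price * token_demand(price, base_price, base_volume, elasticity)


if __name__ == "__main__":
    base_price, base_volume = 1.0, 1.0   # normalized, hypothetical units
    new_price = 0.1                      # hypothetical 90% price collapse
    for elasticity in (0.5, 1.0, 1.5):   # illustrative values only
        volume = token_demand(new_price, base_price, base_volume, elasticity)
        spend = total_spend(new_price, base_price, base_volume, elasticity)
        print(f"elasticity={elasticity}: volume x{volume:.2f}, spend x{spend:.2f}")
```

Under these made-up inputs, elasticity 1.5 gives about 31.6x the tokens and about 3.2x the spend, while elasticity 0.5 gives about 3.2x the tokens but spend drops to about 0.32x. So the "massively expanded consumption" leg only translates into more compute dollars if elasticity really is above 1.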
Second, OS weights can be served by anyone: neoclouds, sovereigns, on-prem, and maybe (probably not) the edge. So OS explodes TAM for everyone, not just the hyperscalers. For me, the shape of compute demand changes from prepaid capex commitments (from two behemoths primarily) to fragmented opex amidst a price war. So, from the hyperscalers' perspective, what's changed, assuming OS demand makes up for (and then some) any lost frontier lab demand? Well, the demand risk now moves onto their balance sheet.
Earlier it was the VCs carrying that demand risk (via financing the frontier labs). Now it shifts onto the hyperscalers' balance sheets. Third, a question that's less clear to me: in an OS/OW inference-dominated world, doesn't the cheapest integrated stack (Google) win? Doesn't the margin migrate from the frontier labs to the hyperscalers with the lowest serving cost? So the OS = hyperscaler bull case is really a cost leadership bet again. Thoughts?