These speed gains are substantial. At 256K context lengths, Qwen 3.5 decodes 19 times faster than Qwen3-Max and 7.2 times ...
Co-founders behind Reface and Prisma team up to improve on-device model inference with Mirai
Mirai raised a $10 million seed to improve how AI models run on devices like smartphones and laptops.
The new lineup includes 30-billion and 105-billion parameter models; a text-to-speech model; a speech-to-text model; and a vision model to parse documents.
While the CEO did not actually name the product, something based on the Rubin architecture isn't exactly a bad guess. Rubin was first teased back in June 2024 at Computex and formally unveiled at GTC ...