Designed to improve robots’ reasoning, the Rho-alpha vision-language-action model marks Microsoft’s offering in the growing field of physical AI.
Agentic Vision combines visual reasoning with code execution to ground answers in visual evidence, delivering a 5% to 10% ...
On the Humanity’s Last Exam (HLE) benchmark, Kimi K2.5 scored 50.2% (with tools), surpassing OpenAI’s GPT-5.2 (xhigh) and ...
Physical AI marks a transition from robots as programmed tools to robots as adaptable collaborators. That transition will ...
Google has introduced Agentic Vision for Gemini 3 Flash, a new capability that improves how the model understands and ...
Chinese AI startup DeepSeek on Tuesday released a research paper and open-sourced its latest optical character recognition ...
Agentic Vision is a new capability for Gemini 3 Flash that makes image-related tasks more accurate by “grounding answers in visual evidence.” ...
Chinese large-language model (LLM) start-ups including DeepSeek and Moonshot AI have rapidly open-sourced their latest models ...
The agent acquires a vocabulary of neuro-symbolic concepts for objects, relations, and actions, represented through a ...
Microsoft has announced Rho-alpha, a new robotics AI model derived from its Phi vision-language series, aimed at helping ...
Robbyant, an embodied AI company within Ant Group, today announced the open-source release of LingBot-World, a world model that achieves industry-leading performance in video quality, dynamic fidelity ...
Poetiq was launched last year by former Google DeepMind researchers Shumeet Baluja and Ian Fischer. Baluja established the Alphabet unit’s computer vision group, while Fischer helped develop some of ...