AI Research Dispatch

A dispatch tracking AI model releases, lab updates, benchmarks, safety work, and academic papers, with context and links to primary sources.

ai-research EN

Evaluate AI agents systematically with Agent-EvalKit

Teams building AI agents typically evaluate them the way they evaluate any other software: by checking whether the output matches expectations. But agents that autonomously choose tools and sequence …

ai-research EN

How frontier teams are reinventing AI-native development

Frontier teams are not just using AI to code faster. They’re redesigning how software gets built. The result: productivity gains of 4.5x, and in some cases more than 10x. Six engineers. Seventy-six days. A …

ai-research EN

For Robotaxis, Safety Must Be Built In, Not Bolted On

A car pulls up to the curb. The app says, “Your ride is here.” No one’s in the driver’s seat. For people who live in one of the dozens of cities now hosting robotaxi services, this is already a …

ai-research EN

DiffusionGemma: 4x faster text generation

Why diffusion for text? While the AI research community has explored diffusion-based text generation for years, applying it to large models has remained a challenge. DiffusionGemma changes this by …

ai-research EN

Powering the future of robotics in Europe

AI has the potential to help solve some of the world’s biggest challenges — not just in the digital realm, but in the physical world, too. Robotics is one of the most exciting frontiers of AI, where …

ai-research EN

The consequences of relying on AI for accurate news

It’s no secret that the last few years have seen a massive explosion in the use of artificial intelligence for general information-gathering. An even more recent trend, though, is how large language …

ai-research EN

The crucial human component in computing and AI

On April 30, the MIT Schwarzman College of Computing’s Social and Ethical Responsibilities of Computing (SERC) initiative hosted a full-day research symposium examining how artificial intelligence is …