AI, ML & Data Engineering Content on InfoQ
-
QCon London 2026: Morgan Stanley Rethinks Its API Program for the MCP Era
Morgan Stanley engineers Jim Gough and Andreea Niculcea showed how they're retooling the bank's API program for AI agents using MCP and FINOS CALM. Live demos covered compliance guardrails, deployment gates, and zero-downtime rollouts across 100+ APIs. The time to a first API deployment shrank from two years to two weeks. They also demoed Google's A2A protocol running alongside MCP.
-
QCon London 2026: Refreshing Stale Code Intelligence
At QCon London 2026, Jeff Smith discussed the growing mismatch between AI coding models and real-world software development. While AI tools are enabling developers to generate code faster than ever, Smith argued that the models themselves are increasingly “stale” because they lack the repository-specific knowledge required to produce production-ready contributions.
-
AI Model Discovers 22 Firefox Vulnerabilities in Two Weeks
Claude Opus 4.6 discovered 22 Firefox vulnerabilities in two weeks, including 14 high-severity bugs, a number equal to nearly 20% of all critical Firefox vulnerabilities fixed in 2025. The AI also wrote working exploits for two bugs, demonstrating emerging capabilities that give defenders a temporary advantage but signal an accelerating arms race in cybersecurity.
-
Where Do Humans Fit in AI-Assisted Software Development?
An article on Martin Fowler’s blog by Kief Morris examines the role of humans in AI-assisted software engineering, arguing developers are unlikely to move fully “out of the loop.” Instead, teams may work “on the loop,” designing tests, specifications, and feedback mechanisms to guide AI agents, as industry discussions focus on how such systems should be verified and governed.
-
QCon London 2026: Rewriting All of Spotify's Code Base, All the Time
At QCon London 2026, Spotify's Jo Kelly-Fenton and Aleksandar Mitic discussed Honk, an AI-powered coding agent that automates code migrations across Spotify's codebase. The system drastically shortens migration timelines and handles complexities that traditional scripts could not. Key challenges included handling edge cases and standardizing the codebase to make changes easier to review.
-
HubSpot’s Sidekick: Multi-Model AI Code Review with 90% Faster Feedback and 80% Engineer Approval
HubSpot engineers introduced Sidekick, an internal AI-powered code review system that analyzes pull requests using large language models and filters feedback through a secondary “judge agent.” The system reduced time to first feedback on pull requests by about 90 percent and is now used across tens of thousands of internal pull requests.
-
QCon London 2026: Ontology‐Driven Observability: Building the E2E Knowledge Graph at Netflix Scale
Prasanna Vijayanathan and Renzo Sanchez-Silva, both engineers at Netflix, presented “Ontology‐Driven Observability: Building the E2E Knowledge Graph at Netflix Scale” at QCon London 2026, where they discussed the design and implementation of an end-to-end knowledge graph that models the Netflix user experience.
-
QCon London 2026: Reliable Retrieval for Production AI Systems
At QCon London 2026, Lan Chu, AI Tech Lead at Rabobank, shared lessons from deploying a production AI search system used internally by more than 300 users across 10,000 documents. Her experience shows that most failures in RAG systems stem from indexing and retrieval, rather than the language model itself.
-
AI Is Amplifying Software Engineering Performance, Says the 2025 DORA Report
Artificial intelligence is rapidly reshaping the way software is built, but its impact is more nuanced than many organizations expected. The 2025 DevOps Research and Assessment (DORA) report, titled State of AI-Assisted Software Development, finds that AI does not automatically improve software delivery performance.
-
QCon London 2026: Behind Booking.com's AI Evolution: the Unpolished Story
Jabez Eliezer Manuel, senior principal engineer at Booking.com, presented “Behind Booking.com's AI Evolution: the Unpolished Story” at QCon London 2026. Manuel discussed how Booking.com has evolved over the past 20 years and the challenges they faced on their journey to incorporate AI.
-
DoorDash Builds DashCLIP to Align Images, Text, and Queries for Semantic Search Using 32M Labels
DoorDash has launched a multimodal machine learning system that aligns product images, text, and user queries in a shared embedding space. Trained on 32 million labeled query-product pairs using contrastive learning, the system improves semantic search, product ranking, and advertising relevance. Embeddings also support other machine learning tasks across the marketplace.
-
Google Researchers Propose Bayesian Teaching Method for Large Language Models
Google Research has proposed a training method that teaches large language models to approximate Bayesian reasoning by learning from the predictions of an optimal Bayesian system. The approach focuses on improving how models update beliefs as they receive new information during multi-step interactions.
-
DoorDash Builds LLM Conversation Simulator to Test Customer Support Chatbots at Scale
DoorDash engineers built a simulation and evaluation flywheel to test large language model customer support chatbots at scale. The system generates multi-turn synthetic conversations using historical transcripts and backend mocks, evaluates outcomes with an LLM-as-judge framework, and enables rapid iteration on prompts, context, and system design before production deployment.
-
AWS Launches Strands Labs for Experimental AI Agent Projects
Amazon Web Services has introduced Strands Labs, a new GitHub organization created to host experimental projects related to agent-based AI development.
-
Claude Opus 4.6 Introduces Adaptive Reasoning and Context Compaction for Long-Running Agents
Anthropic’s Claude Opus 4.6 introduces "Adaptive Thinking" and a "Compaction API" to solve context rot in long-running agents. The model supports a 1M token context window with 76% multi-needle retrieval accuracy. While the model leads agentic coding benchmarks, independent tests show a 49% detection rate for binary backdoors, highlighting the gap between SOTA claims and production security.