DevOps Content on InfoQ

News

AI, ML & Data Engineering

QCon London 2026: Morgan Stanley Rethinks Its API Program for the MCP Era

Morgan Stanley engineers Jim Gough and Andreea Niculcea showed how they're retooling the bank's API program for AI agents using MCP and FINOS CALM. Live demos covered compliance guardrails, deployment gates, and zero-downtime rollouts across 100+ APIs. The time to a first API deployment shrank from two years to two weeks. They also demoed Google's A2A protocol running alongside MCP.

Steef-Jan Wiggers
on Mar 19, 2026
DevOps

Microsoft Adds DRA-Backed NVIDIA vGPU Support to AKS

The Azure Kubernetes Service team shared a detailed guide on how to use Dynamic Resource Allocation (DRA) with NVIDIA vGPU technology on AKS. This update improves control and efficiency for shared GPU use in AI and media tasks.

Claudio Masolo
on Mar 19, 2026
DevOps

QCon London 2026: Wrangling Telemetry at Scale, a Guide to Self-Hosted Observability

At QCon London 2026, Colin Douch discussed building and operating self-hosted monitoring stacks, surveyed the current tooling landscape, and explained how to build a coherent observability setup rather than treating logs, metrics, and traces as separate pillars.

Renato Losio
on Mar 19, 2026
DevOps

QCon London 2026: SBOMs Move From Best Practice to Legal Obligation as CRA Enforcement Looms

In a talk at QCon London 2026, Viktor Petersson argued that software teams are running out of time to adopt SBOMs (Software Bills of Materials) due to pending legislative changes in both the US and Europe. He walked through the current regulatory landscape, the practical mechanics of generating high-quality SBOMs, and the emerging standards for distributing the resulting artefacts.

Matt Saunders
on Mar 18, 2026
Architecture & Design

War in Iran Damages Multiple AWS Data Centers, Challenging Multi-AZ Assumptions

Earlier this month, Iranian drone strikes damaged three AWS data centers in the UAE and Bahrain, causing outages and disruptions to multiple services. The events, which affected multiple facilities within the same AWS region, sparked discussion in the community about how geopolitical conflict can directly impact global cloud infrastructure and multi-AZ deployments.

Renato Losio
on Mar 18, 2026
DevOps

QCon London 2026: Uncorking Queueing Bottlenecks with OpenTelemetry

At QCon London 2026, Julian Wreford and Oli Lane from Gearset showcased how distributed tracing and SLOs solve asynchronous observability gaps. By shifting from queue-size metrics to latency-based alerts, the team improved incident response. Key technical takeaways included using OpenTelemetry trace state for async duration tracking and wide events to uncover hidden architectural waste.

Mark Silvester
on Mar 18, 2026
DevOps

QCon London 2026: Shipping Constantly with Humans and Beyond at Monzo

At QCon London 2026, Suhail Patel, a principal engineer at Monzo who leads the bank’s platform group, described how the bank has built a developer platform capable of shipping hundreds of changes to production every day.

Matt Saunders
on Mar 17, 2026
Cloud

QCon London 2026: Your Multi-Cloud Strategy Is a Product Problem — Treat It Like One

JP Morgan Chase engineers Luis Albinati and Surabhi Mahajan argued that multi-cloud complexity can't be solved with engineering alone. Speaking at QCon London, they showed how treating multi-cloud as a product with capability mapping, demand governance, and defined users tames the chaos.

Steef-Jan Wiggers
on Mar 17, 2026
DevOps

AI Is Amplifying Software Engineering Performance, Says the 2025 DORA Report

Artificial intelligence is rapidly reshaping the way software is built, but its impact is more nuanced than many organizations expected. The 2025 DevOps Research and Assessment (DORA) report, titled State of AI-Assisted Software Development, finds that AI does not automatically improve software delivery performance.

Craig Risi
on Mar 17, 2026
Cloud

QCon London 2026: How to Run on Three Clouds at Once, and When Not to

Form3 runs UK bank payments across three clouds simultaneously. At QCon London, their engineers explained how they built their custom Kubernetes operators, cross-cloud DNS tricks, and distributed databases, and what happened when they tried to sell them in America. Spoiler: US customers wanted East/West failover, not triple-active multi-cloud.

Steef-Jan Wiggers
on Mar 16, 2026
Cloud

AWS Launches Managed OpenClaw on Lightsail amid Critical Security Vulnerabilities

AWS launched managed OpenClaw on Lightsail for AI agent deployment while security concerns mount. The 250k-star GitHub project is affected by CVE-2026-25253, which enables one-click RCE, with 17,500+ vulnerable instances exposed. Bitdefender found 20% of ClawHub skills to be malicious. The AWS blueprint provides automated hardening but doesn't address architectural security limits.

Steef-Jan Wiggers
on Mar 15, 2026
DevOps

Elastic Releases Version 9.3.0 with Enhanced AI Tools and OTel Support

Elastic 9.3.0 is now available, featuring enhanced vector search indexing for RAG applications and significant upgrades to the ES|QL query language. The release deepens OpenTelemetry integration for vendor-neutral observability and updates the AI Assistant with better contextual analysis. Security visibility is also expanded across Kubernetes and serverless architectures.

Mark Silvester
on Mar 15, 2026
Development

Cloudflare Introduces Support for ASPA, an Emerging Internet Routing Security Standard

Cloudflare recently announced support for ASPA (Autonomous System Provider Authorization). The new cryptographic standard helps make Internet routing safer by verifying the path data takes across networks to reach its destination and preventing traffic from traversing unreliable or untrusted networks.

Renato Losio
on Mar 14, 2026
DevOps

Netflix Uncovers Kernel-Level Bottlenecks While Scaling Containers on Modern CPUs

Engineers at Netflix have uncovered deep performance bottlenecks in container scaling that trace not to Kubernetes or containerd alone, but into the CPU architecture and Linux kernel itself.

Craig Risi
on Mar 13, 2026
AI, ML & Data Engineering

Claude Opus 4.6 Introduces Adaptive Reasoning and Context Compaction for Long-Running Agents

Anthropic’s Claude Opus 4.6 introduces "Adaptive Thinking" and a "Compaction API" to solve context rot in long-running agents. The model supports a 1M token context window with 76% multi-needle retrieval accuracy. While the model leads agentic coding benchmarks, independent tests show a 49% detection rate for binary backdoors, highlighting the gap between SOTA claims and production security.

Steef-Jan Wiggers
on Mar 12, 2026
