Harness Engineering: State of the Art in Agent Harnesses

Presented by: Shashi Jagtap

Start: Monday, June 29, 2026, 05:30 PM GMT-4

End: Monday, June 29, 2026, 11:30 PM GMT-4

Price: Free

In person

Venue: AWS Builder Loft, San Francisco, California

About the Event

Harness Engineering is a hot topic at the moment, and this is a highly technical event focused on harness engineering for AI agents, with particular coverage of coding agents.

As models become more capable, the biggest performance gains are increasingly coming from how agents are orchestrated, evaluated, and controlled, not just from the models themselves. Central to this is the tight, inseparable coupling between harness and memory in agentic AI. Recent advancements across the ecosystem are showing that smarter harness design (planning loops, context management, verification mechanisms, error recovery, and execution runtimes) can deliver outsized improvements on the same underlying LLM, often exceeding what a model upgrade alone provides.

This has made harness engineering one of the most critical and exciting areas in agent development right now. Recent harness engineering innovations are making a lot of waves in this space, and we will cover them in the event, along with discussion of memory and harness and of self-optimising harnesses for coding agents. Coding agent companies, agent labs, and observability labs are making tremendous progress in this space.

This session brings together engineers who are actively building these systems in practice. We’ll dive into how coding agent harnesses are really designed, what commonly breaks in real-world deployments, and how teams are evolving toward more robust, scalable, and maintainable agent architectures.

🎤 Speakers

Talk 1: The Harness Is the Product: How to Build, Break, and Evaluate

🎙️ Speaker: Dat Ngo (AI Architect, Arize AI)

Description: AI agents do not work because the model is smart. They work because the surrounding system makes intelligence usable. That surrounding system is the harness: the sandbox, tools, context, memory, evals, traces, policies, and recovery loops that shape what the model can do. In practice, the harness often matters as much as the model itself. This talk will dissect the AI harness piece by piece, then evaluate it as a whole. We will look at what happens when you run weaker models inside stronger harnesses, how execution conditions change task completion, why sandbox design affects behavior, and how observability and evals reveal whether the system is actually improving or just appearing to work. The core question: can your AI system survive outside the happy path?

Talk 2: Stop Tuning One Harness at a Time!

🎙️ Speaker: Arun Kumar (CTO and Cofounder, RapidFire AI)

Description: You've picked a frontier model for your agent. Great. But whether it attains production quality comes down to its harness: system prompts, retrieval strategy, workflow structure, hyperparameters, and more. That is a massive design space, and most teams painstakingly trudge through it one config at a time. RapidFire AI's "hyperparallel experimentation" transforms that slog into a systematic search optimized for application outcomes: compare even thousands of configs on one machine with live eval metrics, and control configs in flight programmatically, manually, or via a promptable actionbot, to reach better metrics faster and with lower token spend.

Speaker Bio: Arun Kumar is CTO and Cofounder of RapidFire AI, an open-source platform that helps AI developers and FDEs engineer AI agent outcomes to escape pilot purgatory. He is also a professor of computer science and data science at the University of California, San Diego. He has worked at the intersection of Data, AI, and Systems for over 15 years and has created some of the world's first university courses on both ML Systems and modern AI Engineering.

Talk 3: Harness Optimization Through Live Traffic Analysis: Closing the Gap Between Benchmarks and Real Users

🎙️ Speaker: Myeongsoo Kim (Applied Scientist, AWS AI Labs, Kiro)

Description: Coding agent benchmarks tell you how your agent could perform; live traffic tells you how it actually performs. In this talk, I present a live-traffic-driven approach to harness optimization for Kiro, Amazon's AI coding agent. Using LLM-as-judge on millions of real conversations, we built complaint maps revealing what users actually struggle with — then used those signals to drive system prompt updates, hyperparameter tuning, and trajectory-level failure analysis. I'll share the evaluation methodology, surprising gaps between synthetic benchmarks and real-world performance, and practical lessons for building a continuous improvement loop for coding agents.

Talk 4: Fresh context for coding agent

🎙️ Speaker: Linghua Jin (Cofounder & CEO, CocoIndex)

Description: As models get smarter, the bottleneck shifts to what you put in front of them. A coding agent is only as good as the context it sees on each turn, and that context goes stale the moment code changes. CocoIndex incrementally indexes your codebase to keep a coding agent's context always fresh and precisely scoped: a harness that re-indexes itself as the code evolves.

✍️ Agenda

6:00 – 6:45 PM → Doors open, networking & light snacks
6:45 – 6:50 PM → Welcome from Agent Engineering HQ
6:50 – 7:05 PM → Dat Ngo (Arize AI)
7:05 – 7:20 PM → Arun Kumar (RapidFire AI)
7:20 – 7:35 PM → Myeongsoo Kim (AWS AI Labs)
7:35 – 7:50 PM → Linghua Jin (CocoIndex)
7:50 – 8:10 PM → Panel discussion and Q&A with the audience
8:10 – 8:30 PM → Closing remarks, networking, and departure from AWS Builder Loft
8:30 PM – Late → Join another AI Engineering World's Fair event for unlimited free food and drinks 😂

🙏 Sponsors: This event is sponsored by CocoIndex for food and drinks, with the venue provided by AWS Builder Loft.

CocoIndex

Arize AI

The venue is provided by AWS Builder Loft.

What to expect:

Cutting-edge talks on real-world harness architectures and on memory and harness alignment
Approaches to designing self-optimising and self-healing harnesses
A fire panel with builders from leading agent labs and companies
High-signal networking with the people shaping this exploding discipline

Who should attend?

AI/ML engineers, agent builders, infrastructure teams, and developers focused on building or optimizing reliable coding agents beyond basic prompting or simple frameworks. Expect deep technical discussion and practical takeaways.

About Agent Engineering HQ Events

Agent Engineering HQ is a series of events, primarily based in San Francisco 🌉 and London 🇬🇧, started by Superagentic AI to cover all sorts of engineering practices: prompt engineering, context engineering, memory engineering, harness engineering, inference engineering, eval engineering, loop engineering, agentic engineering, and so on; basically *.engineering around Agentic AI Engineering. You can find more about Agent Engineering here. If you are interested in one of the agent engineering disciplines mentioned above, reach out and we will get the next events sorted.

Venue Details

AWS Builder Loft

AWS Builder Loft, San Francisco, California

