Return of the Agents: Building SpeakyTalk at AI Valley
By Tina Park (@tinaparklive)

This past weekend I was back at AI Valley for Return of the Agents — the third and final installment in their hackathon trilogy. 500+ hand-selected builders, one weekend, real products shipped. Selected teams are considered for Afore Capital's pre-seed program (2M+), which gives the room a different kind of energy than your average hackathon.
This was my second AI Valley hackathon, and each time I'm reminded how much the right room matters.
I built SpeakyTalk with Shreyas Sawant and Alberto Ravasini — two genuinely talented engineers I was lucky to be paired with.
What we built

SpeakyTalk is a live AI presentation agent. The pitch: we didn't build a better slide editor — we eliminated the editor.
You don't prep slides. You don't pick templates. You just talk, and Speaky generates, narrates, and updates a full presentation in real time as you speak.
- Talk about any topic → instant visuals with stats and research
- Say "explain this" → live AI expansion of the current point
- Say "search for data" → real-time web results pulled onto the slide
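To make the command flow concrete, here is a minimal sketch of what routing those utterances could look like. This is illustrative only: the function and intent names are my own, and the real product presumably leans on the LLM itself rather than string matching.

```typescript
// Hypothetical intent router for the three commands above.
// Phrase matching is a stand-in for real intent detection.
type Intent = "expand" | "search" | "narrate";

function routeUtterance(transcript: string): Intent {
  const t = transcript.toLowerCase();
  if (t.includes("explain this")) return "expand"; // live AI expansion of the current point
  if (t.includes("search for")) return "search";   // pull real-time web results onto the slide
  return "narrate";                                // default: keep generating slides as you talk
}
```

The useful property is the default branch: anything that isn't an explicit command is treated as presentation content, so the speaker never has to address the agent directly.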
The framing we kept coming back to: it's not a tool you use before the stage. It's the agent that stands on stage with you.
The technical stack
For anyone curious how the pieces fit together:
- Frontend: React + TypeScript
- Backend: Supabase Edge Functions
- LLM: GPT-4o for slide generation and live expansion
- Voice (output): ElevenLabs for narration
- Voice (input): Vapi + Speechmatics for real-time transcription
- Generation: MiniMax for additional content paths
The interesting engineering problem wasn't any single piece — it was the latency budget. When the agent is on stage with you, every second of delay between "I just said something" and "it shows up on screen" feels like an eternity. Most of our build time went into shaving milliseconds off that loop.
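The single biggest lever in that loop is overlapping calls that don't depend on each other. A toy sketch of the idea, with simulated latencies standing in for the real services (the names and numbers here are invented for illustration):

```typescript
// Simulated service calls: resolve with a value after a fixed delay.
function timed<T>(ms: number, value: T): Promise<T> {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

// Stand-ins for slide generation and narration synthesis.
const generateSlide = () => timed(120, "slide");
const synthesizeNarration = () => timed(150, "audio");

// Sequential: total latency is the SUM of every hop (~270 ms here).
async function sequential(): Promise<number> {
  const start = Date.now();
  await generateSlide();
  await synthesizeNarration();
  return Date.now() - start;
}

// Parallel: both calls start from the same transcript, so total
// latency is bounded by the SLOWEST call (~150 ms here), not the sum.
async function parallel(): Promise<number> {
  const start = Date.now();
  await Promise.all([generateSlide(), synthesizeNarration()]);
  return Date.now() - start;
}
```

With five services in the chain, the difference between summing latencies and taking the max of them is the difference between a co-presenter and a laggy demo.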
What I actually learned
A few things I'm taking away, beyond the demo:
1. Voice-driven agents live and die by latency. You can have the smartest model in the world, but if it takes 6 seconds to respond, users experience it as broken. Streaming, parallel calls, and aggressive caching aren't optimizations; they're table stakes.
2. Multi-model pipelines are the unsung hard part. We weren't using "an LLM." We were orchestrating GPT-4o + ElevenLabs + Vapi + Speechmatics + MiniMax, each with its own latency, failure modes, and quirks. Most agent demos hide this complexity. Building it under hackathon pressure made it visible.
3. The "agent that stands with you" framing changed our scope. Once we stopped thinking of SpeakyTalk as a slide tool and started thinking of it as a co-presenter, every product decision got clearer. Naming the metaphor early matters more than I expected.
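One pattern that falls out of takeaway #2: every external service needs its own latency budget and failure path, not one global try/catch. A hedged sketch of that idea, assuming hypothetical `withTimeout`/`withFallback` helpers rather than anything from our actual codebase:

```typescript
// Reject a promise if it exceeds its per-service latency budget.
async function withTimeout<T>(p: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([p, deadline]);
  } finally {
    clearTimeout(timer!);
  }
}

// If the primary path fails or times out, degrade to a backup
// content path instead of dropping the slide entirely.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch {
    return fallback();
  }
}
```

Each of the five services gets wrapped this way with its own budget, so one slow provider degrades one feature instead of freezing the whole presentation.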
Why hackathons keep being worth it
This was my second AI Valley hackathon (the first was where my team built Oppalingo and made the finals). Both times I left with the same realization: a focused weekend in a room with hundreds of serious builders, with backers like Afore Capital watching for ideas that can become real companies, beats most of what passes for "networking" in tech.
You write code under pressure. You ship something real. You meet people you actually want to keep building with.
Back to shipping.