Return of the Agents: Building SpeakyTalk at AI Valley
By Tina Park (@tinaparklive)

This past weekend I was back at AI Valley for Return of the Agents — the third and final installment in their hackathon trilogy. 500+ hand-selected builders, one weekend, real products shipped. Selected teams are considered for Afore Capital's pre-seed program (2M+), which gives the room a different kind of energy than your average hackathon.
This was my second AI Valley hackathon, and each time I'm reminded how much the right room matters.
I built SpeakyTalk with Shreyas Sawant and Alberto Ravasini — two genuinely talented engineers I was lucky to be paired with.
What we built

SpeakyTalk is a live AI presentation agent. The pitch: we didn't build a better slide editor — we eliminated the editor.
You don't prep slides. You don't pick templates. You just talk, and Speaky generates, narrates, and updates a full presentation in real time as you speak.
- Talk about any topic → instant visuals with stats and research
- Say "explain this" → live AI expansion of the current point
- Say "search for data" → real-time web results pulled onto the slide
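To make the command flow concrete, here is a minimal sketch of what routing those utterances could look like. This is illustrative only: the function and intent names are my own, and the real product presumably leans on the LLM itself rather than string matching.

```typescript
// Hypothetical intent router for the three commands above.
// Phrase matching is a stand-in for real intent detection.
type Intent = "expand" | "search" | "narrate";

function routeUtterance(transcript: string): Intent {
  const t = transcript.toLowerCase();
  if (t.includes("explain this")) return "expand"; // live AI expansion of the current point
  if (t.includes("search for")) return "search";   // pull real-time web results onto the slide
  return "narrate";                                // default: keep generating slides as you talk
}
```

The useful property is the default branch: anything that isn't an explicit command is treated as presentation content, so the speaker never has to address the agent directly.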
The framing we kept coming back to: it's not a tool you use before the stage. It's the agent that stands on stage with you.
The technical stack
For anyone curious how the pieces fit together:
- Frontend: React + TypeScript
- Backend: Supabase Edge Functions
- LLM: GPT-4o for slide generation and live expansion
- Voice (output): ElevenLabs for narration
- Voice (input): Vapi + Speechmatics for real-time transcription
- Generation: MiniMax for additional content paths
The interesting engineering problem wasn't any single piece — it was the latency budget. When the agent is on stage with you, every second of delay between "I just said something" and "it shows up on screen" feels like an eternity. Most of our build time went into shaving milliseconds off that loop.
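The single biggest lever in that loop is overlapping calls that don't depend on each other. A toy sketch of the idea, with simulated latencies standing in for the real services (the names and numbers here are invented for illustration):

```typescript
// Simulated service calls: resolve with a value after a fixed delay.
function timed<T>(ms: number, value: T): Promise<T> {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

// Stand-ins for slide generation and narration synthesis.
const generateSlide = () => timed(120, "slide");
const synthesizeNarration = () => timed(150, "audio");

// Sequential: total latency is the SUM of every hop (~270 ms here).
async function sequential(): Promise<number> {
  const start = Date.now();
  await generateSlide();
  await synthesizeNarration();
  return Date.now() - start;
}

// Parallel: both calls start from the same transcript, so total
// latency is bounded by the SLOWEST call (~150 ms here), not the sum.
async function parallel(): Promise<number> {
  const start = Date.now();
  await Promise.all([generateSlide(), synthesizeNarration()]);
  return Date.now() - start;
}
```

With five services in the chain, the difference between summing latencies and taking the max of them is the difference between a co-presenter and a laggy demo.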
What I actually learned
A few things I'm taking away, beyond the demo:
1. Voice-driven agents live and die by latency. You can have the smartest model in the world, but if it takes 6 seconds to respond, users experience it as broken. Streaming, parallel calls, and aggressive caching aren't optimizations; they're table stakes.
2. Multi-model pipelines are the unsung hard part. We weren't using "an LLM." We were orchestrating GPT-4o + ElevenLabs + Vapi + Speechmatics + MiniMax, each with its own latency, failure modes, and quirks. Most agent demos hide this complexity. Building it under hackathon pressure made it visible.
3. The "agent that stands with you" framing changed our scope. Once we stopped thinking of SpeakyTalk as a slide tool and started thinking of it as a co-presenter, every product decision got clearer. Naming the metaphor early matters more than I expected.
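One pattern that falls out of takeaway #2: every external service needs its own latency budget and failure path, not one global try/catch. A hedged sketch of that idea, assuming hypothetical `withTimeout`/`withFallback` helpers rather than anything from our actual codebase:

```typescript
// Reject a promise if it exceeds its per-service latency budget.
async function withTimeout<T>(p: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout>;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([p, deadline]);
  } finally {
    clearTimeout(timer!);
  }
}

// If the primary path fails or times out, degrade to a backup
// content path instead of dropping the slide entirely.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch {
    return fallback();
  }
}
```

Each of the five services gets wrapped this way with its own budget, so one slow provider degrades one feature instead of freezing the whole presentation.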
Why hackathons keep being worth it
This was my second AI Valley hackathon (the first was where my team built Oppalingo and made the finals). Both times I left with the same realization: a focused weekend in a room with hundreds of serious builders, with backers like Afore Capital watching for ideas that can become real companies, beats most of what passes for "networking" in tech.
You write code under pressure. You ship something real. You meet people you actually want to keep building with.
Back to shipping.