golf-app (name TBD)

a foursome plays four games at once and one person ends up doing math instead of golfing. i put a voice caddy in their pocket that submits scores from a sentence.

01 — problem

a serious golf group plays four games at once. skins on the front, nassau across nine, stableford for the year-long ladder, and one custom game where a putt from off the green earns a beer. one person carries a notepad. that person stops playing golf.

every existing scoring app i tried optimized for stroke play and shoved everything else into a "custom games" tab that asked you to fill out a form per round. nothing handled the actual case: the math runs continuously, the standings have to refresh in front of everyone at the same time, and you should be able to enter scores without taking your hand off your beer.

02 — approach

mobile: expo (eas), react native, react query, zustand, react hook form
api: hono on railway, tRPC, drizzle
data: neon serverless postgres, ably realtime, inngest workflows
ai: mastra agents, anthropic claude (sonnet + haiku), langfuse traces
auth: better auth · magic link, api-key plugin

every part is managed and serverless because i'm one person; the only place i write infrastructure is the realtime channel topology.

the bookkeeper insight is that scoring is a transcription problem dressed up as a scoring problem. "blake got 5, jake got 4, i got 6" is structured data wearing a sentence. one mastra agent owns twenty-two tools (read round context, read player stats, configure games, submit scores, correct scores, request clarification, log shots) and the agent doesn't have to be smart — it has to be fast and correct. anthropic claude is the engine because tool-call accuracy at sonnet beats everything else i tested at this price point, and haiku covers the routing tier.
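
to make "structured data wearing a sentence" concrete, here is a minimal sketch of the target shape. it is a deterministic regex stand-in, not the agent: in the app the model fills this payload through a tool call. `ScoreEntry`, `parseScoreSentence`, and the `speaker` parameter are hypothetical names used only for illustration.

```typescript
// Hypothetical payload shape for a "submit scores" tool call.
interface ScoreEntry {
  player: string; // resolved player name
  strokes: number;
}

// Regex stand-in for the agent: extracts "<name> got <n>" pairs.
// `speaker` resolves the pronoun "i" to whoever is holding the phone.
function parseScoreSentence(sentence: string, speaker: string): ScoreEntry[] {
  const pattern = /\b([a-z]+) got (\d+)\b/gi;
  const parsed: ScoreEntry[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sentence)) !== null) {
    const who = match[1].toLowerCase();
    parsed.push({ player: who === "i" ? speaker : who, strokes: Number(match[2]) });
  }
  return parsed;
}

// Three entries: blake 5, jake 4, and the speaker ("bo") 6.
const demoEntries = parseScoreSentence("blake got 5, jake got 4, i got 6", "bo");
```

a regex breaks on the first "put blake down for a five", which is where a model with a typed tool schema earns its place.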

ably (chosen)
  • managed presence + channels for multiplayer scoring
  • native ios/android sdks (no socket plumbing in expo)
  • single channel per round; clients fan out scores + game-state events
  • per-message billing fits a round's actual traffic shape
raw websockets (alternative)
  • requires a stateful server (railway is fine, but presence isn't free)
  • ios background-disconnect handling falls on me
  • recovery semantics on flaky cellular = bug surface
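
the "single channel per round" topology can be sketched with an in-memory stand-in. this is not the ably sdk, and it models neither presence nor history replay. the `round:<id>` naming, the event names, and `RoundChannel` itself are assumptions made for the example.

```typescript
// Hypothetical event shapes carried on a round channel.
type RoundEvent =
  | { type: "score.submitted"; hole: number; player: string; strokes: number }
  | { type: "game.updated"; game: string; standings: Record<string, number> };

type Listener = (roundEvent: RoundEvent) => void;

// In-memory stand-in for a managed pub/sub channel.
class RoundChannel {
  private listeners: Listener[] = [];
  constructor(readonly channelName: string) {}

  // Returns an unsubscribe function.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Every subscriber receives every event published on the channel.
  publish(roundEvent: RoundEvent): void {
    this.listeners.forEach((listener) => listener(roundEvent));
  }
}

// One channel per round: every phone in the foursome subscribes to the same name.
function channelNameForRound(roundId: string): string {
  return `round:${roundId}`;
}

const roundChannel = new RoundChannel(channelNameForRound("demo"));
const received: Record<string, RoundEvent[]> = { bo: [], blake: [] };
roundChannel.subscribe((e) => { received.bo.push(e); });
roundChannel.subscribe((e) => { received.blake.push(e); });
roundChannel.publish({ type: "score.submitted", hole: 1, player: "jake", strokes: 4 });
```

a discriminated union on `type` lets each client switch on the event kind without guessing at the payload.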

i did not consider supabase realtime because the round channel needs presence and history replay together, and ably was the only option i found that ships both as primitives without a tradeoff post.

haiku for routing + scoring; sonnet for caddy advice (chosen)
  • routing decisions are structured (which tool, which player, which hole)
  • haiku tool-call accuracy is sufficient on structured outputs
  • per-message cost drops ~6× on the hot path
  • perceived latency drops because the routing tier returns first
single-tier sonnet for everything (alternative)
  • caddy advice doesn't need to be on the hot path
  • sonnet on every voice message wasted budget on routing
  • first-token latency dominated total response time for short utterances
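
the tier split above amounts to a lookup once the intent is known. a minimal sketch, assuming the routing tier emits one of a fixed set of intent labels. the label names and `tierForIntent` are hypothetical; the source only says that routing and scoring go to haiku and caddy advice goes to sonnet.

```typescript
type ModelTier = "haiku" | "sonnet";

// Hypothetical intent labels produced by the routing tier.
type Intent = "submit_scores" | "correct_scores" | "configure_game" | "caddy_advice";

// Structured, hot-path intents stay on the small model;
// open-ended advice is handed to the larger one.
const TIER_BY_INTENT: Record<Intent, ModelTier> = {
  submit_scores: "haiku",
  correct_scores: "haiku",
  configure_game: "haiku",
  caddy_advice: "sonnet",
};

function tierForIntent(intent: Intent): ModelTier {
  return TIER_BY_INTENT[intent];
}
```

typing the table as `Record<Intent, ModelTier>` means a new intent will not compile until someone decides which tier it belongs to.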

the sub-agent split is older than i'd like to admit — it took me three weeks to convince myself that haiku was good enough at structured outputs and sonnet only needed to handle the actually hard reasoning.

[diagram: voice command lifecycle]

voice-to-action latency was the whole product

the fix wasn't a bigger model — it was removing a round-trip. the api now builds the full round context (active hole, players, scores, current games, weather, golf bag) in parallel and injects it as a system message before the agent runs. the agent still has a read_round_context tool for repair queries, but it never needs it on the happy path. one fewer round-trip, on the hot path that matters most.
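
a minimal sketch of the pre-load, assuming each piece of context sits behind its own async loader. `ContextLoaders`, `buildRoundContext`, and `buildMessages` are hypothetical names, and the context here is a subset of the fields listed above (weather and golf bag are omitted for brevity).

```typescript
interface RoundContext {
  activeHole: number;
  players: string[];
  scores: Record<string, number[]>;
  games: string[];
}

// Hypothetical loaders standing in for db and service calls.
interface ContextLoaders {
  loadActiveHole(roundId: string): Promise<number>;
  loadPlayers(roundId: string): Promise<string[]>;
  loadScores(roundId: string): Promise<Record<string, number[]>>;
  loadGames(roundId: string): Promise<string[]>;
}

// Start every load at once; the wait is roughly the slowest loader, not the sum.
async function buildRoundContext(roundId: string, loaders: ContextLoaders): Promise<RoundContext> {
  const [activeHole, players, scores, games] = await Promise.all([
    loaders.loadActiveHole(roundId),
    loaders.loadPlayers(roundId),
    loaders.loadScores(roundId),
    loaders.loadGames(roundId),
  ]);
  return { activeHole, players, scores, games };
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Put the context in a system message so the first model call can already act on it.
function buildMessages(context: RoundContext, utterance: string): ChatMessage[] {
  return [
    { role: "system", content: `round context: ${JSON.stringify(context)}` },
    { role: "user", content: utterance },
  ];
}
```

the tool-call version pays for a model turn just to ask for data the api could have fetched before the agent started.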

what i tried first: every voice message read round context as a tool call before deciding what to do.
why it failed: that was one llm round-trip plus one db round-trip in series before any output token. on a putting green over LTE, the user was already typing it manually by the time the agent answered.

ai-submitted scores were silent on every other phone in the round

rewrote with a round-service.factory.ts that constructs the full service graph (realtime, transactions, course service) for every tool invocation. tools no longer hand-build their dependencies — they receive a fully-wired service. caught it because i was in a round myself with three friends and noticed the standings didn't move on their phones for thirty seconds. the kind of bug you only find by playing the game.
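
the idea behind the factory can be sketched as follows. this is an illustration and not the contents of round-service.factory.ts: the interfaces, `createRoundService`, and `submitScoresTool` are hypothetical, the calls are synchronous for brevity, and the course service is left out.

```typescript
interface ScoreEvent {
  roundId: string;
  hole: number;
  player: string;
  strokes: number;
}

// Hypothetical dependency interfaces.
interface ScoreStore {
  save(scoreEvent: ScoreEvent): void;
}
interface RealtimePublisher {
  publish(channelName: string, scoreEvent: ScoreEvent): void;
}
interface RoundService {
  submitScore(scoreEvent: ScoreEvent): void;
}

// Both dependencies are required parameters, so a service
// without a publisher cannot be constructed by accident.
function createRoundService(store: ScoreStore, publisher: RealtimePublisher): RoundService {
  return {
    submitScore(scoreEvent: ScoreEvent): void {
      store.save(scoreEvent); // persist first
      publisher.publish(`round:${scoreEvent.roundId}`, scoreEvent); // then notify every device
    },
  };
}

// A tool handler receives the wired service instead of building its own.
function submitScoresTool(service: RoundService, scoreEvents: ScoreEvent[]): number {
  scoreEvents.forEach((e) => service.submitScore(e));
  return scoreEvents.length;
}
```

making the publisher a required argument moves the "forgot to wire realtime" mistake from a thirty-second silence on the course to a compile error.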

what i tried first: action tools instantiated their own roundService inside the tool handler, kept it lightweight.
why it failed: the lightweight service had no realtime publisher, so scores wrote to the db successfully and never published an ably event. the scorecard didn't refresh anywhere except the device that issued the voice command. silently broken, the worst kind.

try it: four players · six holes · three game types. tap any score; watch all three standings recalculate.

player  1  2  3  4  5  6
par     4  3  5  4  4  3
bo
blake
jake
matt

skins
  1  matt   2
  2  bo     0
  3  blake  0
  4  jake   0

stableford
  1  bo     10
  2  jake   10
  3  blake  9
  4  matt   8

stroke
  1  bo     25
  2  jake   25
  3  blake  26
  4  matt   27

tap any score to bump it. three game types recompute live. in the app, your voice does this with one sentence: "blake got 5, jake got 4, i got 6."
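
a minimal sketch of how three standings can fall out of one scorecard. the rules are assumptions: gross stableford with standard points (2 for par, one more per stroke under, one fewer per stroke over, never below 0) and skins with carryover on ties. the par row comes from the scorecard above; the per-hole scores are made up for the example and are not the widget's data.

```typescript
type Scorecard = Record<string, number[]>; // player -> strokes per hole

// Gross stableford: 2 + par - strokes per hole, floored at 0.
function stablefordPoints(par: number[], strokes: number[]): number {
  return strokes.reduce((total, s, i) => total + Math.max(0, 2 + par[i] - s), 0);
}

function strokeTotal(strokes: number[]): number {
  return strokes.reduce((total, s) => total + s, 0);
}

// Skins with carryover (assumed rule): an outright low score wins
// the hole plus every tied hole carried into it.
function skinsWon(card: Scorecard, holes: number): Record<string, number> {
  const players = Object.keys(card);
  const won: Record<string, number> = {};
  players.forEach((p) => { won[p] = 0; });
  let pot = 0;
  for (let h = 0; h < holes; h++) {
    pot += 1;
    const low = Math.min(...players.map((p) => card[p][h]));
    const winners = players.filter((p) => card[p][h] === low);
    if (winners.length === 1) {
      won[winners[0]] += pot;
      pot = 0;
    }
  }
  return won;
}

const demoPar = [4, 3, 5, 4, 4, 3];
const demoCard: Scorecard = {
  bo: [4, 3, 6, 4, 5, 3],
  blake: [5, 4, 5, 4, 4, 2],
  jake: [4, 3, 5, 5, 4, 3],
  matt: [3, 4, 5, 4, 4, 4],
};

const demoSkins = skinsWon(demoCard, demoPar.length);
const demoStableford = stablefordPoints(demoPar, demoCard.bo);
const demoStrokes = strokeTotal(demoCard.bo);
```

every game is a pure function of the same scorecard, so one score change can refresh all three standings in a single pass.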

03 — result

22 · ai tools, one agent
11 · game types built-in
333 · tests in the golf package
1 fewer · llm round-trip per voice message (context pre-load)

one full mastra agent, two model tiers (haiku for routing, sonnet for caddy advice), realtime fan-out via ably. currently in private beta; round and lighthouse data will be published once the public launch lands.
