
The prompt is the product

ai · prompt-engineering · gemini · llm · product-design

If you told me a year ago that the most iterated piece of my app would be a string of natural language instructions, I'd have been skeptical. But here I am, three months into WhispCal, and the AI prompt has been modified more frequently than any React component, any database schema, any UI layout.

The prompt is the product.

The evolution

The git history tells the story through a series of increasingly specific commits:

  • December 19th: "refine prompt for language management"
  • January 17th: "explicitly asking not to remove data from the existing food items in the prompt"
  • January 18th: "prompt much sharper"
  • January 18th: "prompt - added a safe way to NOT change the current tray items unless the user asks for it"
  • January 19th: "force premium to true, update overview animation and prompt"
  • March 6th: "better prompt"

That last one — "better prompt" — is the commit message equivalent of a shrug. After months of refinement, you run out of ways to describe incremental improvements to a text block.

The language problem

WhispCal's first real prompt challenge was multilingual input. I'm based in Europe, and my early testers included French and English speakers. The prompt needed to handle "two eggs and toast" and "deux oeufs et du pain grillé" with equal accuracy.

The naive approach — "parse this food description in any language" — worked about 70% of the time. The remaining 30% included gems like interpreting "pain" (French for bread) as an emotion, or "riz" (rice) as a name. The fix was adding explicit language context to the prompt, along with examples in multiple languages.
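That fix can be sketched as a few-shot prompt builder. This is a minimal illustration under my own assumptions, not WhispCal's actual prompt; `build_parse_prompt` and the example items are hypothetical:

```python
import json

# Illustrative few-shot pairs in two languages (hypothetical values).
FEW_SHOT_EXAMPLES = [
    ("two eggs and toast",
     [{"name": "egg", "quantity": 2}, {"name": "toast", "quantity": 1}]),
    ("deux oeufs et du pain grillé",
     [{"name": "oeuf", "quantity": 2}, {"name": "pain grillé", "quantity": 1}]),
]

def build_parse_prompt(user_input: str) -> str:
    """Build a food-parsing prompt with explicit language context."""
    examples = "\n".join(
        f"Input: {text}\nOutput: {json.dumps(items, ensure_ascii=False)}"
        for text, items in FEW_SHOT_EXAMPLES
    )
    return (
        "You are a food-logging parser. The input may be in any language.\n"
        "Interpret words in their food sense: 'pain' in a French sentence "
        "means bread, never the emotion. Respond in the user's language.\n\n"
        f"{examples}\n\n"
        f"Input: {user_input}\nOutput:"
    )
```

The few-shot pairs do most of the work here: they anchor the model to the food domain before the ambiguous word ever appears.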

The tray preservation crisis

The biggest prompt-related crisis came in January. Users would log three items for lunch, then ask to add a fourth. The AI, being helpful, would sometimes "improve" the existing items — adjusting portion sizes, correcting nutritional values, or occasionally removing items it thought were duplicates.

This was catastrophic for trust. If you can't rely on your food log staying stable, the entire app is broken.

The fix took multiple iterations. First: "explicitly asking not to remove data from the existing food items in the prompt." This helped but didn't eliminate the problem. The AI would still occasionally merge similar items or adjust quantities.

The final solution was structural. I separated the "existing tray" from the "new input" in the prompt architecture. The existing items are presented as immutable context. The AI's job is to interpret the new input and add to the tray, never modify what's already there. The only exception is when the user explicitly asks to change something — "actually, make that chicken 200g instead."
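The structural split can be sketched like this. A hypothetical helper, not WhispCal's code; the function name and section labels are my own:

```python
import json

def build_tray_prompt(existing_tray: list[dict], new_input: str) -> str:
    """Present the existing tray as read-only context, kept strictly
    separate from the new input the model is asked to interpret."""
    return (
        "EXISTING TRAY (read-only: never modify, merge, or remove these "
        "items unless the user explicitly asks):\n"
        f"{json.dumps(existing_tray, ensure_ascii=False)}\n\n"
        "NEW INPUT (parse this and return ONLY the items to append):\n"
        f"{new_input}"
    )
```

The point of the design is that the constraint lives in the prompt's structure, not just its wording: the model never receives the existing items as something it is asked to produce, so it has nothing to "improve."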

Balancing helpfulness and constraint

The fundamental tension in prompt engineering for a consumer app is this: you want the AI to be smart and helpful, but you also need it to be predictable and constrained. Every time I made the prompt more helpful ("suggest common portion sizes"), it became less predictable ("why did it change my oatmeal to 40g?").

The art is in knowing where to draw the line. WhispCal's prompt has evolved from "be a nutrition expert" to a document with very specific rules about what the AI can and cannot do:

  • Parse new food items from the user's description
  • Estimate nutritional values based on common serving sizes
  • Handle multiple items in a single description
  • Respond in the user's language
  • Never modify existing tray items
  • Never remove items unless explicitly asked
  • Never invent items the user didn't mention
  • Return structured JSON in a specific format

Each rule is there because of a bug. Each constraint was earned through a real user hitting a real problem.
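Rules like these can also be enforced after the model responds, by validating the returned JSON before it touches the tray. A hedged sketch; the field names (`name`, `calories`, `id`) are assumptions, not WhispCal's actual schema:

```python
import json

def validate_response(raw: str, existing_ids: set[str]) -> list[dict]:
    """Reject model output that is malformed or that touches existing
    tray items, so bad generations never reach the user's log."""
    items = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of food items")
    for item in items:
        if item.get("id") in existing_ids:
            raise ValueError("model tried to modify an existing item")
        for field in ("name", "calories"):
            if field not in item:
                raise ValueError(f"missing field {field!r}")
    return items
```

Belt and suspenders: the prompt tells the model not to break the rules, and the validator refuses to apply output that breaks them anyway.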

The multiline breakthrough

One of the more recent improvements — "add multiline smart input" from March 6th — seems small but changed how people interact with the app. Instead of logging one item at a time, users can now type (or paste) an entire meal description across multiple lines:
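A hypothetical entry (my example, not a real user's) might read:

```
2 scrambled eggs
toast with butter
a glass of orange juice
```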

The AI parses all of it in a single request, creating multiple food items at once. This cut the average logging time in half for complex meals.

Prompt as living document

The biggest lesson from three months of prompt engineering: your prompt is never done. It's a living document that evolves with your understanding of how users actually behave, what edge cases exist, and what the model's failure modes are.

Every "better prompt" commit represents a conversation — with users who hit unexpected behavior, with the model that interpreted something wrong, and with myself about what the product should actually do.

I keep a running list of prompt improvements alongside my feature backlog. Some weeks, the most impactful thing I ship is a two-sentence addition to the prompt. No new UI. No new API. Just a better understanding of the conversation between user and machine.

That's the strange reality of building AI-powered products: your most important code might not be code at all.