The current generation of LLMs is beginning to let us move from the world of syntax and process to semantics and intent. This is incredibly powerful and valuable for a bunch of reasons. Intent is much higher leverage and much more robust than process. Saying “go into the kitchen and get the milk” is denser than saying “lift your left foot 3cm at .5cm/s, swing forward at 4 degrees per ms, …” etc. And a precise process for walking into the kitchen to get the milk gets even bigger and harder once you have to write all the edge cases (“if cat present, then…”). It’s not really feasible.
Intent is higher density, by a lot, but the underlying system has to be capable of understanding and executing that intent. That’s where there are some opportunities in recent LLMs like GPT-4. But there are still challenges here. One of them is precision: the models are very rich, and very capable, but not always repeatable, and not always accurate (famously, they hallucinate).
At the intersection of these two ideas is an approach to designing programs and UX that sounds simple, but is probably as foundational as “click and drag” has become: iteration. We aren’t used to designing this way - oh, sure, we iterate a lot on our own content. We rewrite posts and images, try things and back out and go again.
But we’re not used to doing that with something we think of as a program. You don’t get to change the icon you clicked on if it doesn’t do the right thing - it’s static, it either works or it doesn’t. But in the realm of language or meaning, things aren’t that simple. Maybe you didn’t express yourself well. Maybe the compression worked too well and you need to clarify (“which milk? Oat or Almond?”). Maybe you were even mistaken about what you wanted and need to correct course (“Actually, maybe I’ll have tea”).
There will almost certainly be best practices that emerge, just as there have been for click-and-drag interfaces. There will be a balance between giving the user breadth and giving guidance, just like there is a balance between having lots of functionality and having too cluttered a graphical design - and there will probably be both good and less good practitioners of design in this space of iteration.
I believe that most of what we will wind up doing eventually will be “talking” to the computer - where talking right now is mostly defined as “text chat” but will shortly be multimodal: images, gestures, voice, etc. As those interactions get more complex, it will be harder and harder to build “rigid” interfaces like we do now. Why build a bunch of static buttons that reflect an underlying, rigid schema somewhere, when you can let the user tell you what they want, find the result in a vector database, and iterate together?
Intent and Iteration will be a foundational metaphor for user experience in the next wave of software. We got click-and-drag windows interfaces when the tech was advanced enough to give us high resolution screens and fast processors that could handle the real-time interaction. We now have new capabilities that let us handle interaction with intent and meaning in real-time. Now it’s time to build the experiences that are native to those capabilities.
I wonder which skeuomorphism is best suited to guide users from interface to intent?