I wrote last time about leaving the Typewriter behind. Folks seem to understand the idea and it resonates with them, at least in principle. I’ve been thinking more about it, and there are some challenges and some surprises.
What is a document? We approach one by first thinking about its subject - that is, where the document sits in “semantic” space. The process of writing is then to draw a boundary around some very complex conceptual realm. We wind up, by definition, with just an approximation of the more complex underlying subject.
Perhaps documents become less “rigid containers with fixed boundaries”, and more “semantic clusters or locations” that are fuzzy. An example a friend gave was their father’s medical information. At some level, that can be well-bounded data - results, current medications, transcripts of doctor conversations, etc. But health is very open ended - what my friend really wants is a way to gather and interact with all aspects of their father’s health, in many contexts - what to cook given dietary issues, whether a vacation or activity can work, which specialists might help, what patterns there are, etc. Some of these topics bleed out of the fixed boundaries.
This is a familiar problem for programmers - schema design. Whenever we build data structures, we have to figure out a schema that approximates the world we want to model. Lots of bugs and complex code come from these approximations not fitting well. AI, and LLMs in particular, might give us a chance to have a much “higher resolution” model of the world - one that is necessarily iterative and interactive, but that fits more closely. It’s like we invented floating point for cognition.
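To ground the schema analogy, here is a minimal, hypothetical sketch - the PatientRecord class and its fields are invented for illustration, not taken from any real system. A rigid schema captures one slice of the domain, and whatever doesn’t fit is either crammed into free text or dropped:

```python
from dataclasses import dataclass, field

# A deliberately rigid, hypothetical schema for "a father's health".
# Every field is an approximation of a much messier underlying subject.
@dataclass
class PatientRecord:
    name: str
    medications: list[str] = field(default_factory=list)
    lab_results: dict[str, float] = field(default_factory=dict)
    notes: str = ""  # everything that doesn't fit the schema lands here

record = PatientRecord(
    name="Dad",
    medications=["metformin", "lisinopril"],
    lab_results={"a1c": 6.8},
    # "Can he handle a week of hiking at altitude?" has no field at all,
    # so it gets flattened into free text and lost to structured queries.
    notes="Avoid high-sodium meals; gets dizzy on long walks.",
)
print(record)
```

The point isn’t the particular fields; it’s that anything outside the schema stops being something you can query or reason over.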
And there is something about that increase in resolution and flexibility that’s really important. We saw it first with the PC and then with the smartphone: if a device is powerful enough and open enough, it becomes a very useful general purpose tool. A PC or a smartphone can do almost anything - anything that can be digitized. AI may extend this into the realm of thinking and meaning. It’s a general purpose tool for thinking, the way a PC was a general purpose tool for processing and rendering.
And our tools need to evolve to reflect that flexibility in the new realm. This happened with documents and the internet. How annoyed are you now if someone sends you a complex document as an attachment? Particularly if you need to work with them on it in some way. There are industries still clinging to that mode, but they seem increasingly outmoded. The same thing is going to happen with interaction. It’s going to seem odd and old-fashioned very soon if you can’t interact meaningfully with a “document” - if that document is limited to a specific set of words, a specific presentation, a specific form. This is more than just a copilot you can ask questions of - the entire form and use of the document will be open to this semantic realm, fluid and adaptive in an intelligent way.
What does this look like, and how do we bring users along for the ride? I don’t know, yet. I am experimenting with lots of ideas. Some of them annoy and irritate people, which is great - I am looking for that disruption signal I’ve written about before. There are going to be hard challenges to pull this off, just like there were the last time. Some of the last ones were purely technical, like getting collaboration to work well, and some were more product design, like getting sharing and permissions to make sense and be secure. It’s likely we will have the same mixture of challenges here - some technical, like how to describe the presentation layer in a way that LLMs can reliably make use of, and some product, like how users learn to define and rely on trust boundaries.
Writing and communicating are so basic to what we do. It’s funny to see folks immediately reach for “I can use an LLM to make my document more easily”. That feels a bit like early TV, when they just put radio shows on the air unmodified. Sure, you can do that, but there are much more interesting things to do at the intersection of LLMs and “documents” than just making the old version more quickly.
This is still out there, and it will change how we work. It’s hard to imagine working without the internet now, and sharing and collaborating are part of everyone’s daily flow. The day is coming when that will be true of AI - it’s going to be hard to imagine working without it.
I suspect we're clinging too tightly to the word/concept of document. It feels like you're talking about a new thing that is distinct from (but related to) a traditional 'document'.
What if this new thing were some 'volume' in semantic space that covered some amount of a domain, with a fuzzy boundary and evolving content within? A traditional document is probably a slice through this volume, or a view or projection of some subset of the volume, into a concrete artifact that reflects its current state. Interactions with (changes to) the document can affect the volume, but in many ways this volume is a new entity, and our interactions with it are not limited to these 'document' views. Certainly we will try to discover new views of and interactions with these volumes, but I think it's a mistake to think of the volume as an evolution of the 'document'. Let the volume be its own new thing, and let 'document' remain close to what it means today.
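To make that 'volume' framing concrete, here is a toy sketch under the same caveat - SemanticVolume, Fragment, and project are invented names, not any real API. The volume accumulates loosely bounded content over time, and a traditional 'document' is just one projection of a chosen subset at a moment in time:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Fragment:
    """A piece of evolving content that lives inside the volume."""
    topic: str
    text: str
    added: datetime = field(default_factory=datetime.now)

@dataclass
class SemanticVolume:
    """A fuzzy region of semantic space; content accrues, boundaries stay soft."""
    name: str
    fragments: list[Fragment] = field(default_factory=list)

    def add(self, topic: str, text: str) -> None:
        self.fragments.append(Fragment(topic, text))

    def project(self, topics: set[str]) -> str:
        """A 'document' as a projection: a concrete slice of the volume
        for a chosen set of topics, frozen at the moment it is rendered."""
        lines = [f"# {self.name} ({datetime.now():%Y-%m-%d})"]
        for frag in self.fragments:
            if frag.topic in topics:
                lines.append(f"- [{frag.topic}] {frag.text}")
        return "\n".join(lines)

health = SemanticVolume("Dad's health")
health.add("medication", "Started metformin in March.")
health.add("diet", "Low sodium; avoid grapefruit with current meds.")
health.add("travel", "Gets dizzy on long walks at altitude.")

# One traditional 'document' is just the diet-and-travel view, right now.
print(health.project({"diet", "travel"}))
```

The interesting part is that the volume keeps living after the projection is rendered; the printed page is already stale.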
One thing that I've always found weird is that we need to refer to documents for them to be useful. Someone has to discover, understand, and then recall them when they are relevant. Otherwise they are just sitting there, slowly adding to our cloud storage bill.