Given how much time I spend producing text, I've spent shockingly little of it considering the tradeoffs between the different ways of inputting it.
I haven't been completely mindless, but text input is the vast majority of my work so even a small improvement could yield a significant upgrade. A plane ride's worth of research and reflection seems like a worthy investment.
So what are the key criteria that matter for text input?
Speed: I don't want my input device to be the bottleneck (especially important when writing 🖊️)
Navigation: Searchable and editable (especially important when programming 👩‍💻)
Access: Mobile and easy to alter no matter where you are (especially important for on-the-fly notes 🚶‍♀️)
Courtesy: Private, silent, and appropriate (especially important when in a shared environment 🙊)
I'll go through each one in a bit of detail, but here is a summary of my takeaways: (Sorry, the table doesn't render great on mobile...)
Average speed for adult
Pros and cons
Printing: 13 wpm, with a range of 5-20 wpm
Cursive: ~17 wpm (about 1/3 faster than print form)
✅ Easy to carry around or find a pen+paper almost everywhere
✅ More appropriate for taking notes in meetings
✅ Doesn't require a charge
❌ Not searchable or easily editable in-place later
❌ Usually requires transcription for publishing purposes, which further reduces the effective rate
Thumbs: 11-15 wpm seems like a standard speed
Swype: Roughly the same as thumbs! The creators claim that a veteran swypist can reach 50 wpm, so perhaps the ceiling is higher
✅ Almost constant access to phone
❌ Inappropriate to take notes in meetings, looks like you're just screwing around
❌ Longer lag time to open apps, etc
✅ Easier to refer to notes and online resources (which can be a blessing as much as a curse)
❌ Easier to get distracted by other things going on
Speaking (free-form, i.e. not reading from existing text)
Audiobooks: 150–160 wpm recommended
Slide presentations: 100–125 wpm recommended
✅ No way to self-censor, which can be especially useful when writing a draft and you just need to get the ideas out
❌ Words more likely to be fillers, less "punch" per word
❌ Hard to navigate and to edit the text
❌ Not appropriate in many shared spaces
❌ Somewhat awkward at first
Reading aloud: 184±29 wpm
❌ Limited usefulness in this context, just including this for reference
My first dimension of comparison for various modes was speed, but then I realized that's just one of many tradeoffs, whose relative priority is totally context-dependent. Speed is just the easiest to measure.
So when is speed most important?
For me, slow input can be the bottleneck of getting down notes and rough drafts, which can be enough to cause me to lose the thought entirely.
It is less of an issue when programming, however; there the limiting factor is how quickly I can think through the next step of the problem while holding all of the state in my head. Faster input is always better, of course, but it's not so critical here.
It's worth noting that the speed test is at best a rough proxy for what we actually want. It assesses the speed of transcribing a stream of random words from a screen, which is not the same task as composing text. When you're formulating new ideas, there are a lot more stops and starts that slow down output, and they have almost nothing to do with the input itself and everything to do with the speed of your thought process. But it does give us a useful point of comparison for where we might hit a choke point, and with handwriting I often feel like I'm running up against the limit of how fast I can write.
Takeaway: When just getting words out or generating content for something like a rough draft, audio notes are the way to go when possible. Typing is also fine when the environment isn't conducive to voice-to-text, but I should avoid handwriting if I'm going for a brain dump because it's so much slower.
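To put rough numbers on that gap, here is a small sketch that converts the speeds from the summary into minutes per draft. The 500-word draft length is a made-up figure for illustration, ranges are collapsed to their midpoints, and the speaking rates are the recommended paces for presentations and audiobooks rather than measured dictation speeds. As noted above, words per minute is only a proxy, so treat the output as a best case for each mode.

```python
# Average speeds (words per minute) quoted in the summary above.
# Ranges are collapsed to their midpoints, which is a simplification.
SPEEDS_WPM = {
    "printing": 13,
    "cursive": 17,
    "thumbs": (11 + 15) / 2,
    "speaking (slide pace)": (100 + 125) / 2,
    "speaking (audiobook pace)": (150 + 160) / 2,
}

DRAFT_WORDS = 500  # hypothetical draft length, not a figure from the post


def minutes_to_write(words, wpm):
    """Minutes needed to produce `words` at a steady `wpm`, with no pauses."""
    return words / wpm


def draft_times(words=DRAFT_WORDS, speeds=SPEEDS_WPM):
    """Map each input mode to the minutes it would take, rounded to 0.1."""
    return {mode: round(minutes_to_write(words, wpm), 1)
            for mode, wpm in speeds.items()}


if __name__ == "__main__":
    # Print the modes from fastest to slowest.
    for mode, minutes in sorted(draft_times().items(), key=lambda kv: kv[1]):
        print(f"{mode:<28}{minutes:>6.1f} min")
```

Under these assumptions a printed draft takes roughly ten times as long as a spoken one, which matches the feeling of running up against my handwriting speed.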
Some tasks require more fine-grained control of where you are in a body of text. For these, precise movements are more important than a flow of content generation. This tends to happen when the thing you're working on has a complex, non-linear structure.
So when is navigation most important?
Programming is as much about reading and editing text as it is about writing it, so it's important to be able to navigate it fluidly.
Writing by comparison is much more linear, so navigation isn't quite so critical. (Again like all of these dimensions it's generally better to have increased ease of navigation than not, but its relative importance is lower.) However once I'm ready to edit a draft I've produced, navigation becomes more important again because I have to peck around and make small insertions and deletions here and there.
Typing with a full keyboard easily wins. I do most of my writing inside of Evernote; while I jot notes on mobile and have gotten into the habit of using voice-to-text to compose drafts, I always move to the Mac app when I'm in editing mode. Mobile editing is a total nightmare, and it's downright infeasible with an audio note. With a full keyboard, shortcuts plus a mouse make jumping around incomparably better. That's especially true in code editors, which further optimize for this use case with fancy menus, minimaps, extensible keymaps, and more. I personally use VSCode, which has awesome tooling for skipping around a code base, and it makes me shudder to consider writing code in something that doesn't have those features. I can't even imagine doing so with voice-to-text, with my thumbs on a mobile device, or with handwriting.
Takeaway: Whenever I'm doing something navigation-intensive I should move to my computer, but I can still generate drafts and notes via alternative input modes that do better on other dimensions. For instance, I can record an audio note to compose a first draft of a blog post, then polish it up when I get back to my computer. This isn't exactly new information for me, but I will sometimes lazily edit something on my phone just because it's in reach, when I really should just get up and grab my computer.
For creative work you can't really predict or control when you'll have a flash of inspiration, so having convenient methods to jot down thoughts as they come can be the difference between acting on it and losing it.
So when is access most important?
You may want to write without being tied to your desk. The downside of mostly working in text is that you find yourself stuck on your butt all day long. I for one do a lot of my thinking while walking, which isn't so amenable to keyboard typing or handwriting.
When you need a small form factor
Something that doesn't require a charge can be useful
Easy to pull out at a moment's notice, perhaps at a meeting
I've cobbled together a few solutions, but they're each quite context dependent:
Voice-to-text and mobile typing work well for when you're walking (and voice-to-text also works when you're biking... don't tell my mother), though it does limit me to more of a content generation mode than an editing/refining one, and I can't code at all from my phone.
Handwriting in a notebook checks some of these boxes, but doesn't work so well when you're moving around.
A few months ago I got an exercise bike desk, which lets me type while cycling. This is a good win for being more active as I work, but doesn't address the other issues.
This dimension, access, seems to be most at odds with the other priorities, too. It seems like a hard design problem to make something that's both portable and expressive in the way a great desktop code editor is.
I don't have any great takeaways for how I should change my approach here, but I'm all ears if you have better solutions!
One factor I nearly forgot to take into account but which shapes my choices a lot is courtesy. Voice-to-text has become my new favorite input method, but I rarely use it in certain contexts simply because it's quite annoying to listen to another person speak into their phone in monotone. Similarly, I never take meeting notes on my phone not because it's particularly bad on the other three dimensions—in fact it scores quite well!—but because it inevitably looks like I'm just screwing around with Candy Crush or checking my email, even if I'm completely focused on what my colleagues are saying.
Courtesy generally serves as a constraining factor rather than a goal to build towards. One strategy I've developed to make voice-to-text seem less odd to bystanders is to hold my phone up to my ear as if I were on a call (rather than talking into it like a microphone, as most people do with voice-to-text). Somehow the norms for calls are different from those for audio recordings, despite the two being all but indistinguishable to an uninterested onlooker, so this lets me get away with more audio input than I'd feel polite doing if it were obvious.