- Necessarily-shared technical artifacts, specifically the language definition itself.
- The strains on people participating in conversations about those artifacts.
Note that I am not writing about limits to the growth of many other areas of the project. It may be possible to have (say) a package index that is too big, or a website that is too big, or maybe even a user community that is too big; but it's not clear that those are problems yet for Rust, and I'm presently only talking about the two specific areas above.

Every natural system has factors that limit its growth. That is why the universe is not (for example) a single amoeba expanding outwards at the speed of light. A system grows (and often its rate of growth grows too!) until it starts to encounter limiting factors, and then one by one the rate-of-rate-of-change, and then the rate-of-change, and finally the overall size of the system eventually all plateau. Typical growth patterns thus look something like a sigmoid or "S-curve", gradually approaching some asymptote. At least when the limiting factors are encountered in a gradual or controlled fashion.
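
To make the S-curve concrete, here is a toy sketch of logistic growth in Python. It is only an illustration: the function name `logistic_growth`, the update rule, and all the numbers are assumptions chosen for the example, not a model of the Rust project. With a small growth rate the size creeps up to the limit and levels off; with a large one, the same rule runs past the limit and bounces around it.

```python
def logistic_growth(initial, rate, capacity, steps):
    """Toy discrete logistic model (an illustrative assumption, not a measurement).

    Each step adds rate * size * (1 - size / capacity): growth is roughly
    exponential while the system is small and fades as it nears the limit.
    """
    sizes = [initial]
    for _ in range(steps):
        size = sizes[-1]
        sizes.append(size + rate * size * (1 - size / capacity))
    return sizes


# Gradual encounter with the limit: a small rate gives the classic S-curve.
controlled = logistic_growth(initial=1.0, rate=0.5, capacity=100.0, steps=40)

# Abrupt encounter: a large rate overshoots the limit and then oscillates.
abrupt = logistic_growth(initial=1.0, rate=2.5, capacity=100.0, steps=40)

print("controlled peak:", max(controlled))
print("abrupt peak:", max(abrupt))
```

The only difference between the two runs is how fast the system is moving when it meets the limit; the limit itself is identical.
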
When a system encounters its limits in an uncontrolled or abrupt fashion, a phenomenon can occur that is more like an overshoot, or even oscillation: the limit still exists but its effect is felt more in the form of collapse or crisis. The S-curve goes up to a peak, followed by a crash on the other side. That is what you don't want.

The Rust project has a few forms of process control -- that are, in essence, limits on rates of change and/or growth -- that I think are very judicious, and in part responsible for the success of the project so far. I'd like to couch the following recommendations by analogy to them. These process controls are:
- The Bors queue, generally gating changes on program-wide correctness.
- Crater runs, generally gating releases on ecosystem-wide correctness.
- Time-based releases, generally avoiding having to consider whether to slip a schedule rather than cut a feature. The decision is made by the clock, and everything not-ready gets cut.
- The CoC. It's not always remembered, but the CoC does not just talk about social justice and harassment and so forth. It also lays out boundaries around the signal:noise ratio in conversation, the use of other people's attention and time, and the need to accept tradeoffs (I wish I'd had the foresight to use the term "zero-sum" here because I should have: not every decision is non-zero-sum).
- The RFC process. This includes rules about the form, content, timing, set of participants and permitted and expected forms of discourse when discussing significant changes.
- The governance model. This includes delineation of areas of responsibility, hierarchical delegation where necessary, roles and expectations of participants and so forth.
- The language itself. Its definition. This is (unlike many parts of the project) a necessarily shared technical artifact. Everyone has a stake in it, and every change potentially affects everyone. Moreover, everyone needs to learn and understand a substantial part of it: people do not have the option to ignore parts they are not interested in. Even parts one wants to ignore will occur in shared contexts: documentation and teaching material, test suites and validation material, compiler internals, formal models, other people's codebases, overall maintenance burden, etc.
The limiting factors to the growth of the language as an artifact then, include at least the following:
- The ability for a beginner to learn the language.
- The ability for an intermediate user to feel confident, adapt to others' codebases.
- The ability for an expert or maintainer to know all (or most) of it, evaluate changes.
- The cost:benefit ratio of each new change, in terms of new and ongoing work that it incurs, vs. the number of people or use-cases that benefit from it. The costs are combinatorial with many dimensions of the project and language size, and almost always increase.
- Unsound language changes. Say: failure to maintain a critical safety guarantee.
- Reputation for overcomplexity, loss of users. Becoming the next C++ or Haskell.
- Lower-quality features with incomplete definition, testing, documentation.
- Under-used features that simply absorb effort needed elsewhere.
- Fragmentation into dialects, unsharable codebases, lower value.
- The strains on the people working on the language. Some parts of the project can be delegated, de-synchronized, proceed in parallel with as many hands as are available to work on them. Not so the shared technical artifacts. To some extent, many people (and increasingly many people) need to be involved in nearly all changes, and that means that there's a lot of pressure both for everyone in that group-of-many to "keep up" with all the discourse occurring, and for the standard of what it means to "keep up" to gradually creep upwards as both more changes are proposed, and more voices contribute to each discussion.
The limiting factors to the growth of these strains on participating people include at least the following:
- The number of hours in a day.
- The number of paid hours in a day.
- Responsibilities and interests in things outside the project.
- Reservoirs of mental energy to understand what's being discussed.
- Trust in the judgement of everyone participating in a conversation.
- Reservoirs of psychological and emotional energy to read and discuss.
- The presumption of good faith in everyone participating in a conversation.
- Poor decisions made through exhaustion or attrition.
- Magnifying inequality: only the most-privileged, available, energetic, well-paid or otherwise well-situated participants can keep up.
- Narrowing of discourse, from careful consideration to "winning arguments".
- People acting out, burning out, behaving poorly, departing the project.
- Frustration, accusation of bad faith, grudges, conspiratorial thinking, forks.
- Embrace negative space. Make a process for defining features and concepts out of the future trajectory of the language. Allow (or encourage) RFCs that say "Rust will never have X" for some value of X. While this sounds "negative" -- and it is, the word is written on the label -- it's a one-time thing where objections can be honestly considered with a long-term horizon ("never" is quite some time!) and given fair discussion, but then put to rest rather than being a perennial source of lingering conflict. A few examples where one might want to find and articulate negative spaces: paint certain categories of expressivity permanently out of the type system (eg. dependent types, HKTs, etc.), or out of the grammar (eg. parameter qualifiers, positional or keyword args), or out of the set of kinds of item (eg. anonymous record types), or out of the set of inference obligations in the middle-end (eg. constant value synthesis, implicit arguments). Put some hard limits in place, both to avoid those features themselves, and also to avoid people "putting pieces in place" that exist only to eventually-enable them.
- Front-load costs, make them explicit. Taking a page from WebAssembly's change process, make it clear that moving an RFC past a very early phase is going to require a commensurate investment in implementation, formalization, documentation revision, teaching-materials revision, test-writing, maintenance and so forth. Absent a way to cover the costs, defer changes "nobody has yet been found to pay for" at that stage.
- Set a learning-rate and book-length target. Try to work backwards from the amount of time and number of pages it "should take" to learn the language, or become expert in it, then cut things that go beyond that. If "teach yourself Rust in 21 days" isn't going to work, figure out what should. Three months? Six? A year? Think about languages that "definitely take too long", and pick a number that's less than that. Is a thousand page manual healthy? Five hundred? Three hundred?
- Set other mechanized limits: lines of code in the compiler, overall time to bootstrap, dollars spent per day on AWS instances, number of productions in the grammar, number of inference rules in the type system, percent test coverage, percent of documents allowed to be marked as "incomplete", etc. Get creative, figure out meaningful things you can measure and then put mechanisms in place to limit them.
- Rate-limit activity based on explicit personal-time-budgeting: ballpark-figure how many hours (or paid hours) per person are realistically available without exhaustion or burnout -- including the least-privileged but still necessary participants -- and work backwards to figure out how many hours of participation and review "should be" digested per team, per release cycle, thus how much work gets scheduled. Then cut or defer things that go beyond that.
- Allow the moderator team to apply rate limits or cooling-off periods to particular discussions as a whole. Sometimes an outside perspective that a discussion is just too heated overall is an easier way to de-escalate than shining a spotlight on a single person's behaviour.
- As with the moderator team: grow an additional cross-project team that does budgeting and auditing of load levels in other teams. This can be effective because the auditing/budgeting team sees their work as helping people to say no to the right things, rather than the default stance most people will have when participating in a team, to say yes to too many things.
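
The personal-time-budgeting idea above comes down to simple arithmetic, which can be sketched roughly as follows. Everything here is hypothetical: the function `plan_cycle`, the participant names, the hour figures, and in particular the assumption that every team member has to digest every scheduled proposal, so that the budget is set by the person with the fewest sustainable hours rather than by the team total.

```python
def plan_cycle(sustainable_hours, proposals):
    """Split proposals into scheduled and deferred for one release cycle.

    sustainable_hours: hours each person can give per cycle without burning
        out (hypothetical figures supplied by the team).
    proposals: (name, review_hours) pairs in priority order, where
        review_hours is the estimated time EACH member must spend on it.

    Assumption: everyone reviews everything, so the budget is the minimum
    of the per-person hours, not the sum.
    """
    budget = min(sustainable_hours.values())
    scheduled, deferred = [], []
    for name, review_hours in proposals:
        if review_hours <= budget:
            scheduled.append(name)
            budget -= review_hours
        else:
            # Does not fit this cycle: defer it rather than squeeze it in.
            deferred.append(name)
    return scheduled, deferred


# Hypothetical team: one well-resourced member, one with very little time.
hours = {"full_time": 20, "volunteer": 6, "part_time": 12}
queue = [("proposal_a", 3), ("proposal_b", 4), ("proposal_c", 2)]

scheduled, deferred = plan_cycle(hours, queue)
print("scheduled:", scheduled)
print("deferred:", deferred)
```

Taking the minimum is the design choice that matters: budgeting against the total or the average would quietly assume that the least-available participants can be left behind.
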
Happy new year, and good luck!