The ‘secretary problem’ is too bad a match for real life to usefully inform our decisions — so please stop citing it

By Robert Wiblin

Mrs Maisel waits to apply for a job in The Marvellous Mrs Maisel season 1 episode 5.

You have probably heard of the ‘secretary problem’, also called the ‘marriage problem’ and the ‘fussy suitor problem’.

If not, you can read an explanation here.

The problem as presented is just an approximation of real life, designed to be easier to solve. Nonetheless, from time to time I have seen people attempt to use it as a guide for decision-making about things such as hiring, finding a job, or dating.

I am not one to bash a model just because it doesn’t match real life exactly. All models must simplify in order to be useful and illustrate their point, and I’m sure work on this problem helped ‘optimal stopping’ research on its way.

Unfortunately, the secretary problem is such a poor approximation of real life that we should not see it as useful for informing our actual decisions.

I came to this conclusion while preparing for a long interview with the author of Algorithms to Live By, Brian Christian.

As a reminder, here is how the problem is set up:

  • You want to fill one position (e.g. a role you are hiring for at your company).
  • You know exactly how many applicants exist.
  • You get to evaluate them one by one, in random order, learning only how each person ranks relative to the candidates seen so far.
  • You want to choose the best person in the sample. Every other outcome except choosing the very best person is equally bad.
  • After evaluating a person you can offer them a job. But if you reject them at that point, you can never come back to them.

The optimal solution when you have a large sample of applicants is to just observe for the first 36.8% of the sample (so-called ‘exploration’), then choose the first person who is better than anyone you saw in that first 36.8% (‘exploiting’ what you learned).

Your chance of choosing the best applicant is also 36.8% (1/e), a pleasing symmetry that has made the problem more popular to teach.
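For anyone who wants to check those two figures, here is a minimal Python sketch of the rule as set up above. The function names, the choice of 100 applicants and the number of simulated trials are my own illustrative assumptions, not anything taken from the literature on the problem.

```python
import random


def exact_success(n, skip):
    """Exact chance of ending up with the best of n when the first `skip` are only observed."""
    if skip == 0:
        return 1 / n  # no exploration: you simply take the first candidate
    return (skip / n) * sum(1 / k for k in range(skip, n))


def simulate_success(n, skip, trials=20000, seed=0):
    """Monte Carlo estimate of the same probability (illustrative settings)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))  # n - 1 is the best candidate
        rng.shuffle(ranks)      # candidates arrive in random order
        benchmark = max(ranks[:skip], default=-1)
        for r in ranks[skip:]:
            if r > benchmark:   # first candidate better than everyone observed
                wins += r == n - 1
                break
        # if nobody beats the benchmark, the best was rejected: counts as a loss
    return wins / trials


if __name__ == "__main__":
    n = 100
    for skip in (5, 37, 90):
        print(skip, round(exact_success(n, skip), 3), round(simulate_success(n, skip), 3))
```

With 100 applicants, skipping 37 should come out close to the 36.8% quoted above, while skipping far fewer or far more should do noticeably worse.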

So should we spend the first 36.8% of our adult lives dating casually, and then settle down with the first person we find who’s better than anyone we’ve dated so far? That would suggest men start seriously looking for a life partner at 39 — and women at 41.

The reason the problem spits out this questionable answer is obvious: every single part of this setup is a bad match for reality. Imagine a case where you’re actually deciding whether to hire or marry someone. Here are some differences:

  • People aren’t only ranked by how desirable they are: they have ‘cardinal’ or relative goodness. You care not only about whether candidate X is better than candidate Y, but also about how much better they are.
  • You don’t only care about maximising the probability of choosing the best person in a sample, with every other outcome being equally bad. Rather you want to do something like maximise the desirability (or at least rank) of the person you get.
  • You can often go back to previous candidates even if you didn’t immediately accept them.
  • You can cheaply find out a lot about the distribution of candidates you’re drawing from by just observing the world around you, asking other people in your industry, and so on. If you’re an adult and have some relevant experience, your general knowledge means you probably always start out with more than 36.8% of the information you could plausibly get by testing people out one by one.
  • How many candidates are there, and do they have to be plausible candidates? We don’t know. In a big city, there might be more possible job applicants than you could test in a lifetime. So maybe we should think about it in terms of how much time you have available to test people rather than how many candidates exist.
  • In that case, how long does it take to test each person? Is one date / job interview enough? Or do you need to go out / work with them for months? This isn’t specified either.
  • In reality, you don’t get a perfect measurement of how good any particular person is — and people who look great probably aren’t quite as good as they initially seem. The longer you test someone, the more precise your measurement becomes.
  • But testing each additional person is costly in itself, sometimes very costly if you need a role filled quickly.
  • Depending on the situation, you may be able to create more than one position if a second outstanding job applicant comes along. Or you can fire a bad hire and search again.
  • In the dating case, you may be becoming more or less attractive to other people over time as you get older, or stop being able to have children.

Each of these deviations from real life is a big deal that could materially affect the answer. Some push for more exploration, others for less exploration, and for some I’m not even sure which way they push. Adding them all together, I have no idea where things would ultimately land.
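To give a feel for how much even one change can move the answer, here is a rough Python sketch of a single variation: you care about the average quality of the person you end up with, rather than the chance of getting the very best. It rests on assumptions of my own that are not part of the original problem: each candidate’s quality is an independent uniform draw between 0 and 1, and if nobody beats your benchmark you are stuck with the last candidate. The function names are hypothetical too.

```python
import random


def chosen_value(n, skip, rng):
    """Quality of the person picked by 'observe `skip`, then take the first who beats them'."""
    values = [rng.random() for _ in range(n)]  # assumed: uniform qualities on [0, 1)
    benchmark = max(values[:skip], default=float("-inf"))
    for v in values[skip:]:
        if v > benchmark:
            return v
    return values[-1]  # assumed fallback: nobody beat the benchmark, take the last person


def mean_chosen_value(n, skip, trials=20000, seed=0):
    """Average quality of the chosen candidate over many simulated searches."""
    rng = random.Random(seed)
    return sum(chosen_value(n, skip, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    n = 100
    for skip in (5, 10, 20, 37, 60):
        print(skip, round(mean_chosen_value(n, skip), 3))
```

Under these assumptions, a much shorter exploration phase than 37 out of 100 gives a higher average quality, because a long exploration often leaves you with whoever happens to come last. That says nothing about where the answer lands once the other differences are added in.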

Some of these departures from reality have been dealt with by later modifications of the model. For example, mathematicians have solved the situation in which you have the ability to make offers to previous candidates, with some probability of being accepted, and the scenario in which it’s costly to test each extra person. Both are nicely described in the first chapter of Algorithms to Live By.

That’s great and I’m happy to see this research advancing, but fixing one of these deviations at a time still isn’t sufficient to make the model good enough to use in real life.

For the advice coming out of this model to beat a very realistic alternative (following conventional wisdom or your own common sense), we’ll need to deal with many of them all at once.

The secretary problem does demonstrate the general principle that in life we should spend some time exploring, and then some time taking advantage of what we’ve learned.

But to get practical guidance about how long to spend exploring before making a choice, we’ll need a more complex model than this one.