Romantic courtship is often described as taking place in a dating market where men and women compete for mates, but the detailed structure and dynamics of dating markets have historically been difficult to quantify for lack of suitable data. In recent years, however, the advent and vigorous growth of the online dating industry has provided a rich new source of information on mate pursuit. We present an empirical analysis of heterosexual dating markets in four large U.S. cities using data from a popular, free online dating service. We show that competition for mates creates a pronounced hierarchy of desirability that correlates strongly with user demographics and is remarkably consistent across cities. We find that both men and women pursue partners who are on average about 25% more desirable than themselves by our measures and that they use different messaging strategies with partners of different desirability. We also find that the probability of receiving a response to an advance drops markedly with increasing difference in desirability between the pursuer and the pursued. Strategic behaviors can improve one’s chances of attracting a more desirable mate, although the effects are modest.
It is a common observation that marriage or dating partners strongly resemble one another in terms of age, education, physical attractiveness, attitudes, and a host of other characteristics (1). One possible explanation for this is the matching hypothesis, which suggests that men and women pursue partners who resemble themselves. This in turn implies that people differ in their opinions about what constitutes a desirable partner or at least about who is worth pursuing. At the other extreme, and more in line with biological studies of mate selection (2–4), lies the competition hypothesis, which assumes that there is consensus about what constitutes a desirable partner and that mate seekers, regardless of their own qualifications, pursue those partners who are universally recognized as most desirable (5–8). Paradoxically, this can also produce couples who resemble one another in terms of desirability, as the most desirable partners pair off with one another, followed by the next most desirable, and so on. To the extent that desirability correlates with individual attributes, the matching and competition hypotheses can, as a result, produce similar equilibrium patterns of mixing (5, 9, 10).
However, while the two hypotheses may produce similar outcomes, they carry very different implications about the processes by which people identify and attract partners. If there is consensus about who is desirable, then it creates a hierarchy of desirability (11–13) such that individuals can, at least in principle, be ranked from least to most desirable, and their ranking will predict how and to what extent they are pursued by others. Historically, however, these hierarchies have been difficult to quantify. Since they reflect which partners people pursue, and not just who people end up with, one would need a way to observe unrequited overtures and requited ones to determine who people find desirable. Online dating provides us with an unprecedented opportunity to observe both requited and unrequited overtures at the scale of entire populations.
As data from online dating websites have become available, a number of studies have explored the ways in which mate choice observed online can inform the debate about matching versus competition. These studies typically focus on how specific attributes of individuals shape their browsing and messaging behavior. The results indicate that, with respect to attributes such as physical attractiveness and income, people tend to pursue the most attractive partners (11, 13, 14), while for other attributes, such as race/ethnicity or education, the overwhelming tendency is to seek out someone similar (15, 16). Thus, people compete on some attributes and match on others. While these studies provide valuable insights about matching and competition on an attribute-by-attribute basis, they do not capture the overall dating hierarchy that reflects total demand for each person in the market.
Here, we report results from a quantitative study of aspirational mate pursuit in adult heterosexual romantic relationship markets in the United States, using large-scale messaging data from a popular online dating site (see the “Data” section). We provide a crisp, operational definition of desirability that allows us to quantify the dating hierarchy and measure, for instance, how far up that hierarchy men and women can reach for partners and how reach is associated with the likelihood of getting a response. We also explore the ways in which people tailor their messaging strategies and message content based on the desirability of potential partners, and how desirability and dating strategy vary across demographic groups.
To study individual desirability, we focus on messages between users of the website in four cities: New York, Boston, Chicago, and Seattle. At the simplest level, one can quantify desirability by the number of messages a user receives and specifically the number of initial messages, since it is the first contact between a pair of individuals that most reliably indicates who finds whom attractive. Figure 1 shows the distribution of this quantity separately for men and women in each of the cities. The distribution is roughly consistent across cities, and although women receive more messages than men overall, the distributions for both display a classic “long-tailed” form—most people receive a handful of messages at most, but a small fraction of the population receive far more. The most popular individual in our four cities, a 30-year-old woman living in New York, received 1504 messages during the period of observation, equivalent to one message every 30 min, day and night, for the entire month.
However, desirability is not only about how many people contact you but also about who those people are. If you are contacted by people who are themselves desirable, then you are presumptively more desirable yourself. A standard measure of this reflected desirability is PageRank (17). Here, we calculate PageRank scores for the populations within each of our four cities (see the “Network analysis” section) and then rank men and women separately from least to most desirable. A scaled rank of 1 denotes the most desirable man or woman in a city by our measure, and 0 denotes the least desirable. It is important to emphasize that, while we use PageRank as an operational measure of desirability, we do not assume that users of the website themselves use PageRank, or anything like it, to identify attractive mates. In reality, a person might choose to message another based on an attractive profile picture, an interesting description, a good demographic match, an impressive income, or any of many other qualities. PageRank scores simply give us, a posteriori, a glimpse of who is desirable on aggregate, by identifying those people who receive the largest number of messages from desirable others.
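As a rough illustration of the scaled rank described above, the following Python sketch converts raw scores into ranks between 0 and 1 for one sex in one city. The function name `scaled_ranks`, the example scores, and the handling of ties are assumptions made for illustration; the text does not say how ties are resolved.

```python
def scaled_ranks(scores):
    """Map {user: score} to {user: rank in [0, 1]}; 0 = least, 1 = most desirable.

    Assumption: ties are broken arbitrarily by sort order.
    """
    ordered = sorted(scores, key=scores.get)  # ascending by score
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 1.0}
    return {user: position / (n - 1) for position, user in enumerate(ordered)}


# Hypothetical PageRank-style scores for three women in one city.
women = {"a": 0.02, "b": 0.10, "c": 0.05}
ranks = scaled_ranks(women)
```

Ranking men and women separately, as the text describes, would mean calling such a function once per sex and per city.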
Once we have our desirability scores, we can use them to identify characteristics of desirable users by comparing scores against various user attributes. As shown in Fig. 2, for instance, average desirability varies with age for both men and women, although it varies more strongly for women, and the effects run in opposite directions: Older women are less desirable, while older men are more so (18, 19). For women, this pattern holds over the full range of ages on the site: The average woman’s desirability drops from the time she is 18 until she is 60. For men, desirability peaks around 50 and then declines. In keeping with previous work, there is also a clear and consistent dependence on ethnicity (15, 20), with Asian women and white men being the most desirable potential mates by our measures across all four cities. The final panels in the figure show how desirability varies with educational level. Desirability is associated with education most strongly for men, for whom more education is always more desirable. For women, an undergraduate degree is most desirable (13); postgraduate education is associated with decreased desirability among women. These measurements control for age, so the latter observation is not a result of women with postgraduate degrees being older (table S2).
We now turn to the central results of our study. First, we use our desirability scores to explore whether people engage in aspirational mate pursuit (that is, messaging potential partners who are more desirable than they are) and how the probability of receiving a reply varies with the difference in desirability between senders and receivers. In Fig. 3, we show statistics for messages sent and replies received as a function of “desirability gap,” defined as the receiver’s desirability ranking minus the sender’s. If the least desirable man in a city were to send a message to the most desirable woman, then the desirability gap would be +1; if the most desirable man sent a message to the least desirable woman, then the gap would be −1.
The upper curves in the top panels of Fig. 3 show the distribution of desirability gaps in our four cities. For each individual, we compute the median desirability gap over all initial messages they send and then plot the probability density of these numbers for men and women separately. The most common (modal) behavior for both men and women is to contact members of the opposite sex who on average have roughly the same ranking as themselves, suggesting that people are relatively good judges of their own place in the desirability hierarchy. The distributions about this modal value, however, are noticeably skewed to the right, meaning that a majority of both sexes tend to contact partners who are more desirable than themselves on average—and hardly any users contact partners who are significantly less desirable. The curves are remarkably consistent across all four cities, with men and women on average sending messages to potential partners who are 26 and 23% further up the rankings than themselves, respectively. A tendency for messages to go to more desirable people is to some extent implicit in the PageRank measure, which often (although not always) rates people who receive a lot of messages as desirable; however, the details of the distribution, including modal value, skewness, consistency across cities, and difference between women and men, are by no means inevitable and contain real information about partner choice and attraction.
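The per-person statistic described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code: the function name, the data layout (a list of sender and receiver pairs plus a dictionary of scaled ranks), and the example values are all assumptions.

```python
from collections import defaultdict
from statistics import median


def median_gap_per_sender(initial_messages, rank):
    """Return {sender: median desirability gap over the initial messages sent}.

    initial_messages: iterable of (sender, receiver) pairs.
    rank: {user: scaled rank in [0, 1]}, computed separately for men and women.
    Gap = receiver rank minus sender rank, so +1 means the least desirable
    sender wrote to the most desirable receiver.
    """
    gaps = defaultdict(list)
    for sender, receiver in initial_messages:
        gaps[sender].append(rank[receiver] - rank[sender])
    return {sender: median(values) for sender, values in gaps.items()}


# Hypothetical ranks and messages, for illustration only.
rank = {"m1": 0.25, "w1": 0.5, "w2": 0.75, "w3": 1.0}
messages = [("m1", "w1"), ("m1", "w2"), ("m1", "w3")]
result = median_gap_per_sender(messages, rank)
```

A positive median indicates a sender who, on balance, writes to people ranked above them.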
The lower set of curves in the top panels shows the probability of receiving a reply to an initial message. The curves are higher for messages sent by women than for those sent by men—women are more likely than men to receive replies—but among both women and men, the probability of a reply is a decreasing function of desirability gap, more desirable partners replying at lower rates than less desirable ones. The differences are stark: Men are more than twice as likely to receive a reply from women less desirable than themselves than from more desirable ones, and for messages sent to more desirable women, the reply rate never rises above 21%. Yet, the vast majority of men send messages to women who are more desirable than themselves on average. Messaging potential partners who are more desirable than oneself is not just an occasional act of wishful thinking; it is the norm.
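The reply curves described above amount to a reply rate computed within bins of desirability gap. A minimal Python sketch follows, assuming each initial message is recorded as a (gap, replied) pair; the function name, the record layout, the bin width of 0.25, and the example records are illustrative assumptions rather than details of the analysis.

```python
from collections import defaultdict


def reply_rate_by_gap(records, bin_width=0.25):
    """Return {bin lower edge: fraction of initial messages that got a reply}.

    records: iterable of (gap, replied) pairs; gap lies in [-1, 1] and
    replied is a bool. bin_width is an arbitrary choice for this sketch.
    """
    sent = defaultdict(int)
    replied_count = defaultdict(int)
    for gap, replied in records:
        edge = (gap // bin_width) * bin_width  # floor to the bin's lower edge
        sent[edge] += 1
        replied_count[edge] += int(replied)
    return {edge: replied_count[edge] / sent[edge] for edge in sorted(sent)}


# Hypothetical records, for illustration only.
records = [(-0.3, True), (-0.4, False), (0.1, True),
           (0.2, False), (0.3, False), (0.4, False)]
rates = reply_rate_by_gap(records)
```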
The bottom panels of Fig. 3 show two further statistics that shed light on the mate-seeking strategies adopted by users of the site. The upper set of curves shows the variation of desirability gaps across the potential partners a person contacts, quantified by the distance between the 25th and 75th percentiles in the distribution of desirability gaps. Conditioned on the number of messages sent, men and especially women who reach higher up the desirability ladder tend to write to a less diverse set of potential matches, in terms of desirability gap. This behavior, consistent across all four cities, indicates that mate seekers, and particularly those setting their sights on the most desirable partners, do not adopt a diversified strategy to reduce the risk of being rejected, as one might, for instance, when applying to universities (21).
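The spread measure mentioned above, the distance between the 25th and 75th percentiles of a sender's desirability gaps, could be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, and the percentile interpolation rule (the "inclusive" method of Python's `statistics.quantiles`) is a choice made here, since the text does not specify one.

```python
from statistics import quantiles


def gap_spread(gaps):
    """Distance between the 25th and 75th percentiles of one sender's gaps.

    Assumption: percentiles use the "inclusive" interpolation rule.
    """
    if len(gaps) < 2:
        return 0.0  # a single message has no spread
    q1, _, q3 = quantiles(gaps, n=4, method="inclusive")
    return q3 - q1


# Hypothetical gaps for one sender.
spread = gap_spread([0.0, 0.25, 0.5, 0.75, 1.0])
```

A smaller value corresponds to the "less diverse" targeting that the text reports for people who reach further up the hierarchy.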
The lower set of curves in the bottom panels shows the average number of messages sent by a woman or a man as a function of average desirability gap. Women initiate far fewer contacts than men, but both sets of curves fall off with increasing desirability gap in all four cities. One might imagine that individuals who make a habit of contacting potential partners significantly more desirable than themselves (large positive desirability gap) would also initiate more contacts overall to increase their chances of getting a reply, but they do the opposite: The number of initial contacts an individual makes falls off rapidly with increasing gap, and it is the people approaching the least desirable partners who send the largest number of messages. A possible explanation is that those who approach more desirable partners are adopting a “quality over quantity” approach, more precisely identifying people they see as an attractive match or spending more time writing personalized messages, at the expense of a smaller number of messages sent.
Do mate seekers put more effort into attracting more desirable partners? On the basis of message content, there is some evidence that they do. In the top two panels of Fig. 4, the upper set of curves shows how the total length in words of initial messages sent varies by desirability gap. Both men and women tend to write substantially longer messages to more desirable partners, up to twice as long in some cases. The effect is larger for messages sent by women than by men, although there are exceptions. Among the groups we study, for instance, it is men in Seattle who have the most pronounced increase in message length (see table S3). [Of the cities studied, Seattle presents the most unfavorable dating climate for men, with as many as two men for every woman in some segments of the user population (fig. S1).]
The lower set of curves in the same panels shows a simple measure of the emotional content of messages, the fraction of positive words [based on the Linguistic Inquiry and Word Count (LIWC) database (22, 23)]. Here, we see an interesting difference between women and men: The women show an increase in their use of positive words when communicating with more desirable partners, while the men show a decrease. The effect size is modest but is consistent across all four cities and statistically significant (P < 0.001; table S4).
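The two message-content measures used here, length in words and fraction of positive words, can be illustrated with a short Python sketch. The word list below is a hypothetical stand-in, since the LIWC dictionaries are not reproduced in this article, and the tokenizer and exact-match lookup are simplifying assumptions rather than the procedure actually used.

```python
import re

# Hypothetical stand-in lexicon; not the LIWC word list.
POSITIVE_WORDS = {"good", "great", "happy", "love", "nice"}


def tokenize(message):
    """Split a message into lowercase words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", message.lower())


def word_count(message):
    """Total message length in words."""
    return len(tokenize(message))


def positive_fraction(message, lexicon=POSITIVE_WORDS):
    """Fraction of words in the message that appear in the lexicon."""
    words = tokenize(message)
    if not words:
        return 0.0
    return sum(word in lexicon for word in words) / len(words)


# Hypothetical message, for illustration only.
example = "Hi, nice profile! I love hiking."
length = word_count(example)
fraction = positive_fraction(example)
```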
The bottom panels of Fig. 4 quantify the payoffs to writing longer or more positive messages, controlling for the desirability gap between senders and receivers (section S3). The expected payoffs for both men and women show a remarkably close match to the messaging behavior depicted in the upper panels. For example, in all four cities, men experience slightly lower reply rates when they write more positively worded messages. Although our analysis cannot reveal the underlying process that gives rise to these behaviors (for example, reinforcement learning), this result may offer a hint about why men tend to write somewhat less positive messages to more desirable partners. Similarly, only Seattle men experience a payoff to writing longer messages—and Seattle is the only city where men write longer messages to more desirable mates. Overall, however, the variation in payoff for different strategies is fairly small, suggesting that, all else being equal, effort put into writing longer or more positive messages may be wasted.
The results presented here provide a picture of the aspirational pursuit of mates in online dating and its implications for the likelihood of success. We present a network measure of desirability in dating that is based on mate-seeking behavior rather than subjective personal qualities such as attractiveness. We find that, while some mate seekers do pursue partners of similar average desirability to themselves, the vast majority of the online dating population we study tend to reach up the hierarchy toward more desirable partners. At the same time, this aspirational mate pursuit is calibrated to one’s own desirability: On average, people pursue partners who are roughly 25% more desirable than they themselves are. In the language of matching and competition introduced at the start of this article, it appears that people are pursuing a hybrid strategy with elements of both—they are aware of their own position in the hierarchy and adjust their behavior accordingly while, at the same time, competing modestly for more desirable mates.
We find that all but the most extreme mate seekers exhibit heterogeneity in their mate pursuit, initiating contact with partners across a range of desirabilities. This suggests that both men and women combine aspirational mate pursuit with less risky prospects. In addition, there appears to be a quality over quantity strategy such that men and women who pursue more desirable partners send fewer messages, each with a higher word count on average. Messaging strategies also become less diversified (in terms of range of desirability gaps) as people reach higher up the desirability ladder.
Our results on aspirational mate pursuit are consistent with the popular concept of dating “leagues,” as reflected in the idea that someone can be “out of your league,” meaning that attractive matches are desirable for but unavailable to less attractive others. Provided that leagues are envisaged as a single continuous hierarchy rather than as distinct strata, our results suggest that, contrary to popular belief, attracting the attention of someone out of one’s league is entirely possible. The chances of receiving a reply from a highly desirable partner may be low, but they remain well above zero, although one will have to work harder, and perhaps also wait longer (9), to make progress. Compared to the extraordinary effort male rats are willing to go through to mate with a desirable female (24), however, messaging two or three times as many potential partners to get a date seems quite a modest investment.
One might wonder how the patterns we observe online might inform our understanding of offline mate pursuit and dating markets. Online dating differs from offline dating in several important ways (25). Because of the high volume of partners and low threshold for sending a message, competition for potential partners’ attention is likely fiercer online than offline. This may increase the extent to which a hierarchy of desirability exists online and reduce people’s willingness to respond to less desirable mates: When there are plenty of fish in the sea, one can afford to throw a few back. It has also been suggested that consensus about what makes an attractive partner is strongest in the early stages of courtship, when partners do not know as much about one another (26, 27). While it is difficult to study early courtship offline—our method requires unrequited overtures, which are hard to observe in offline interactions—these differences suggest that hierarchies of desirability may be more pronounced online than off.
Online dating has grown greatly in popularity in recent years and has become an increasingly common way for people to find romantic partners, edging out more traditional means such as meeting through coworkers or through family. By 2013, the Pew Research Center (28) found that 11% of all American adults, and 38% of those who were currently single and searching for a partner, had used online dating sites or mobile apps. Two-thirds of online daters had gone on a date with someone they met through a site, and almost a quarter (23%) had entered into a marriage or a long-term relationship with someone they met through a site. Thus, online dating now plays a substantial role in the organization of sexual and romantic relationships in the United States—it is currently the third most common way partners meet after meeting through friends or in bars (29).
The data used as the starting point for our study consist of demographics and messaging patterns for active users of a popular online dating site during a 1-month period of observation from 1 to 31 January 2014. The site does not market itself to any particular demographic group and attracts a diverse population of users whose makeup, in most locales, corresponds loosely to that of the general population. The population of users is concentrated in coastal areas, although there are significant numbers of users in major midwestern cities such as Chicago. Upon joining the site, users specify a login handle and enter their age, sexual orientation, relationship status, and a five-digit zip code identifying their location. All but the zip code are visible to other users, while geographic location is publicly listed at the city level. Optionally, users can also give additional demographic information (for example, height, religion, and body type) and answer a set of open-ended essay questions that ask them to describe who they are and what they are looking for. After creating a profile, users can then view the profiles of others, as well as send and receive messages.
In addition to demographics, our data include complete messaging patterns—who sends messages to whom on the site. It is these messages that we used to assess individuals’ desirability. We restricted our analysis to active users, which we define to mean users who sent or received at least one message during the observation period. This eliminates a significant number of users who sign up and use the site but then become inactive or who sign up and never use it. For the purposes of the present study, we also removed from the data all users who identify as gay or bisexual (about 14% of the overall user base of the site) and those who indicate that they are not looking for romantic relationships. (People can indicate, for example, that they are only looking for friendship or activity partners.) Details about the demographic makeup of users in each city are shown in section S2.
We report results for four large metropolitan areas: New York City, Boston, Chicago, and Seattle. One reason for restricting our study to individual cities is to reduce the effects of spatial distance in mate-selection behavior: We chose areas large enough to give good demographic statistics but small enough geographically that distance will not be a significant deterrent to conversation between interested users. In the case of Boston, Chicago, and Seattle, we found a good choice to be the standard core-based statistical areas (CBSAs) established by the Office of Management and Budget. A CBSA is defined to be an urban center of at least 10,000 people plus adjacent areas that are socioeconomically tied to the urban center by commuting. For New York City, the standard CBSA proves too large: The data indicate multiple geographic dating markets within the larger metro area. Instead, therefore, we chose a narrower set of geographic boundaries for New York: the five boroughs of Manhattan, the Bronx, Queens, Brooklyn, and Staten Island. Some descriptive statistics for the user populations in the four cities are reported in table S1. Restricting our study to metropolitan areas inevitably eliminates some messaging activity to and from outlying regions, but the areas chosen here capture a large majority of the messaging activity of the users who live in them.
We constructed a network for each city studied in which the nodes represent users, and connections between nodes—directed edges in network nomenclature—represent the first message sent in the corresponding direction between any two users. That is, there is a directed edge in the direction of the initial contact between two users and, optionally, a second edge in the opposite direction if that initial contact received a reply. Our analyses are based on the largest weakly connected component of the network in each city, although in practice, this restriction has little effect since nearly everyone belongs to the largest component. In the network for New York City, for example, the largest weakly connected component contains 99.8% of all users.
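The construction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in the study: the function names and the message log format (a list of sender and receiver pairs) are hypothetical. Keeping one edge per ordered pair of users mirrors the rule that only the first message in each direction counts.

```python
from collections import defaultdict


def build_edges(messages):
    """Keep one directed edge per ordered (sender, receiver) pair."""
    return set(messages)


def largest_weak_component(edges):
    """Return the node set of the largest weakly connected component."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)  # ignore direction for weak connectivity
    seen, best = set(), set()
    for start in neighbors:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:  # depth-first search from an unvisited node
            node = stack.pop()
            for other in neighbors[node]:
                if other not in component:
                    component.add(other)
                    stack.append(other)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical message log: m1 writes to w1 twice and w1 replies once.
messages = [("m1", "w1"), ("w1", "m1"), ("m1", "w1"), ("m2", "w1"), ("m3", "w2")]
edges = build_edges(messages)
core = largest_weak_component(edges)
```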
Given that our focus here is on who is interested in whom, one approach might be to restrict ourselves to a network with edges representing only the first direction of contact between individuals and excluding any reply. However, a defining feature of heterosexual online dating is that, in the vast majority of cases, it is men who establish the first contact—more than 80% of first messages are from men in our data set. As a result, there is little information about women’s aspirations contained in first messages. On the other hand, women reply very selectively to the messages they receive from men—their average reply rate is less than 20%—so women’s replies (along with the small fraction of first messages sent by women) can give us significant insight about who they are interested in. To create a picture of both men’s and women’s aspirations, therefore, we include both first messages and replies in our network.
A related challenge is how to choose which users should be included in the network. One approach might be to restrict our list of active users to those who sent at least one message during the observation period. However, because, again, men send most messages, this would exclude a large number of women from the sample. To avoid this, we chose to include in our networks all users who either sent or received at least one message during the period of observation.
Desirability rankings. The directed network of initial contacts was used as the starting point for our PageRank-based measure of desirability. In this calculation, network nodes were first numbered, in arbitrary order, from 1 to $n$, where $n$ is the total number of nodes in the network, and then we assigned each node $i$ (that is, each person) a positive desirability score $x_i$. The structure of the network itself is represented by the directed adjacency matrix $A$ having elements $a_{ij} = 1$ if there is a directed edge from node $j$ to node $i$ and zero otherwise. Then, the scores obey the standard PageRank equation (17)

$$x_i = \alpha \sum_{j} \frac{a_{ij}\, x_j}{\sum_{k} a_{kj}} + \frac{1 - \alpha}{n}, \qquad (1)$$

where the denominator $\sum_{k} a_{kj}$ is the number of edges leaving node $j$ (terms with $a_{ij} = 0$ contribute nothing) and $\alpha$ is a parameter whose value we choose. There is no formal theory specifying the best value of this parameter, but the inventors of the PageRank method (17) recommend a value of $\alpha = 0.85$, and we used that value here. (Our results are not particularly sensitive to the value of $\alpha$; calculations with other values lead to qualitatively similar conclusions.) The numerical solution of Eq. 1 is straightforward: One starts with any set of nonnegative values $x_i$, for example, $x_i = 0$, and uses them to evaluate the right-hand side of Eq. 1, giving a new set of values $x_i'$. Then, one substitutes these into the equation again to calculate another new set and repeats the process until the values converge within a desired accuracy. For networks of the size studied here, the calculation takes less than a second on a standard desktop computer.
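The iterative solution described above can be sketched in plain Python. The sketch assumes a common form of the PageRank equation, $x_i = \alpha \sum_j a_{ij} x_j / k_j + (1 - \alpha)/n$, with $k_j$ the number of edges leaving node $j$; that form, the function name, the convergence tolerance, and the treatment of nodes with no outgoing edges are assumptions made for illustration, not details confirmed by the text.

```python
def pagerank(edges, alpha=0.85, tol=1e-10, max_iter=1000):
    """Iterate x_i = alpha * sum_j a_ij * x_j / outdeg(j) + (1 - alpha) / n.

    edges: iterable of (j, i) pairs, each a directed edge from j to i.
    Assumption: nodes with no outgoing edges pass on no score.
    """
    edges = set(edges)
    nodes = {u for edge in edges for u in edge}
    if not nodes:
        return {}
    n = len(nodes)
    out_degree = {u: 0 for u in nodes}
    for j, _ in edges:
        out_degree[j] += 1
    x = {u: 0.0 for u in nodes}  # start from x_i = 0
    for _ in range(max_iter):
        new = {u: (1 - alpha) / n for u in nodes}
        for j, i in edges:
            new[i] += alpha * x[j] / out_degree[j]
        change = max(abs(new[u] - x[u]) for u in nodes)
        x = new
        if change < tol:
            break
    return x


# Hypothetical contacts: m1 -> w1, m2 -> w1, m2 -> w2.
scores = pagerank([("m1", "w1"), ("m2", "w1"), ("m2", "w2")])
```

In this toy network, w1 scores highest because she is contacted by both men, while w2 receives only half of m2's score.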
There is an extensive literature on network measures of social rank. However, only a small handful of studies have used network measures to explore how social rank is associated with mating success (30–32). These studies all use eigenvector centrality, a matrix-based measure similar in some respects to PageRank but designed for use with undirected networks. These studies have focused primarily on small populations (two hunter-gatherer societies, leks of birds, and men and women in a speed dating experiment). Our study uses PageRank scores as a measure of desirability in large-scale online dating populations. Further details about the statistical models used in the analysis, as well as estimated coefficients, can be found in the Supplementary Materials.
Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/8/eaap9815/DC1
Section S1. Background literature
Section S2. Descriptive statistics
Section S3. Supplementary analyses
Table S1. User attributes for four metropolitan areas.
Table S2. Fractional regression of desirability on individual attributes—selected coefficients.
Table S3. Message length by desirability gap.
Table S4. Proportion of positive words in message by desirability gap.
Table S5. Probability of reply by message length, conditional on desirability gap.
Table S6. Probability of reply by percent of positive words, conditional on desirability gap.
Fig. S1. Age distribution of men (blue) and women (red) in each city.
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.
REFERENCES AND NOTES
Acknowledgments: This work was funded in part by the NIH under grant K01-HD-079554 (to E.E.B.) and the NSF under grants DMS-1107796 and DMS-1407207 (to M.E.J.N.). Author contributions: Both authors formulated the research problem, developed the methodological approach, analyzed the data, and wrote the paper. Competing interests: The authors declare that they have no competing interests. Data and materials availability: The data used in this paper are proprietary but will be made available to accredited academic researchers in a secure environment for purposes of replication and verification of our results. The data were protected under Institutional Review Board–approved guidelines for HUM00075042.