When I first published this analysis, I excluded Aesop Rock, figuring he was too obscure. The Reddit hip-hop community was in an uproar, claiming that Aesop would absolutely be #1. Sure enough, Aesop Rock is well above every artist in the dataset, and I was obliged to add him to the chart.
Literary elites love to rep Shakespeare’s vocabulary: across his entire body of work, he uses 28,829 unique words, which suggests that he knew more than 100,000 words and arguably had the largest vocabulary ever.
I decided to compare this data point against the vocabularies of the most famous artists in hip hop, using the first 35,000 words of each artist’s lyrics. This way, prolific artists, such as Jay-Z, can be compared to newer artists, such as Drake.
# of Unique Words Used Within Artist’s First 35,000 Lyrics
(1)(2) For Shakespeare, I used the first 5,000 words from each of 7 works (35,000 words in total): Hamlet, Romeo and Juliet, Othello, Macbeth, As You Like It, The Winter's Tale, and Troilus and Cressida. For Melville, I used the first 35,000 words of Moby Dick.
All lyrics are via Genius.
35,000 words covers 3 to 5 studio albums and EPs. I included mixtapes if the artist was short of the 35,000 words. Quite a few rappers don’t have enough official material to be included (for example, Biggie and Chance the Rapper).
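The selection rule in these notes can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the chart: the function name `first_n_words`, the `(year, lyrics)` tuples, and the choice to append mixtapes only after all official releases are assumptions made for the example.

```python
def first_n_words(official, mixtapes, n=35_000):
    """Return an artist's first n words, or None if there is not enough material.

    `official` and `mixtapes` are lists of (year, lyrics_text) pairs.
    Official releases are read first, oldest to newest; mixtapes are only
    used when the official releases fall short of n words (an assumption
    about ordering that the article does not spell out).
    """
    words = []
    for group in (official, mixtapes):
        for _, lyrics in sorted(group, key=lambda release: release[0]):
            words.extend(lyrics.split())
            if len(words) >= n:
                return words[:n]
    return None  # too little material, so the artist is left out


# Toy example with a tiny word budget instead of 35,000.
albums = [(1997, "c d e"), (1996, "a b")]
tapes = [(1995, "x y z")]
print(first_n_words(albums, tapes, n=4))   # official material is enough
print(first_n_words(albums, tapes, n=7))   # mixtape words fill the gap
print(first_n_words(albums, tapes, n=20))  # not enough material
```

Returning `None` mirrors the exclusion of artists who lack enough material.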
This project was originally published in 2014 and updated in January 2019 with newer lyrics data and 75 additional artists, including Lil Uzi Vert, Lil Yachty, Migos, and 21 Savage. Notably, there’s an overall trend of fewer unique words among newer artists. This is easier to see in the following chart, where I highlighted each artist’s primary decade, based on the release dates of the albums that make up their first 35,000 lyrics.
# of Unique Words Used Within Artist’s First 35,000 lyrics
(1) Since this analysis uses an artist’s first 35,000 lyrics (prioritizing studio albums), an artist’s era is determined by the years the albums were released. Some artists may be identified with a certain era (for example, Jay-Z with the 1990s, with Reasonable Doubt in 1996, In My Lifetime, Vol. 1 in 1997, etc.) yet continue to release music in the present day.
All lyrics are via Genius.
Some of the newer artists wield a comparatively smaller vocabulary, but this is not because hip hop has “dumbed down.” The genre has evolved; it has moved away from complex lyricism toward elements traditionally associated with pop music: repetitive song structure and singing (Jon Caramanica recently wrote about this trend for the New York Times, arguing that it was led by Drake, who popularized the rapping-and-singing formula over the past decade).
A better benchmark for Lil Uzi Vert’s word count (2,556) might be those of pop artists, such as Beyoncé (2,433 words), or even one of his major influences: Marilyn Manson (2,466 words).
There are also genre-bending artists. If Childish Gambino’s Awaken, My Love! is less hip hop in the traditional ’90s boom-bap sense, is it fair to compare it to vocabulary-dense Wu-Tang albums? Genre matters in vocabulary calculations—check out the chart below, which takes 500 random samples of 35,000 words from rock, country, and hip hop.
# of Unique Words Used in 500 Random Samples of 35,000 Lyrics from Country, Rock, Hip Hop
Raw Lyrics Data via John W. Miller
In short, if artists depart from hip-hop song structure, we’d expect their number of unique words to go down.
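For readers who want to try a comparison like the genre chart on their own lyrics data, here is a rough Python sketch of repeated random sampling. The article does not say how the 500 samples were drawn, so this assumes each sample is a set of words picked at random, without replacement, from one genre's pooled lyrics. The function name `sample_unique_counts` and the fixed seed are hypothetical.

```python
import random


def sample_unique_counts(words, sample_size=35_000, n_samples=500, seed=0):
    """Draw n_samples random samples of sample_size words and count unique words.

    `words` is one genre's pooled lyrics as a list of tokens and must hold
    at least sample_size entries. Sampling is without replacement.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    counts = []
    for _ in range(n_samples):
        sample = rng.sample(words, sample_size)
        counts.append(len(set(sample)))
    return counts


# Toy corpus: 4 distinct words repeated 10 times each.
toy = ["a", "b", "c", "d"] * 10
counts = sample_unique_counts(toy, sample_size=20, n_samples=5)
print(counts)
```

Running the function once per genre gives one distribution of unique-word counts per genre, which is what a chart of this kind compares.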
That said, the results are still directionally interesting. Of the 150 artists in the dataset, let’s take a look at who is on top.
For the most recent update, I pored over requests from readers, and Busdriver was most common on folks’ wishlists. He and Aesop Rock are the only rappers with more than 7,000 unique words in their first 35,000 lyrics.
Wu-Tang Clan at #5 is impressive given that 10 members, with vastly different styles, equally contribute lyrics. Add the fact that GZA, Ghostface, Raekwon, and Method Man's solo works are also in the top 20 – notably, GZA is at #4.
Of course E-40 is in the top 20%; he’s considered to be the inventor of many slang terms. Just a few that he’s been responsible for coining or popularizing: “all good,” “pop ya collar,” “shizzle,” and “you feel me.”
Outkast’s expansive vocabulary is definitely a function of their style: frequent use of portmanteaus (for example, “ATLiens,” “Stankonia”), southern drawl (for example, “nahmsayin,” “ery’day”), and made-up slang (for example, “flawsky-wawsky”).
Since both rappers are known for their speed, it’s nice to see that their verses are just as lyrically diverse as their peers’.
So what's all this mean?
io9 writer Robert Gonzalez blew my mind with this point, “On the Black Album track 'Moment of Clarity,' Jay-Z contrasts his lyricism with that of Common and Talib Kweli” (both of whom rank higher than him, when it comes to the diversity of their vocabulary):
I dumbed down for my audience to double my dollars
They criticized me for it, yet they all yell “holla”
If skills sold, truth be told, I’d probably be
Lyrically Talib Kweli
Truthfully I wanna rhyme like Common Sense
But I did 5 mil - I ain’t been rhyming like Common since
I used a research methodology called token analysis to determine each artist’s vocabulary. Each word is counted once, so pimps, pimp, pimping, and pimpin are four unique words. To avoid issues with apostrophes (e.g., pimpin’ vs. pimpin), they’re removed from the dataset. It still isn’t perfect. Hip hop is full of slang that is hard to transcribe (e.g., shorty vs. shawty), compound words (e.g., king shit), featured vocalists, and repetitive choruses.
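The counting rule described above is simple enough to sketch in Python. Treat this as an illustration under stated assumptions rather than the exact code used for the charts: the function names are made up for the example, and lowercasing the text and splitting on non-word characters are my assumptions, since only the apostrophe removal is described.

```python
import re


def tokenize(lyrics):
    """Lowercase the text, drop apostrophes, and split it into word tokens."""
    text = lyrics.lower()  # assumption: case is ignored
    # Remove straight and curly apostrophes so "pimpin'" and "pimpin" match.
    text = re.sub(r"['’‘]", "", text)
    # Every run of word characters counts as one token.
    return re.findall(r"\w+", text)


def unique_word_count(lyrics, limit=35_000):
    """Count distinct tokens among the first `limit` tokens."""
    return len(set(tokenize(lyrics)[:limit]))


sample = "Pimps pimp, pimping and pimpin' ... pimpin"
print(tokenize(sample))
print(unique_word_count(sample))  # pimps, pimp, pimping, and, pimpin
```

No stemming is applied, so each inflected form stays a separate word, and spelling variants such as shorty and shawty are still counted separately.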
Vocabulary count data is available here.