What are the 5 most common consonants in the English language

Photo: Michael McWhertor/Polygon

What’s the best five-letter word to start with when trying to solve the daily Wordle? Some of us have go-to words we rely on to maximize our word-guessing and puzzle-solving skills, and there are some statistically advantageous starting words that feature a variety of commonly used letters. Ideally, you would use a five-letter word with five distinct and commonly used letters on your first guess, like “arise” or “roast.”

Similarly, you should probably avoid starting the daily puzzle with a word like “qapik,” “queue,” or “qajat” (all accepted by Wordle), because they use letters that are less common in English words and, in the case of “queue” and “qajat,” repeat vowels.

According to one analysis, the letter E appears most frequently in English-language words featured in a condensed version of the Oxford Dictionary, followed by A, R, I, O, T, N, and S. So starting words like “ratio,” “irate,” “stain,” or “stare” that include those commonly used letters are great options. There are more English-language words that start with S than any other letter, so a starting word that begins with S is also a good first guess. (If the words mentioned earlier in this paragraph come back all gray, maybe try “lurch” or “cloud” for another set of frequently used distinct letters.)

If you want a variety of vowels, the word “ouija” has all but E (and sometimes Y) and is a good starter word, even though J is one of the least frequently used letters in English words, according to the aforementioned analysis.

For those less interested in pure optimization, here’s how Wordle players at Polygon start the puzzle each day.

FRAME, GRAZE, WINDY, PAINT, GOURD, SWING, VAPES

I aim for a mix of common and uncommon letters in my first guess(es). The more common letters are to see if I can nail down the shape of a word, with those less common ones for process of elimination — just trying to avoid the heartbreak of having four of the five letters set, and a last letter that could be a number of different options. That said, I wouldn’t call myself a big strategy person. It’s more fun, in my opinion, to throw random ideas in there and see if you happen to get some leads. —Nicole Clark

AUDIO, FARTS

The first is to knock out four vowels in one go and the second is because how funny would it be to get that on the first try? Disclaimer: I definitely got that idea from someone on Twitter, but it’s too good. —Petrana Radulovic

ADIEU, OUIJA

I’ve fallen off the Wordle bandwagon over the past week, but my go-to strategy is to start with a five-letter word with as many vowels as possible so as to figure out the general composition of the day’s word. After that, it’s just blind guessing. —Toussaint Egan

READY, PEARS, CHIEF, TOUCH

Big fan of never playing the same word twice. Prefer the chaos. Peace is knowing the goal of Wordle is not to get the correct word in the fewest turns, but to get the job done in the space provided. Wordle is life. That said, if I’ve decided to Wordle before coffee, I’ll default to one of these words that seem to check those Wheel of Fortune “top letter” boxes. —Matt Patches

ARISE

I stole this off someone on Twitter. It has the two most important vowels, and arguably the two most important consonants to identify or rule out. It rarely draws a blank and usually gives you a decent lead on the second line. It appeals to the puzzle-solving, min-maxing side of my brain. But I’m not sure I like playing Wordle this way. It’s strategic but it has no flair; it’s almost cheap. And I miss the thrill of trying to intuit something with a first guess out of the blue. —Oli Welsh

ROAST, TEARS, MEATS, OUIJA, PIZZA

Like several people at Polygon, I revel in the uncertainty of a new word each day. But, I’m also human, which means I do have fallback words for those mornings after I scored in five or six tries the previous day. Also, I stole “roast” from Nicole Carpenter and “ouija” from Toussaint. I’m like a starting-word Robin Hood, if Robin Hood just kept everything he stole, and were nothing like me at all. —Mike Mahardy

POETS, EARLY, STEAM, BOILS, SPOUT, COUNT, WOUND, STEAK

I’m in the same boat as other people here (hey, “boats” isn’t a bad one) — I like the challenge of coming up with a new starter word every day so Wordle doesn’t start to seem like a rote, mechanical task. The point (ooh, “point” is pretty good too) is always to hit at least two vowels and a couple of the more common consonants without repeating any letters. But I prefer the feeling of coming up with a new starting point every day and seeing what lands over trying to whittle down the skeleton key that might fit every lock. —Tasha Robinson

N/A

Wordle has been a balm during a tough time in my life, a fun daily thought exercise where I can attempt to clear my mind and focus on an energizing challenge for a few minutes. (Or, uh, many minutes.) As such, I’ve tried to just take each day’s puzzle as it comes and not think too much about the opening gambit. In other words, I come up with a new starter word every day — whatever floats to the top of my brain at that particular moment.

I mean, sure, you want to follow the basic rules of Wheel of Fortune and Scrabble with respect to letter frequency — i.e., it probably doesn’t make sense to start with a word that has J or Z in it — and it’s a good idea to get at least two vowels in there. But that’s about as deep as I go into Wordle strategy. Any deeper, and it starts to feel like I’m calculating probabilities rather than thinking about the wondrous variety of the English language. And nobody prefers math games to word games. —Samit Sarkar

Photo by Claud Richmond on Unsplash

The game Wordle has won the heart of social media in the past few weeks. Wordle is a word game in which the player tries to guess a 5-letter word in 6 tries, receiving progressively more information about the target word with each guess. The game was created by Josh Wardle, an artist and engineer. Wordle starts when the player submits their first 5-letter word. Every time a word is submitted, feedback is provided on each letter of the submitted word, indicating whether the letter exists in the target word and whether its spot matches that in the target word. Below is a screenshot of the instructions.

Rules of Wordle (screenshot by author)

A Good Strategy

Is there a good strategy to play the game? Before entering the first word, the player has no information about the target, which could be any one of approximately 15,000 5-letter English words. Once the first word is submitted, the player learns something about the letters in the target word, and how much depends on the word entered. Is there a good strategy once the player starts receiving feedback? Perhaps there is one. After feedback on the first word is provided, success depends on many factors, including the player’s vocabulary and how well they can narrow down their next guess based on the feedback. The choice of the first word, however, is independent of the player’s vocabulary or language skills. That is why we can talk about a strategy that would provide the best feedback (one with as much information as possible) after the first word is submitted. A good first word is one that eliminates as many remaining letters as possible. Better yet, it is one that determines as many letters of the target word as possible, with as many of them as possible in the correct spot. In this analysis, I am trying to find a strategy, or rather a word, that serves this purpose.

A Closer Look at the Words

Based on this article on Wikipedia, the Webster’s Third New International Dictionary of the English Language contains 470,000 entries. However, a portion of these words are obsolete or may not fall into the category of valid single words that contain only letters (no numbers or symbols). I found a dataset of such words at this repository on GitHub. The file contains 370,103 English words that are single words and contain only letters. After extracting only the 5-letter words from this list, I was left with a list of 15,918 words. I will explore this list to hopefully gain more insight into a good strategy for the first word entered into Wordle. Although it is perhaps unrelated to this little project, I was curious about the distribution of word counts by number of letters, and the following was the result. The distribution is unimodal, with a peak at words with 9 letters. The 5-letter words constitute just approximately 4.3% of all words in this list.
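The filtering step described above could be sketched roughly as follows. This is a minimal illustration rather than the code actually used for the analysis: the function names and the tiny `sample` list are hypothetical stand-ins, and the real word file would be read from disk instead.

```python
from collections import Counter

def words_by_length(words):
    """Count how many entries there are of each length."""
    return Counter(len(w) for w in words)

def five_letter_words(words):
    """Keep only purely alphabetic entries with exactly five letters, lowercased."""
    return sorted({w.lower() for w in words if len(w) == 5 and w.isalpha()})

# Hypothetical stand-in for the real word list.
sample = ["arise", "Aries", "crwth", "caravanserai", "inn", "serai", "it's!"]

print(five_letter_words(sample))  # "it's!" is dropped because it is not purely alphabetic
print(words_by_length(sample))

# Arithmetic check of the share quoted in the text: 15,918 of 370,103 words.
share = 15918 / 370103 * 100
print(round(share, 1))  # 4.3
```

Lowercasing before filtering is a design choice made here so that capitalized entries such as Aries are treated like any other word.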

Frequency of words (image by author)

Next, I will review two different strategies, the Vowels Strategy and the Frequency Strategy. I will show that the Frequency Strategy is the better of the two, and we will pick the best word based on it.

The Vowels Strategy

Vowels play an important role when trying to come up with a strategy to eliminate large numbers of words each round. This is because at least one vowel exists in each syllable of a word. There are 5 vowels: A, E, I, O and U. Even though the letter Y can act as a vowel in some words, I did not consider it a vowel here. Starting the search with vowels may be a good idea because every single word in English must have at least one vowel (well, this is not 100% true: as we will see a bit later, there are 8 words in the list without any vowels, although this does not bring the merit of the strategy into question).

I started my search through my list of 5-letter words by finding the number of words with one, two, three, four and five unique vowels. For instance, the word asana has only one unique vowel and the word alibi has two. It turns out there are 6223, 8568, 1055, 18 and 0 words with 1, 2, 3, 4 and 5 unique vowels, respectively. For example, the words adieu, auloi (plural of Aulos, an ancient Greek wind instrument), Aequi (an ancient Italian tribe) and uraei (plural of Uraeus, the upright form of an Egyptian cobra) all have 4 unique vowels. Needless to say, there were no 5-letter words that consisted of only vowels.

There were also 46 5-letter words in which the letter Y acted as a vowel, e.g., ghyll (a ravine or narrow valley in the North of England) or Scyld (a legendary Danish king). There were also 8 words without any vowels, such as crwth, which is a type of stringed instrument.
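Counting unique vowels per word can be sketched as below. The helper names are hypothetical, and the six-word `sample` is only there to exercise the function; it says nothing about the counts reported for the full list.

```python
from collections import Counter

VOWELS = set("aeiou")  # Y is deliberately not treated as a vowel, as in the text

def unique_vowels(word):
    """Number of distinct vowels (A, E, I, O, U) that appear in the word."""
    return len(set(word.lower()) & VOWELS)

def vowel_histogram(words):
    """Map each unique-vowel count to the number of words that have it."""
    return Counter(unique_vowels(w) for w in words)

# Toy sample built from words mentioned in the text.
sample = ["asana", "alibi", "adieu", "uraei", "crwth", "ghyll"]

for word in sample:
    print(word, unique_vowels(word))
print(vowel_histogram(sample))
```

Words such as crwth and ghyll land in the zero bucket here, which is where the Y-as-vowel and no-vowel words from the text would be found.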

Considering how important vowels are in the English language, a strategy based on vowels would be to use first words that contain as many unique vowels as possible. This will help us determine the existence or absence of as many vowels as possible in the target word. As mentioned above, there are no 5-letter words that consist of only vowels. However, there are 18 words that consist of 4 unique vowels. These words include: adieu, aequi, aoife, audio, aueto, auloi, aurei, avoue, heiau, kioea, louie, miaou, ouabe, ouija, oukia, ourie, ousia and uraei.

One may argue that any of these 18 words would make a good first try at Wordle. However, let’s see if any of the 5 vowels is more or less frequent than the others in 5-letter words. The following shows how often each of the 5 vowels appears in 5-letter words (each word is counted once per vowel regardless of repeats, i.e., for the letter A, the word asana counts as 1).

Frequency of vowels (image by author)

The graph above shows that the vowel U is the least frequent of the 5 vowels. If we remove every word that contains U from the list of 5-letter words with 4 unique vowels, we are left with just two words, Aoife (an Irish feminine given name) and Kioea (a Hawaiian bird that became extinct in the 19th century). A quick search through the list shows that the consonant K appeared in 1663 5-letter words, whereas the consonant F appeared in 1115. Therefore, this strategy would suggest the word Kioea. It is important to mention that this strategy completely ignores the placement of vowels in the word and only determines their existence or absence in the target word. We will see in the next section how the Frequency Strategy outperforms the Vowels Strategy.
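The selection just described can be reproduced in a few lines. The 18 words and the two consonant counts are taken directly from the text; the helper name `first_consonant` is a hypothetical convenience.

```python
# The 18 five-letter words with four unique vowels, as listed in the text.
four_vowel_words = [
    "adieu", "aequi", "aoife", "audio", "aueto", "auloi", "aurei", "avoue",
    "heiau", "kioea", "louie", "miaou", "ouabe", "ouija", "oukia", "ourie",
    "ousia", "uraei",
]

# Drop every word that contains the least frequent vowel, U.
without_u = [w for w in four_vowel_words if "u" not in w]
print(without_u)  # ['aoife', 'kioea']

# Number of 5-letter words containing each consonant, as quoted in the text.
consonant_word_counts = {"k": 1663, "f": 1115}

def first_consonant(word):
    """Return the first letter of the word that is not A, E, I, O or U."""
    return next(c for c in word if c not in "aeiou")

# Prefer the word whose single consonant appears in more 5-letter words.
best = max(without_u, key=lambda w: consonant_word_counts[first_consonant(w)])
print(best)  # kioea
```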

The Frequency Strategy

The previous strategy focused only on the vowels. This strategy, however, will focus on all of the letters. We will evaluate the most frequently used letters in the alphabet and will also determine the most frequent placement of those letters in 5-letter words. Based on those, we will determine the best words to be entered first into the game.

I found the frequency of occurrence of each letter in the alphabet in the 5-letter words in the dataset and sorted them from largest to smallest. The following graph shows the frequencies.

Frequency of letters (image by author)

In the above graph, each occurrence of a letter in a word was counted as 1. So I decided to look at the average frequency of letters per word to see if it was any different from the above. Looking at the average frequency of letters in 5-letter words, I did not see any difference in the order of letters, sorted from most commonly appearing to least commonly appearing (see below).
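A rough sketch of the two counts follows, assuming that "average frequency per word" means the total count of a letter divided by the number of words (the text does not spell this out). Under that assumption, every count is divided by the same constant, which would explain why the ordering of the letters does not change. The function names and the toy `sample` are hypothetical.

```python
from collections import Counter

def letter_counts(words):
    """Total occurrences of each letter across all words (repeats included)."""
    return Counter("".join(words))

def average_per_word(words):
    """Assumed definition: total occurrences divided by the number of words."""
    n = len(words)
    return {letter: count / n for letter, count in letter_counts(words).items()}

# Toy sample, only to exercise the functions.
sample = ["arise", "asana", "serai"]

totals = letter_counts(sample)
averages = average_per_word(sample)

order_by_total = [letter for letter, _ in totals.most_common()]
order_by_average = sorted(averages, key=averages.get, reverse=True)

print(order_by_total)
print(order_by_total == order_by_average)  # same ranking either way
```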

Average frequency of letters (image by author)

This means the most commonly used letters in 5-letter words (in terms of total frequency as well as average frequency) were A, E, S, O, R, I, L, T, etc. I decided to focus on the top six letters since the average frequency dropped significantly after the sixth letter. There are 96 words that are made up of only these letters (repetition allowed). However, if we agree that the purpose of the first word is to eliminate as many remaining letters (or determine as many letters in the target word) as possible, perhaps we should not allow letters to repeat. Without repetition, the list shrinks to only 12 words. These words are: aesir, aries, arise, arose, ireos, oreas, orias, osier, raise, seora, serai and serio. Which one of these 12 words would be the best first word in Wordle?
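The two filters (only the top six letters, and no repeated letters) might look like this in code. The function names and the short `sample` list are hypothetical; running the same filters over the full 5-letter list is what would produce the 96 and 12 words mentioned above.

```python
TOP_LETTERS = set("aesori")  # the six most frequent letters named in the text

def uses_only(word, letters=TOP_LETTERS):
    """True if every letter of the word is one of the allowed letters."""
    return set(word) <= letters

def no_repeats(word):
    """True if no letter appears more than once in the word."""
    return len(set(word)) == len(word)

# Toy sample: 'eases' and 'areas' repeat letters, 'roast' contains T.
sample = ["arise", "raise", "eases", "roast", "serai", "areas"]

allowed = [w for w in sample if uses_only(w)]
candidates = [w for w in allowed if no_repeats(w)]

print(allowed)     # ['arise', 'raise', 'eases', 'serai', 'areas']
print(candidates)  # ['arise', 'raise', 'serai']
```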

To answer this question, I decided to look at the frequency of appearance of each of the top six letters in each spot of the 5-letter words (first letter, second letter, etc.). The result is shown below.

Frequency of letters in each spot (image by author)

I also calculated the average frequency of the top six letters in 5-letter words to see if it shows any significant difference from the absolute frequencies but it did not turn out to be different. The average frequencies are calculated by dividing the absolute frequencies by the number of 5-letter words, in which that particular letter appears in that particular spot. The average frequency plot is presented below.
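Per-spot counts can be sketched as follows. The normalization in `position_shares` is only one possible reading of the description above and should be treated as an assumption: it divides a letter's count in a spot by that letter's total number of occurrences. All names and the toy `sample` are hypothetical.

```python
from collections import Counter, defaultdict

def position_counts(words):
    """For each letter, count how often it appears in each spot (0 to 4)."""
    counts = defaultdict(Counter)
    for word in words:
        for spot, letter in enumerate(word):
            counts[letter][spot] += 1
    return counts

def position_shares(words):
    """Assumed normalization: count in a spot / total occurrences of the letter."""
    shares = {}
    for letter, by_spot in position_counts(words).items():
        total = sum(by_spot.values())
        shares[letter] = [by_spot[spot] / total for spot in range(5)]
    return shares

# Toy sample, only to exercise the functions.
sample = ["arise", "raise", "serai", "roses"]

counts = position_counts(sample)
shares = position_shares(sample)
print(dict(counts["s"]))  # how often S lands in each spot
print(shares["s"])        # the same counts as shares of all S occurrences
```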

Average frequency of letters in each spot (image by author)

This shows, for example, that the letter S frequently appears in 5-letter words as the fifth letter but almost never appears as the third letter. Based on this, I used a simple scoring system to assign a score to each word, which consists of the sum of the average frequencies of its letters in their spots, based on the above results. This scoring system assumes that the 6 letters are all valued equally and focuses only on frequencies per spot. For example, the score for the word aesir is calculated as approximately 0.1619 + 0.2928 + 0.1162 + 0.2771 + 0.1840 = 1.032, since the average frequency of the letter A in the first spot is 0.1619, the average frequency of the letter E in the second spot is 0.2928, and so on. The table and figure below show the calculated score for all 12 words in the list.
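The scoring rule amounts to a sum of per-spot lookups. The sketch below uses only the five values quoted for aesir; the function name and the dictionary layout (keys of the form `(letter, spot)`) are hypothetical choices.

```python
def word_score(word, spot_freq):
    """Sum the average frequency of each letter in the spot where it appears."""
    return sum(spot_freq[(letter, spot)] for spot, letter in enumerate(word))

# Average frequencies quoted in the text for the letters of 'aesir'
# (spots are 0-indexed here, so spot 0 is the first letter).
aesir_freq = {
    ("a", 0): 0.1619,
    ("e", 1): 0.2928,
    ("s", 2): 0.1162,
    ("i", 3): 0.2771,
    ("r", 4): 0.1840,
}

score = word_score("aesir", aesir_freq)
print(round(score, 3))  # 1.032
```

Scoring the other 11 candidates would need the remaining per-spot frequencies, which are shown only in the figure and are therefore not reproduced here.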

Score of top words (image by author)

Based on this analysis, the word Aries (Latin word for ram) has the highest calculated score. It is shown that if used as the first word entered into Wordle, on average, the word Aries can determine the largest number of letters in the target word.

Aries is the Latin word for ram. Photo by Livin4wheel on Unsplash

Testing

To test the effectiveness of Aries at identifying letters in the target word, I used a random selection of 5000 words from the list of 5-letter words and calculated how many letters, on average, would be indicated when the word Aries is used as the first word on Wordle. I replicated this process 10 times. The following shows that the average number of letters (per word) whose existence in the target word was identified after Aries was used as the first word was between 2.055 and 2.1. Please note that the following result does not separate letters whose spot was correctly identified from those whose spot was not. It simply includes all the letters that were identified in the target word, in other words, all the letters that turn Gold or Green after the word is entered.
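A simulation along these lines might be sketched as follows. It assumes the guess has no repeated letters (true for Aries), so the number of gold-or-green tiles is simply the number of guess letters found anywhere in the target. The function names, the `seed` parameter and the toy word list are hypothetical; the text's setup would use 5000 targets and 10 replications drawn from the full list.

```python
import random

def letters_identified(guess, target):
    """Guess letters present anywhere in the target (gold or green).

    Assumes the guess has no repeated letters.
    """
    return sum(1 for letter in guess if letter in target)

def average_identified(guess, words, sample_size, replications, seed=0):
    """Average letters identified per target, one value per replication."""
    rng = random.Random(seed)
    averages = []
    for _ in range(replications):
        targets = rng.sample(words, min(sample_size, len(words)))
        total = sum(letters_identified(guess, t) for t in targets)
        averages.append(total / len(targets))
    return averages

# Toy word list, only to show the mechanics; results on it are not meaningful.
toy_words = ["serai", "cloud", "point", "steam", "lurch", "ratio"]
runs = average_identified("aries", toy_words, sample_size=4, replications=3)
print(runs)
```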

Result of simulation for average number of letters identified when Aries was used as the first word

I conducted the same analysis for the word Kioea (which was suggested by our Vowels Strategy), and the result was an average of only 1.79 letters identified. This indicates that the Frequency Strategy was better than the Vowels Strategy at identifying letters in the target word.

Next, I calculated the average number of letters (per word), whose actual spot in the target word was correctly identified by the word Aries. This means, not only is the letter identified, but its spot in the target word is also correctly identified. In other words, this is the average number of letters that turn Green after the word is entered. For the simulation I again used 10 replications and 5000 randomly selected words in each replication. The following shows the results for Aries.

Result of simulation for average number of actual spots of letters correctly identified when Aries was used as the first word

I ran the same analysis for all 12 words in the list of top words to see if any of them could beat Aries. As expected, the word Aries showed the highest average number of letters (per target word) whose spots were correctly identified. For this analysis, I again used 10 replications and 5000 randomly selected words in each replication, and reported the average across all 10 replications.
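Counting only the letters that land in the correct spot, and ranking candidate first words by that measure, can be sketched like this. The function names and the five toy targets are hypothetical; on such a small hand-picked list the ranking says nothing about the real results, it only shows the mechanics.

```python
def greens(guess, target):
    """Number of spots where guess and target have the same letter."""
    return sum(g == t for g, t in zip(guess, target))

def mean_greens(guess, targets):
    """Average number of green tiles over a list of target words."""
    return sum(greens(guess, t) for t in targets) / len(targets)

def rank_by_greens(candidates, targets):
    """Sort candidate first words from most to fewest average greens."""
    return sorted(candidates, key=lambda w: mean_greens(w, targets), reverse=True)

# Toy targets, only to exercise the functions.
toy_targets = ["arise", "raise", "cries", "tries", "point"]

print(greens("aries", "arise"))            # 3: A, R and I match in place
print(mean_greens("aries", toy_targets))   # 2.6 on this toy list
print(rank_by_greens(["serai", "aries"], toy_targets))
```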

Result of simulation for average number of actual spots of letters correctly identified for all the words in the top words list
Average number of letter locations correctly identified (image by author)

Based on the results of this study, if used as the first word, the word Aries identifies the existence of approximately 2.07 letters on average and the correct spot of approximately 0.6 letters on average.

Conclusion and Note

A caravanserai. Photo by mostafa meraji on Unsplash

I realized later that, unfortunately, Aries is not a word on Wordle’s list of accepted words, and neither are the next best words on the list, Orias and Serio (based on the word scores identified above). The next best word on the list was serai, which is another word for caravanserai or inn and is indeed on Wordle’s list of accepted words. The origin of the name is Persian and Turkish, with slightly different pronunciations (saray or sarāī, also see caravanserai). In our testing model, serai and Aries identify the same average number of letters in the target word (approximately 2.07 letters on average). However, serai identifies slightly fewer letter spots correctly on average (approximately 0.47 compared to 0.58 for Aries). Below, you see serai used as the first word on the Wordle of January 16, identifying the existence of 3 letters, with the spot of two of them correctly identified.

serai used as first word on Wordle on January 16 (image by author)

In conclusion, I am not sure if the selection of words for Wordle is a completely random process. You may argue that some words may have had some reference to daily global events (see here for a list of past Wordle words in 2022). And after all, it may not be too much fun playing based on an analysis or strategy.

Happy Wordling everyone (although Wordling is probably not on Wordle’s list of accepted words)!