
WoRdle: Solve Wordle with R!

[This article was first published on R-Bloggers – Learning Machines, and kindly contributed to R-bloggers.]


Wordle is a daily word puzzle that’s taken the internet by storm: if you want some help solving the viral online game, even in hard mode and with any word list (including future ones), read on!

There’s a new five-letter word every day that you get six guesses to get right. Letters will turn yellow if you guess the right letter but in the wrong position, or green for the right letter in the right position. Black (dark grey) means the letter isn’t in the word at all.

I am not a native speaker of English, so I ignored Wordle at the beginning. Yet I saw those Wordle emojis popping up everywhere, and when the New York Times bought it and I was on holiday with some time to spare, I decided to give it a try. I was hooked instantly!

Then I thought that I could use a little assistance and tried to find some helpful tools. I didn’t really find what I was looking for and after a while, I decided to accept the challenge and program my own little helper tool – of course in R – which I want to share with you here.

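The idea is basically a naive brute-force approach: build every five-letter combination that is still compatible with the colour information, keep only the combinations that contain all the yellow letters, and run the survivors through a spell checker so that only real words remain.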

The syntax template of the main function is as follows:

black <- c("")
wordle(one = exclude(c("")),
       two = exclude(c("")),
       three = exclude(c("")),
       four = exclude(c("")),
       five = exclude(c("")),
       include = c("")
       )

Put the black letters (which should be excluded from the search) into the black vector. Put each yellow letter into the helper function exclude() at the position where it appeared (one, two, three, four, or five) and also into include at the end of the function call. If the same letter appears in both yellow and black, don’t put it into black; put it only into the exclude() call of the respective positions and additionally into include! Put the green letters directly into their positions as plain strings, without wrapping them in exclude() and without adding them to include. All letters must be upper case, because exclude() works on R’s built-in LETTERS vector.
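To make these rules concrete, here is a minimal sketch that turns one guess and its colour pattern into the pieces needed for the template. Note that read_feedback() and its colour encoding ("g", "y", "b") are hypothetical and not part of the tool presented in this post; the sketch only mirrors the rules as described above.

```r
# Hypothetical helper (not part of the original tool): translate one guess and
# its colours into the inputs of the template.
# colours: one character per position, "g" = green, "y" = yellow, "b" = black
read_feedback <- function(guess, colours) {
  g   <- strsplit(toupper(guess), "")[[1]]
  col <- strsplit(tolower(colours), "")[[1]]
  stopifnot(length(g) == 5, length(col) == 5, all(col %in% c("g", "y", "b")))
  yellow <- unique(g[col == "y"])
  # A black letter goes into the global vector unless it is also yellow
  black <- setdiff(unique(g[col == "b"]), yellow)
  # Per position: the green letter itself (NA if the position is not green) ...
  green <- ifelse(col == "g", g, NA_character_)
  # ... and the letter to pass to exclude() at that position ("" if none)
  local <- ifelse(col != "g" & g %in% yellow, g, "")
  list(black = black, green = green, local = local, include = yellow)
}

# "A" and "S" yellow, "R" and "I" black, "E" green
str(read_feedback("ARISE", "ybbyg"))
# Second "A" black while the first "A" is green: "A" ends up in black
str(read_feedback("MAMBA", "bgbbb"))
```

The black letters of several guesses still have to be accumulated by hand, exactly as in the examples further down.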

Before we come to a few examples to illustrate the above, here is the code. For spell checking we use the hunspell package (on CRAN):

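library(hunspell)

# Letters known to be absent from the word; set globally before each call
black <- ""
# Letters still possible at a position: all upper-case letters except the
# position-specific ones in `local` and the black ones in `global`
exclude <- function(local, global = black) {
  LETTERS[!LETTERS %in% c(local, global)]
}
wordle <- function(one, two, three, four, five, include = "") {
  # Build every combination of the candidate letters for the five positions
  words <- apply(expand.grid(one, two, three, four, five), 1, \(x) paste(x, collapse = ""))
  # Keep the words that contain every letter in `include` (one lookahead per letter)
  words <- words[grepl(paste0("(?=.*", include, ")", collapse = ""), words, perl = TRUE)]
  # Keep only the words the spell checker accepts
  words[hunspell_check(words)]
}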
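The include filter is the least obvious line of the function, so here is a small sketch of what it does, using base R only. The example words are chosen purely for illustration: each letter in include becomes its own lookahead, so a word passes only if it contains all of them, in any order.

```r
# Same construction as inside wordle(): one lookahead per required letter
include <- c("A", "S")
pattern <- paste0("(?=.*", include, ")", collapse = "")
pattern
# [1] "(?=.*A)(?=.*S)"

# Illustrative words: only those containing both "A" and "S" pass
words <- c("SPACE", "ARISE", "STONE", "CABAL")
grepl(pattern, words, perl = TRUE)
# [1]  TRUE  TRUE FALSE FALSE

# With the default include = "" the pattern matches every word
grepl(paste0("(?=.*", "", ")", collapse = ""), words, perl = TRUE)
# [1] TRUE TRUE TRUE TRUE
```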

Now for some examples:

We got lucky that the first two letters are green already. The last three letters are black and therefore not included in the word we are looking for. This is how we parametrize the wordle function:

black <- c("I", "S", "E")
wordle(one = "A", 
       two = "R",
       three = exclude(""),
       four = exclude(""),
       five = exclude(""),
       include = c("")
       )
##  [1] "ARUBA" "AROMA" "ARYAN" "ARGON" "ARRON" "ARBOR" "ARDOR" "ARMOR" "ARGOT"
## [10] "ARROW" "ARRAY" "ARABY"

Because the first word is an island (I had to look it up too) we go with the second word:

Wow, we solved this one in only two guesses!

The next one is considerably harder:

Only the letter “A” is somewhere in the word, but not in the first position; all other letters are not in the word at all (= black). This one runs a little longer because of the many possible letter combinations:

black <- c("R", "I", "S", "E")
wordle(one = exclude(c("A")),
       two = exclude(c("")),
       three = exclude(c("")),
       four = exclude(c("")),
       five = exclude(c("")),
       include = c("A")
       )
##   [1] "KAABA" "MAMBA" "ZOMBA" "YUCCA" "NAZCA" "GOLDA" "PANDA" "WANDA" "FONDA"
##  [10] "HONDA" "VONDA" "LYNDA" "GOUDA" "MAZDA" "HOFFA" "VOLGA" "MANGA" "CONGA"
##  [19] "TONGA" "OMAHA" "DACHA" "MOCHA" "BOTHA" "GANJA" "DHAKA" "VODKA" "KAFKA"
##  [28] "HAKKA" "PUKKA" "POLKA" "LANKA" "KOALA" "TABLA" "CALLA" "PAULA" "UVULA"
##  [37] "KAYLA" "LAYLA" "PHYLA" "OBAMA" "LLAMA" "MAGMA" "DOGMA" "GAMMA" "MAMMA"
##  [46] "COMMA" "GHANA" "JUANA" "HANNA" "JANNA" "MANNA" "WANNA" "DONNA" "GONNA"
##  [55] "POONA" "PATNA" "FAUNA" "COCOA" "OCHOA" "TAMPA" "KAPPA" "ZAPPA" "POPPA"
##  [64] "CUPPA" "NAFTA" "MALTA" "YALTA" "VOLTA" "MANTA" "JUNTA" "QUOTA" "GUPTA"
##  [73] "GOTTA" "OUTTA" "GUAVA" "VULVA" "FATWA" "TANYA" "TONYA" "PLAZA" "PANZA"
##  [82] "VOCAB" "NABOB" "JACOB" "HAVOC" "NOMAD" "CANAD" "GONAD" "MONAD" "BLAND"
##  [91] "GLAND" "CHAFF" "QUAFF" "GULAG" "CHANG" "CLANG" "HUANG" "TWANG" "MAGOG"
## [100] "JUDAH" "JONAH" "FATAH" "COACH" "POACH" "BATCH" "CATCH" "HATCH" "LATCH"
## [109] "MATCH" "NATCH" "PATCH" "WATCH" "LAUGH" "WAUGH" "THANH" "BAATH" "PLATH"
## [118] "LOATH" "KODAK" "KOJAK" "NANAK" "CLOAK" "KAYAK" "MUZAK" "WHACK" "BLACK"
## [127] "CLACK" "FLACK" "KNACK" "QUACK" "CHALK" "CAULK" "THANK" "BLANK" "CLANK"
## [136] "FLANK" "PLANK" "KAPOK" "CABAL" "JUBAL" "TUBAL" "FOCAL" "LOCAL" "VOCAL"
## [145] "DUCAL" "MODAL" "NODAL" "OFFAL" "FUGAL" "HALAL" "JAMAL" "BANAL" "CANAL"
## [154] "TONAL" "ZONAL" "PAPAL" "PUPAL" "FATAL" "NATAL" "OCTAL" "TOTAL" "LAVAL"
## [163] "NAVAL" "LOYAL" "UDALL" "OCAML" "KABUL" "OCCAM" "MADAM" "QUALM" "NAHUM"
## [172] "OAKUM" "DATUM" "TATUM" "LABAN" "CUBAN" "PAGAN" "HOGAN" "LOGAN" "COHAN"
## [181] "WUHAN" "GOLAN" "NOLAN" "DYLAN" "HAMAN" "UNMAN" "WOMAN" "HUMAN" "LYMAN"
## [190] "CONAN" "HUNAN" "JAPAN" "WOTAN" "DAYAN" "MAYAN" "KAZAN" "HAYDN" "JOANN"
## [199] "LUANN" "GABON" "BACON" "MACON" "WAGON" "HALON" "TALON" "DAMON" "CANON"
## [208] "CAPON" "BATON" "TAXON" "CAJUN" "GATUN" "CACAO" "MACAO" "DAVAO" "MAMBO"
## [217] "WALDO" "MANGO" "TANGO" "MACHO" "NACHO" "BANJO" "WACKO" "PABLO" "GALLO"
## [226] "LLANO" "PLANO" "GUANO" "TABOO" "MAGOO" "YAHOO" "KAZOO" "WAZOO" "PLATO"
## [235] "CANTO" "PANTO" "GLAXO" "MATZO" "UNCAP" "NAACP" "CHAMP" "CLAMP" "KNAPP"
## [244] "LAYUP" "DUCAT" "BLOAT" "FLOAT" "GLOAT" "YACHT" "FAULT" "VAULT" "CHANT"
## [253] "THANT" "PLANT" "DAUNT" "GAUNT" "HAUNT" "JAUNT" "TAUNT" "VAUNT" "CABOT"
## [262] "JABOT" "FAGOT" "FLATT" "WYATT" "YAKUT" "GAMUT" "KAPUT" "BAYOU" "BANTU"
## [271] "CANTU" "MACAW" "BYLAW" "YALOW" "GAMOW" "CALYX" "TODAY" "MCKAY" "TOKAY"
## [280] "MALAY" "GAMAY" "COPAY" "DOUAY" "NOWAY" "BYWAY" "CABBY" "GABBY" "TABBY"
## [289] "BACCY" "FANCY" "NANCY" "TOADY" "DADDY" "FADDY" "PADDY" "BALDY" "BANDY"
## [298] "CANDY" "DANDY" "HANDY" "MANDY" "GAUDY" "BAWDY" "DAFFY" "TAFFY" "BAGGY"
## [307] "MANGY" "TANGY" "CATHY" "KATHY" "FLAKY" "QUAKY" "JACKY" "TACKY" "WACKY"
## [316] "BALKY" "TALKY" "LANKY" "MANKY" "GAWKY" "BADLY" "MADLY" "BALLY" "DALLY"
## [325] "PALLY" "TALLY" "WALLY" "MANLY" "WANLY" "HAPLY" "LAXLY" "FOAMY" "LOAMY"
## [334] "BALMY" "PALMY" "GAMMY" "HAMMY" "JAMMY" "MAMMY" "TAMMY" "CANNY" "DANNY"
## [343] "FANNY" "LANNY" "NANNY" "TAWNY" "CAMPY" "HAPPY" "NAPPY" "PAPPY" "ZAPPY"
## [352] "PLATY" "MALTY" "BATTY" "CATTY" "FATTY" "NATTY" "PATTY" "TATTY" "NAVVY"
## [361] "GAUZY" "JAZZY" "TOPAZ" "BLATZ" "WALTZ" "VADUZ" "PZAZZ"
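To get a feeling for why this call takes longer than the first one, the number of letter combinations can be counted before any string is built. The sketch below re-declares black and exclude() so that it runs on its own; the names positions, grid, row_wise and vectorised are only for illustration. The do.call(paste0, ...) variant at the end is an alternative that is assumed to be faster but has not been benchmarked here; it is not the code used in this post.

```r
# Re-declared from the post so that the sketch is self-contained
black <- c("R", "I", "S", "E")
exclude <- function(local, global = black) {
  LETTERS[!LETTERS %in% c(local, global)]
}

# Candidate letters per position for the call above
positions <- list(exclude("A"), exclude(""), exclude(""), exclude(""), exclude(""))
lengths(positions)        # 21 22 22 22 22
prod(lengths(positions))  # 4919376 strings are built and spell-checked

# Alternative (assumption: faster, not benchmarked): paste the columns of the
# grid in one vectorised call instead of row by row with apply()
grid <- expand.grid("C", "A", exclude(""), c("L", "N"), c("K", "X"),
                    stringsAsFactors = FALSE)
row_wise   <- unname(apply(grid, 1, paste, collapse = ""))
vectorised <- do.call(paste0, grid)
identical(row_wise, vectorised)
```

Every green letter shrinks one factor of the product to 1, which is why the later calls in this example return almost instantly.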

We try our luck with “MAMBA”:

The letters “M” and “B” are not in the word. The letter “A” is green in the second position, but the second “A” of our guess is black, so the word contains no further “A”. Because the green “A” is passed directly to position two, we can safely add “A” to the black vector, too:

black <- c("R", "I", "S", "E", "M", "B", "A")
wordle(one = exclude(c("")),
       two = "A",
       three = exclude(c("")),
       four = exclude(c("")),
       five = exclude(c("")),
       include = c("")
       )
##  [1] "HAVOC" "CATCH" "HATCH" "LATCH" "NATCH" "PATCH" "WATCH" "LAUGH" "WAUGH"
## [10] "CAULK" "KAPOK" "HAYDN" "WAGON" "HALON" "TALON" "CANON" "CAPON" "TAXON"
## [19] "CAJUN" "GATUN" "WALDO" "TANGO" "NACHO" "WACKO" "GALLO" "YAHOO" "KAZOO"
## [28] "WAZOO" "CANTO" "PANTO" "LAYUP" "YACHT" "FAULT" "VAULT" "DAUNT" "GAUNT"
## [37] "HAUNT" "JAUNT" "TAUNT" "VAUNT" "FAGOT" "YAKUT" "KAPUT" "CANTU" "YALOW"
## [46] "CALYX" "FANCY" "NANCY" "DADDY" "FADDY" "PADDY" "CANDY" "DANDY" "HANDY"
## [55] "GAUDY" "DAFFY" "TAFFY" "TANGY" "CATHY" "KATHY" "JACKY" "TACKY" "WACKY"
## [64] "TALKY" "LANKY" "GAWKY" "DALLY" "PALLY" "TALLY" "WALLY" "WANLY" "HAPLY"
## [73] "LAXLY" "CANNY" "DANNY" "FANNY" "LANNY" "NANNY" "TAWNY" "HAPPY" "NAPPY"
## [82] "PAPPY" "ZAPPY" "CATTY" "FATTY" "NATTY" "PATTY" "TATTY" "NAVVY" "GAUZY"
## [91] "JAZZY" "WALTZ" "VADUZ"

We take “CATCH” from the word list:

The new information is incorporated in the function as follows:

black <- c("R", "I", "S", "E", "M", "B", "A", "T", "C", "H")
wordle(one = "C",
       two = "A",
       three = exclude(c("")),
       four = exclude(c("")),
       five = exclude(c("")),
       include = c("")
       )
## [1] "CAULK" "CANON" "CAPON" "CAJUN" "CALYX" "CANDY" "CANNY"

The list of possible words has shrunk considerably. Because I had never before heard the word “CAULK”, I took the second word, “CANON”:

We can now exclude two more letters, “N” and “O”:

black <- c("R", "I", "S", "E", "M", "B", "A", "T", "C", "H", "N", "O")
wordle(one = "C",
       two = "A",
       three = exclude(c("")),
       four = exclude(c("")),
       five = exclude(c("")),
       include = c("")
       )
## [1] "CAULK" "CALYX"

Interesting… I had to google both words: “CAULK” means “to stop up and make tight against leakage” and “CALYX” is “the usually green outer whorl of a flower consisting of separate or fused sepals”. Uff, I tried the first one and got lucky:

On that day Twitter exploded because people were really angry with the New York Times. Even many native speakers had never heard of this word. You cannot imagine how proud I was to have mastered this challenge with my new little tool!

Now, for one last example:

The way to incorporate the information should be clear by now:

black <- c("R", "I")
wordle(one = exclude(c("A")),
       two = exclude(c("")),
       three = exclude(c("")),
       four = exclude(c("S")),
       five = "E",
       include = c("A", "S")
       )
##  [1] "SPACE" "SAUCE" "SHADE" "SPADE" "OSAGE" "USAGE" "STAGE" "SHAKE" "SLAKE"
## [10] "SNAKE" "SPAKE" "STAKE" "SCALE" "SHALE" "STALE" "SABLE" "SHAME" "SHANE"
## [19] "SHAPE" "SKATE" "SLATE" "SPATE" "STATE" "BASTE" "CASTE" "HASTE" "PASTE"
## [28] "TASTE" "WASTE" "SAUTE" "SHAVE" "SLAVE" "SOAVE" "STAVE" "SUAVE" "SALVE"

We go with the first word from the list…

…and add the info given:

black <- c("R", "I", "P", "C")
wordle(one = "S",
       two = exclude(c("")),
       three = "A",
       four = exclude(c("S")),
       five = "E",
       include = c("")
)
##  [1] "SHADE" "STAGE" "SHAKE" "SLAKE" "SNAKE" "STAKE" "SHALE" "STALE" "SHAME"
## [10] "SHANE" "SKATE" "SLATE" "STATE" "SHAVE" "SLAVE" "SOAVE" "STAVE" "SUAVE"

We again try the first word…

…and put the info into our little function:

black <- c("R", "I", "P", "C", "D")
wordle(one = "S",
       two = "H",
       three = "A",
       four = exclude(c("S")),
       five = "E",
       include = c("")
)
## [1] "SHAKE" "SHALE" "SHAME" "SHANE" "SHAVE"

If we are playing in hard mode, we are in dangerous territory here because we have to use all the information given so far but there are more possibilities than there are guesses left! Ok, let’s give it a go, again with the first word:

Fortunately, we guessed it right away. In easy mode, you can use different words that don’t use all the information given to home in on the solution, as my colleague Professor Timothy Gowers did here:

Timothy is a native speaker and Fields medallist, and he needed 5 guesses in easy mode. Being neither a native speaker nor a maths genius, I needed only 4 guesses with my little tool (and a little bit of luck), even in hard mode!
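The easy-mode idea of homing in on the solution with a word that ignores the known information can be sketched as follows. This is only an illustration under simplifying assumptions: signature() and worst_case() are hypothetical helpers, the feedback is reduced to "which letters of the probe occur in the answer at all" (positions are ignored), and the probe word "KNELL" is just an example.

```r
# The five candidates left after the third guess
candidates <- c("SHAKE", "SHALE", "SHAME", "SHANE", "SHAVE")

# Simplified feedback (assumption): the probe's letters that occur in the answer
signature <- function(probe, answer) {
  p <- unique(strsplit(probe, "")[[1]])
  paste(p[p %in% strsplit(answer, "")[[1]]], collapse = "")
}

# Largest group of candidates that a probe cannot tell apart
worst_case <- function(probe, candidates) {
  sigs <- vapply(candidates, \(a) signature(probe, a), character(1))
  max(table(sigs))
}

worst_case("SHAKE", candidates)  # 4: a wrong hard-mode guess rules out only itself
worst_case("KNELL", candidates)  # 2: tests K, N and L at once
```

A probe that covers several of the undecided letters at once leaves fewer candidates in the worst case, which is exactly the option that hard mode takes away.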

I hope that you have fun with the tool – and many wins! Please share your experience with it in the comments and also your ideas on how to improve it!
