Vocabulary Games

Hi there! Long time no see.

Let’s play a game.

I’m going to give you all the vowels of a word, but none of the consonants. Instead of those, I’m putting empty spaces. The empty spaces are precise—if there’s only one space, there’s only one missing consonant. If two, then two. Then you’re going to guess which word I started with.

Here’s an example:

_ _ e _ e

There. What do you think it is?

Oops. I’ve already given it away. It’s the first word I used after showing you the puzzle. That’s the word I intended to be the solution, at least.

But you probably realized that a lot of other words could’ve worked too. You could’ve answered “where,” “scene,” “theme,” “these,” “crepe,” “abele,” or “prese.” All of those fit the vowel scheme I wrote down (some more possible answers here).

As a side note, “niece” or “sieve” would not have worked, since I would’ve had to show you the “i.” The link I just gave you also includes some of these false positives.

Let’s try a more difficult and interesting vowel scheme, which only has one common solution (a few, technically, but they all share the same root).

_ eio _ i _ e _

Hope you like chemistry (all the answers are at the bottom, if you want to check).

There are some interesting properties to this game.

First, the amount of possible solutions to a given vowel scheme is pretty unpredictable. It follows the obvious pattern of more common vowels usually giving more possible combinations, but their placement matters too.

As a general rule, the simpler the scheme and the less specification, the more words can fit it, up to a point. Vowel schemes that include common combos like

o _ _ e (-orne, -ople, -ophe, -orse)

a _ e (-ane, -ace, -ale)

_ io _ (-tion, -cion, -sion)

also tend to have higher word counts.

In fact, one of the best vowel schemes I found for maximizing possible words is (note it includes a super common a _ e ):

_ a _ e _

Imagine capturing all the countless verbs that would fit the first four letters of that scheme and then effectively tripling that number (e.g. baked, bakes, baker). Then add all the other possibilities.

In cryptographic terms, every empty space adds about 20 more degrees of entropy (since y is usually used as a vowel). This isn’t quite a code, though, so the comparison isn’t great. Vowel scheme solutions always have to be actual words.

Increasing empty space is a good idea to increase the amount of combinations, but again, only up to a point. Few words have three consonants in a row unless the structure is designed to allow for them (coincidentally, the word “three” is part of one such structure) and even fewer have four in a row. Also, multi-letter combos generally have to follow a set of structures which, depending on the word, might end up giving less possibilities than just a single letter (e.g. “tr”, “ch”, “qu”, etc. for two letters).

So changing word length in general is unpredictable, unless you’re at an extreme low or high. Take, for example:

_ o

which can clearly only ever have up to 20 or 21 solutions for all the consonants and possibly ‘y’.

On the other extreme end, you have:

_ e _ i _ e _ i _ e _ i _ ua _ e _

_ _ o _ _ i _ a u _ i _ i _ i _ i _ i _ i _ i _ a _ i o _

Which are so long and convoluted that even without having any idea of the actual word, you can see they should clearly define only one solution (this time I’m sure of it).

But (and you guessed it) there are still exceptions. Some oddly long and specific designations can actually allow for way more words than you might expect. Take, for example:

_ u _ _ i _ a _ io _

How many solutions can you find? Once you get one, the others follow a similar sort of pattern, and you’ll start to see why it supports so many words relative to other vowel schemes of its length.

I’m thinking that even a machine learning/natural language processing solution would have trouble predicting the amount of combinations a given vowel scheme will have. The structure of words feels too unpredictable and organic. I could totally be wrong and I still want to try, but that’s for another day.

Similar Words

The title of this post is vocabulary games. That’s plural. I’ve only got two, but I saved the best for last:

Try to find a word where simply switching one letter drastically changes the meaning. Bonus points for using longer words.

This doesn’t have that many interesting properties (granted, it’s not even really a game), but it can be pretty funny.

Attaching and attacking.

Altercation and alternation.

Clinginess and cringiness.

Heroes and herpes.

Morphine and morphing.

Artistic and autistic.

Revenge and revenue.

There’s a lot of these in English. Find your own.

OR you can write a program to find every pair of English words that are just a single letter apart. I did this, actually.

About a year ago, a friend of mine came up with this “game” and I wanted to take it to its logical end. It took a lot of lines of Python code and a long time to run. Recently, I revisited the project and tried to improve on it with all the programming knowledge I’ve gained over that year:

First, just for bragging rights, I can now do this in one line.

match_dict = {'length_%s_matches'%str(length):[comb for comb in itertools.combinations([w for w in [line.rstrip('\n') for line in open("words.txt", encoding="utf8")] if len(w) == length],2) if len(comb[0])-len([l1 for l1, l2 in zip(*comb) if l1==l2])==1] for length in [7,8,9,10,11,12,13,14,15,16,17,18,19,20]}

This is not a readable, editable, or in any sense advisable way to write code. But when I started shortening it, I immediately wanted know know if this was possible. There you go. All the words would be saved into “match_dict” with the keys being “length_[7,8,9,etc..]_matches”.

Here’s a better method that has readable code:

words = [line.rstrip('\n') for line in open("words.txt", encoding="utf8")] #Removes the line dilineator \n while formatting it into a list called 'lines'
accepted_lengths = [7,8,9,10,11,12,13,14,15,16,17,18,19,20]

def match_finder(array):
    return [comb for comb in itertools.combinations(array,2) if len(comb[0])-len([l1 for l1, l2 in zip(*comb) if l1==l2])==1]

length_dict = {"length_%s_list"%str(length):[w for w in words if len(w) == length] for length in accepted_lengths}
match_dict = {'length_%s_matches'%str(length):match_finder(length_dict['length_%s_list'%str(length)]) for length in accepted_lengths}

And here’s one way to format it into a single file:

with open('Similar Words.txt','w') as similarwords:
    for length in accepted_lengths:
        similarwords.write('### Similar Words of Length %s ###\n\n'%length)
        for pair in match_dict['length_%s_matches'%length]:
            similarwords.write("%s and %s\n" %(pair[0].capitalize(),pair[1].capitalize()))
        similarwords.write('\n\n\n')

If you want to run it yourself, you’re also going to need a list complete with all 400000-odd English words. You can find one online pretty easily, but I got you covered.

Here are the results if you just want to look at those. There’s too much to sort through by myself, so have a look and let me know if you find anything good that I missed.

That’s all my games for now. Happy word-ing.

Answers

Deionized, deionizes, deionizer (Bonus solution: Meionites).
Hemidemisemiquaver (Semidemisemiquaver is an alternate spelling, but I don’t count it as unique).
Floccinaucinihilipilification (Fun fact: this has the most “consonant + i groups” in a row of any word).
Duplication, culmination, publication, lubrication, sublimation, etc.