Note: I list some offensive words in this post.
I’ve been playing a lot of the new Crossplay Scrabble-style word game from The New York Times, mostly against my sister (and mostly unsuccessfully).
Unlike Scrabble, Crossplay automatically checks word validity on plays, so the game doesn’t feature bluffing or challenges. That got the word nerd in me wondering about its dictionary and playable word list. I did some sleuthing.
These Are Not the Dictionaries You’re Looking For
Before diving into the words allowed in the Crossplay game, understand that the most serious Scrabble players do not use ordinary dictionaries as the authority for playable words.
Tournament-level Scrabble has a lexicon that is large, obscure, sometimes profane, and the product of occasionally awkward negotiation among the game’s manufacturer, dictionary publishers, and the competitive Scrabble community.
Casual players, on the other hand, turn to standard references such as Merriam-Webster or, for the especially dedicated, The Official Scrabble Players Dictionary. These general-use dictionaries don’t list every game-legal word form, and the common abridged editions omit a large number of obscure or controversial, but playable, words.
Tournament play uses something different: comprehensive, curated word lists derived from dictionaries but authoritative for competitive gameplay. These lists enumerate every legal word form individually, including pluralizations and inflections that ordinary dictionaries treat as grammatical variants rather than separate headwords.
In the U.S., the primary tournament standard is the NASPA Word List, maintained by the North American Scrabble Players Association (NASPA). The current 2023 edition is commonly referred to as NWL2023.
In most English-speaking regions other than North America, competitive Scrabble uses a different lexicon known as Collins Scrabble Words (CSW), derived primarily from Collins dictionaries. The Collins lists are larger and somewhat more permissive than the NASPA lists.
As a result, a word that is legal in one tournament may not be playable in another. Like so many norms, “dictionary authority” in Scrabble is a negotiated concept rather than a simple appeal to Webster or Oxford.
196,419 Words and Counting
Casual word games often sanitize their allowable vocabulary. Many ship with word lists that are thousands of words smaller than standard tournament lexicons.
When it comes to word choices, NYT Games has another very particular layer of consideration. The paper’s games editors express a distinctive voice across its platform: crosswords, Spelling Bee, Wordle, etc. I’d assumed the Crossplay dictionary similarly would show evidence of an opinionated editorial hand: an emphasis on general knowledge, maybe the addition of some NYT jargon or tics, and almost certainly an aversion to working blue.
The Crossplay app explicitly credits NASPA as a source of its word list. I did some forensics (fancy word, but worth only 14 points in the game) on the Crossplay dictionary and compared it against NWL2023. I was surprised that the two lists turn out to be nearly identical. Specifically, the Crossplay word list contains 196,419 entries. NWL2023 contains 196,601.
That means the Times removed precisely 182 words from NASPA’s list—a difference of less than one tenth of one percent. Crossplay’s word list is almost exactly the same as the one used in North American tournament Scrabble.
So, what do those removed words have in common, and what does the Crossplay database suggest about the game’s design? It’s more interesting than the number of entries alone suggests. Or, at least interesting to a nerd like me.
Examining the Crossplay Dictionary
The Crossplay word list is stored internally in a SQLite database inside the app’s software. To look at it, I extracted the file from a copy of the game’s iOS .ipa bundle saved from my iPhone.
The table is structured such that each row represents a single playable word and includes a number of associated metadata fields. Exporting the table’s entries produces a dataset with 196,419 rows and 9 columns.
An excerpt of about a dozen words in the A alpha-sort range illustrates the format:
| word | isOffensive | isPlayableByComputer | definition | source | partOfSpeech | frequency | pronunciation | register |
|---|---|---|---|---|---|---|---|---|
| aals | 0 | 1 | an East Indian shrub | 0 | | 0.0 | | |
| aardvark | 0 | 1 | a nocturnal burrowing… | 1 | noun | 0.05 | ˈɑrdˌvɑrk | |
| aardvarks | 0 | 1 | a nocturnal burrowing… | 1 | noun | 0.009 | ˈɑrdˌvɑrks | |
| aardwolf | 0 | 1 | a nocturnal black-striped… | 1 | noun | 0.009 | ˈɑrdˌwʊlf | |
| aardwolves | 0 | 1 | a nocturnal black-striped… | 1 | noun | 0.008 | ˈɑrdˌwʊlvz | |
| aargh | 0 | 1 | used as expression of… | 1 | interjection | 0.001 | ɑr(ɡ) | |
| aarrgh | 0 | 1 | aargh | 0 | | 0.0 | | |
| aarrghh | 0 | 1 | aargh | 0 | | 0.0 | | |
| aas | 0 | 1 | rough, cindery lava | 0 | | 0.0 | ˌeɪˈeɪz | |
| aasvogel | 0 | 1 | a South African vulture | 0 | | 0.0 | | |
| aasvogels | 0 | 1 | a South African vulture | 0 | | 0.0 | | |
| ab | 0 | 1 | the abdominal muscles | 1 | noun | 1.646 | æb | |
| ars | 0 | 1 | the letter R | 0 | | 0.0 | | |
| arse | 1 | 0 | an offensive word | 0 | | 0.0 | | vulgar |
(Note: I trimmed content in the definition field for the sake of space.)
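For anyone who wants to check the shape of a table like this, here is a minimal Python sketch using the standard-library sqlite3 module. The table name `words`, the column types, and the in-memory stand-in database are my assumptions for illustration; the real file and table names inside the app may differ.

```python
import sqlite3

# Hypothetical stand-in for the extracted database. The table name "words"
# and the column types are assumptions; only the column names come from
# the excerpt above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE words (
        word TEXT PRIMARY KEY,
        isOffensive INTEGER,
        isPlayableByComputer INTEGER,
        definition TEXT,
        source INTEGER,
        partOfSpeech TEXT,
        frequency REAL,
        pronunciation TEXT,
        register TEXT
    )
    """
)
sample_rows = [
    ("aals", 0, 1, "an East Indian shrub", 0, None, 0.0, None, None),
    ("aardvark", 0, 1, "a nocturnal burrowing…", 1, "noun", 0.05, "ˈɑrdˌvɑrk", None),
    ("ab", 0, 1, "the abdominal muscles", 1, "noun", 1.646, "æb", None),
    ("arse", 1, 0, "an offensive word", 0, None, 0.0, None, "vulgar"),
]
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", sample_rows
)


def table_shape(connection, table):
    """Return (row_count, column_names) for the given table."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]
    (row_count,) = connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    return row_count, columns


row_count, columns = table_shape(conn, "words")
print(row_count, len(columns), columns)
```

Run against the real file instead of the four sample rows, the same function should report the row and column counts quoted above.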
The first word column defines the playable vocabulary. Matching this column with the NWL2023 headword list allows a direct comparison of the two lexicons. It’s just math:
- NWL2023 entries: 196,601
- Crossplay entries: 196,419
- Difference: 182
Given the number of words we’re talking about, that difference is remarkably small.
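The comparison itself is a set difference. The sketch below shows the idea in Python with a few illustrative words rather than the real lists; the function name `removed_words` is mine, and the uppercase normalization is an assumption based on the lowercase spellings in the excerpt above.

```python
def removed_words(source_words, derived_words):
    """Words present in the source lexicon but missing from the derived one."""
    return sorted(set(source_words) - set(derived_words))


# Tiny illustrative samples, not the real lists.
nwl_sample = ["FORMAT", "FORMICA", "KLEENEX", "KLEPHT", "VELCRO", "VELD"]
crossplay_sample = ["format", "klepht", "veld"]  # lowercase, as in the excerpt

# Normalize case before comparing so that spelling is the only difference.
missing = removed_words(nwl_sample, (w.upper() for w in crossplay_sample))
print(missing)

# The totals quoted in this post.
NWL2023_COUNT = 196_601
CROSSPLAY_COUNT = 196_419
difference = NWL2023_COUNT - CROSSPLAY_COUNT
share = difference / NWL2023_COUNT
print(difference, f"{share:.4%}")
```

A difference in counts alone would not prove that one list is a subset of the other, which is why comparing the actual words matters more than subtracting totals.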
What the Crossplay-Brand Word Game Removed
A list of 182 words is pretty easy to scan, and a cursory glance showed that nearly every one of the removed words is historically derived from a trademark, brand, or proprietary product name.
Examples include:
- BENADRYL
- CINEPLEX
- CUISINART
- FORMICA
- JACUZZI
- KLEENEX
- ROLLERBLADE
- TASER
- VELCRO
Entire inflectional families disappear together. For instance:
- BREATHALYZER
- BREATHALYZE
- BREATHALYZED
- BREATHALYZES
- BREATHALYZING
Some of the removed words are widely regarded as genericized trademarks, meaning the original brand name has long since entered common usage. Examples include:
- FRISBEE
- KLEENEX
- LAUNDROMAT
The decision to remove brand names in general makes sense in the context of a commercial software product. Tournament Scrabble lists include these words because they appear in published dictionaries, but a commercial game developer might reasonably prefer to avoid trademark complications entirely.
However, if the issue were potential legal liability for trademark misuse, expired and other legacy brand terms should not have been a cause for concern. Their removal suggests an automated filtering rule applied to the source dataset, rather than a deliberate editorial judgment by NYT Games editors. If the underlying lexical database flagged an entry as trademark-derived, an automated filter would remove it regardless of whether the word has become generic in practice.
Three Black Sheep
A very small handful of the removed words (and by small, I mean exactly three) do not obviously derive from trademarks. The words are:
- ADRENALIN
- ADRENALINS
- ASBESTINE
These are simple spelling variants (ADRENALIN for ADRENALINE) and the adjectival form of ASBESTOS. I have no explanation for why these and no other brand-neutral obscurities were removed from the NWL2023 source.
Talk Dirty to Me
The more surprising thing to me was what the Times did not change.
Aside from the trademark filter, the Crossplay lexicon tracks NWL2023 almost perfectly. That includes vocabulary many players might expect the Gray Lady to avoid.
For example:
- ASSHOLE
- COCKBLOCK
- FUCK
- MILF
- TITTIES
All remain valid words in Crossplay.
Before examining the data I had assumed that the dictionary might be somewhat bowdlerized to better suit a family newspaper. Instead, Crossplay adopted the tournament lexicon almost wholesale.
For competitive Scrabble players (along with snickering middle-schoolers and just plain grown-ups who use grown-up words) this is good news. It means the game’s vocabulary basically matches the one used in sanctioned play.
Slurs Not Welcome
You might notice that certain offensive words are absent entirely. Those removals did not originate with Crossplay.
In 2020, NASPA removed about 200 slurs targeting specific categories of personal identity from the allowed tournament list. This decision was somewhat controversial in the community (with the predictable sort of opposition), but it was the right one. NASPA eventually codified the criteria for removal in an official policy. Those changes were incorporated into subsequent lexicon releases, including NWL2023.
Oddly, the number of slurs removed from NWL (182) exactly matches the number of additional words removed from the Crossplay dictionary. This is pure coincidence; the Times’ removals are unrelated to word offensiveness, and the two lists of struck words have nothing in common other than their count.
NASPA did not completely ban offensive words. As I noted earlier, words inappropriate for polite company but not directed at identity groups largely remain in the lexicon. Moreover, words that have established, non-slur meanings in standard dictionaries also were retained.
Examples of the latter include uncomfortable words like:
- BITCH
- CRACKER
- DYKE
- FAGGOT
The NWL list continues to include these because dictionaries document the neutral sense (dog, snack, levee, bundle of sticks, etc.).
By contrast, the Collins lexicon used in Commonwealth countries has not adopted the same broad removal of slurs, meaning that many offensive words absent from NWL still appear in CSW.
Crossplay’s NWL and Oxford Languages Sources
Unlike most Scrabble word lists, the Crossplay dictionary is more than a list of playable words. Each entry includes substantial linguistic metadata.
Fields include:
- Part of speech
- Pronunciation in IPA
- Usage labels
- Corpus frequency values
- Flags controlling AI behavior
These fields show how the dictionary was assembled. We’ve established that every playable word originates from NWL2023, but most of the definitions and linguistic annotations in the Crossplay data come from Oxford Languages, the dictionary data division of Oxford University Press.
Deducing meaning from the source field in the database:
- source = 1 → Definition and metadata are Oxford-sourced
- source = 0 → Definition is NASPA-sourced
Out of the 196,419 entries in the database:
- Oxford definitions: 195,371
- NASPA definitions: 1,048
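A tally like this takes a single GROUP BY query. Here is a minimal Python sketch against a small in-memory stand-in; the table name `words` is an assumption, and the Oxford/NASPA labels reflect my reading of the source field rather than anything documented by the app.

```python
import sqlite3

# In-memory stand-in with a handful of rows from the excerpt; the real
# table name is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT, definition TEXT, source INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        ("aals", "an East Indian shrub", 0),
        ("aardvark", "a nocturnal burrowing…", 1),
        ("aargh", "used as expression of…", 1),
        ("aarrgh", "aargh", 0),
        ("ab", "the abdominal muscles", 1),
    ],
)

# Inferred meaning of the source field, not an official mapping.
SOURCE_LABELS = {1: "Oxford", 0: "NASPA"}


def count_by_source(connection):
    """Return {label: number of entries} for each value of the source field."""
    rows = connection.execute(
        "SELECT source, COUNT(*) FROM words GROUP BY source ORDER BY source DESC"
    )
    return {SOURCE_LABELS[source]: count for source, count in rows}


counts = count_by_source(conn)
print(counts)

# Sanity check on the totals quoted in the post: the two groups should
# account for every entry.
assert 195_371 + 1_048 == 196_419
```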
Oxford and other standard dictionaries typically list only a base lemma and describe inflected forms grammatically rather than creating separate entries for each spelling. Scrabble lists, by contrast, enumerate every playable word individually.
Those 1,048 NASPA entries contain only a brief definition and no other metadata — and for many examples, that definition is simply, “An offensive word.” The 1,048 correspond largely to forms that Oxford does not treat as independent headwords. They are Scrabble-style variants that include technical words, offensive language, non-standard inflections, alternative spellings, or verbalisms.
How the Dictionary Data Appears in the Game
These source-and-register distinctions are visible in the Crossplay interface itself.
When a word’s entry comes from Oxford, the in-game dictionary shows a full lexical entry: definition, part of speech, pronunciation, and usage labels.
When the entry is sourced from NWL instead, the interface explicitly indicates that the definition, if even present, derives from the NASPA word list.
Labels such as vulgar or derogatory also are visible in the app’s dictionary UI and correspond directly to the register column in the database.
Scrabbling for 15 Letters
Another column in the database contains a numeric frequency score that appears to be derived from the Oxford English Corpus, a large database of published English texts. Analyzing this column emphasizes the stark difference between everyday vocabulary and the Scrabble lexicon.
| word length (letters) | average corpus frequency | words in Crossplay |
|---|---|---|
| 2 | extremely high | 107 |
| 5 | moderate | 9,463 |
| 10 | very low | 25,002 |
| 15 | near zero | 3,839 |
In the real world, usage declines sharply as words grow longer, and the frequency column reflects that. The word counts in this and other Scrabble dictionaries do not follow the same curve: the table shows far more 10-letter words than 5-letter ones, plus thousands of 15-letter words that almost never appear in print. Scrabble words are a specialized vocabulary optimized for combinatorial play, not a natural language. The disproportionate number of 10- and 15-letter words in the game dictionary reflects the peculiar composition of Scrabble word lists, which include thousands of rare technical terms from fields such as botany, chemistry, and taxonomy that maximize utilization of the game’s 15×15 grid.
Words like PSITTACINE or ZYZZYVA are perfectly legal Scrabble plays but appear almost never in ordinary English writing.
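A breakdown by word length like the table above can be produced with one aggregate query. This is a minimal Python sketch over a few sample rows, assuming a table named `words` with the `word` and `frequency` columns from the excerpt; the real table name may differ.

```python
import sqlite3

# In-memory stand-in using frequency values from the excerpt earlier in
# this post; the table name is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT, frequency REAL)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?)",
    [
        ("ab", 1.646),
        ("aas", 0.0),
        ("aals", 0.0),
        ("aargh", 0.001),
        ("aardvark", 0.05),
        ("aardwolf", 0.009),
    ],
)


def stats_by_length(connection):
    """Return (word_length, word_count, average_frequency) per word length."""
    query = """
        SELECT LENGTH(word) AS word_length,
               COUNT(*) AS word_count,
               AVG(frequency) AS avg_frequency
        FROM words
        GROUP BY LENGTH(word)
        ORDER BY word_length
    """
    return [tuple(row) for row in connection.execute(query)]


stats = stats_by_length(conn)
for word_length, word_count, avg_frequency in stats:
    print(word_length, word_count, round(avg_frequency, 4))
```

Counting entries and averaging frequency in the same query makes it easy to see the two columns move in different directions as length increases.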
Words Available to Crossplay’s Computer Opponent
The database includes two flags that evidently govern in-game behavior:
- isOffensive
- isPlayableByComputer
1,049 words are marked isOffensive, and 1,413 words are labeled isPlayableByComputer = 0. Examining the two lists shows that every word marked offensive is unplayable by the computer, but not every word marked unplayable is also considered offensive. To be precise, the offensive words are a strict subset that is 364 entries smaller than the total unplayable words. In mathy terms:
isOffensive = 1 ⊊ isPlayableByComputer = 0
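The subset relationship is easy to verify with Python sets. In this minimal sketch the rows are (word, isOffensive, isPlayableByComputer) tuples; the first three come from the excerpt earlier in this post, and the last is a made-up placeholder for an inoffensive word the computer still avoids.

```python
def describe_flags(rows):
    """Summarize how the offensive and computer-unplayable sets relate."""
    offensive = {word for word, is_offensive, _ in rows if is_offensive == 1}
    unplayable = {word for word, _, playable in rows if playable == 0}
    return {
        "offensive": len(offensive),
        "unplayable": len(unplayable),
        # "<" on sets is a strict-subset test: every offensive word is
        # unplayable, and at least one unplayable word is not offensive.
        "strict_subset": offensive < unplayable,
        "unplayable_not_offensive": sorted(unplayable - offensive),
    }


rows = [
    ("ab", 0, 1),
    ("aardvark", 0, 1),
    ("arse", 1, 0),
    ("placeholder", 0, 0),  # hypothetical entry, not a real database row
]
summary = describe_flags(rows)
print(summary)

# The counts quoted above: 1,413 unplayable minus 1,049 offensive.
gap = 1_413 - 1_049
print(gap)
```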
Examining the full set of unplayable words reveals several categories of words the computer player can’t use.
First, the AI won’t play words flagged as offensive or derogatory.
Second, it skips certain socially sensitive topics. For example, the Crossplay dictionary includes several abortion-related terms that are legally playable but flagged as unavailable to the computer player.
Finally, the obscurity of the remaining unplayable words suggests the AI avoids vocabulary that would feel implausible to casual players.
In short, the computer player seems to apply a simple filter:
- No offensive language
- No social hot-button topics
- No obtusely uncommon words
There may be additional rules coded in the software, but the in-game behavior I observed generally matches the signals contained in the word database.
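As a rough illustration only, a filter of this kind could be as simple as the Python sketch below. I have not seen the game's move-generation code, so the function name `computer_playable` and the dictionary layout are hypothetical; only the flag names come from the database.

```python
def computer_playable(candidates, flags_by_word):
    """Keep only the candidate words the computer is allowed to play."""
    allowed = []
    for word in candidates:
        flags = flags_by_word.get(word)
        # Words missing from the lexicon are skipped as well.
        if flags is not None and flags["isPlayableByComputer"] == 1:
            allowed.append(word)
    return allowed


# Flag values for these three words match the excerpt earlier in this post.
flags_by_word = {
    "ab": {"isOffensive": 0, "isPlayableByComputer": 1},
    "aardvark": {"isOffensive": 0, "isPlayableByComputer": 1},
    "arse": {"isOffensive": 1, "isPlayableByComputer": 0},
}

human_options = ["ab", "arse", "aardvark"]
computer_options = computer_playable(human_options, flags_by_word)
print(computer_options)
```

Because every offensive word is also flagged as unplayable by the computer, a check on the single isPlayableByComputer flag would be enough to cover all three categories.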
None of these constraints apply to human players. You and I are free to use the full lexicon represented in the Crossplay dictionary.
Crossplay Uses a Faithful Scrabble Dictionary
Taken together, the evidence shows that Crossplay’s vocabulary is essentially NWL2023 minus 182 trademark-derived words. Everything else—from obscure botanical Latin to crude profanity—remains intact.
Having said that, the Crossplay dictionary itself turns out to be an interesting window into how the game works.
- The NASPA Word List provides the authoritative list of playable words.
- Oxford Languages supplies definitions, pronunciations, and corpus statistics.
- The Times removed trademark-derived vocabulary.
- Additional metadata guides the behavior of the computer opponent.
For a casual word game published by The New York Times, that is a surprisingly faithful implementation of the competitive Scrabble lexicon. The result is a dictionary that preserves the full richness and eccentricity of the tournament Scrabble vocabulary while still allowing the game’s AI to behave in a way that feels reasonable to ordinary players.
P.S. I made a list of all 182 words removed from NWL2023 for Crossplay for easy download.