Q: What’s not in WordNet?

A: WordNet contains only nouns, verbs, adjectives, and adverbs. It does not contain pronouns, prepositions, articles, or conjunctions. Ambiguity can nevertheless make it look as if WordNet contains words it doesn’t: “it” is found, but the entry is really “IT” (“information technology”), and “an” is really “AN” (“Associate in Nursing”). The auxiliary verbs “be” and “have” and their inflected forms (“was”, “are”, “been”, “were”; “has”, “had”) are in WordNet, however. A stop word list can help avoid these common and likely incorrect senses. The following set is a conservative starting point for such a stop list for WordNet 3.0:

it, I, a, an, am, as, are, at, be, been, by, done, has, had, he, me, or, thou, us, was, were, who
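
As a minimal sketch of how such a stop list might be applied, the Python snippet below drops listed words from a token list before any WordNet lookup. The names `WORDNET_STOP_WORDS` and `filter_stop_words` are hypothetical, and the case-insensitive matching is an assumption of this sketch; it also discards a genuine uppercase “IT”, which may or may not be what you want.

```python
# Stop list copied from the answer above (WordNet 3.0), lowercased for matching.
WORDNET_STOP_WORDS = frozenset({
    "it", "i", "a", "an", "am", "as", "are", "at", "be", "been", "by",
    "done", "has", "had", "he", "me", "or", "thou", "us", "was", "were", "who",
})


def filter_stop_words(tokens):
    """Return only the tokens that are worth looking up in WordNet.

    Matching is case-insensitive (an assumption of this sketch), so
    "It" and "IT" are both dropped.
    """
    return [t for t in tokens if t.lower() not in WORDNET_STOP_WORDS]


if __name__ == "__main__":
    # "It", "was", and "an" are on the stop list; "old" and "dog" survive.
    print(filter_stop_words(["It", "was", "an", "old", "dog"]))
```

Only the surviving tokens would then be passed to whatever WordNet interface you use.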

WordNet contains only ASCII characters, so accented words are stored without their diacritics: “résumé” is spelled “resume” and “cliché” is spelled “cliche”.
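
Because of this, it can help to strip diacritics from a word before looking it up. The following Python sketch uses the standard `unicodedata` module; the function name `to_wordnet_ascii` is hypothetical, and the assumption that the folded form always matches WordNet's spelling is untested beyond the two examples above.

```python
import unicodedata


def to_wordnet_ascii(word):
    """Fold accented characters to their ASCII base letters.

    NFKD normalization splits "é" into "e" plus a combining accent;
    encoding to ASCII with errors="ignore" then drops the accent.
    Characters with no ASCII decomposition (e.g. "ø") are dropped
    entirely, which is a limitation of this simple approach.
    """
    decomposed = unicodedata.normalize("NFKD", word)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    for word in ["résumé", "cliché"]:
        print(word, "->", to_wordnet_ascii(word))
```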

Q: What’s in WordNet?

A: WordNet is large (version 3.0 has roughly 117,000 synsets) and has strong coverage of common nouns, verbs, adjectives, and adverbs. It also contains a smattering of proper nouns, including most nations (e.g., “United States”) and some famous people (e.g., “Einstein”).