Wordnet

WordNet is a lexical database for the English language, which was created by Princeton, and is part of the NLTK corpus.

You can use WordNet alongside the NLTK module to find the meanings of words, synonyms, antonyms, and more. Let's cover some examples.

First, you're going to need to import wordnet:

from nltk.corpus import wordnet

Then, we're going to use the term "program" to find synsets like so:

syns = wordnet.synsets("program")

An example of a synset:

print(syns[0].name())

plan.n.01

Just the word:

print(syns[0].lemmas()[0].name())

plan

Definition of that first synset:

print(syns[0].definition())

a series of steps to be carried out or goals to be accomplished

Examples of the word in use:

print(syns[0].examples())

['they drew up a six-step plan', 'they discussed plans for a new bond issue']

Next, how might we discern synonyms and antonyms to a word? The lemmas will be synonyms, and then you can use .antonyms to find the antonyms to the lemmas. As such, we can populate some lists like:

synonyms = []
antonyms = []

for syn in wordnet.synsets("good"):
    for l in syn.lemmas():
        synonyms.append(l.name())
        if l.antonyms():
            antonyms.append(l.antonyms()[0].name())

print(set(synonyms))
print(set(antonyms))
{'beneficial', 'just', 'upright', 'thoroughly', 'in_force', 'well', 'skilful', 'skillful', 'sound', 'unspoiled', 'expert', 'proficient', 'in_effect', 'honorable', 'adept', 'secure', 'commodity', 'estimable', 'soundly', 'right', 'respectable', 'good', 'serious', 'ripe', 'salutary', 'dear', 'practiced', 'goodness', 'safe', 'effective', 'unspoilt', 'dependable', 'undecomposed', 'honest', 'full', 'near', 'trade_good'} {'evil', 'evilness', 'bad', 'badness', 'ill'}

As you can see, we got many more synonyms than antonyms, since we just looked up the antonym for the first lemma, but you could easily balance this buy also doing the exact same process for the term "bad."

Next, we can also easily use WordNet to compare the similarity of two words and their tenses, by incorporating the Wu and Palmer method for semantic related-ness.

Let's compare the noun of "ship" and "boat:"

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('boat.n.01')
print(w1.wup_similarity(w2))

0.9090909090909091

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('car.n.01')
print(w1.wup_similarity(w2))

0.6956521739130435

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('cat.n.01')
print(w1.wup_similarity(w2))

0.38095238095238093

 

 

How to get synonyms/antonyms from NLTK WordNet in Python?

WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations.
WordNet’s structure makes it a useful tool for computational linguistics and natural language processing.

WordNet superficially resembles a thesaurus, in that it groups words together based on their meanings. However, there are some important distinctions.

  • First, WordNet interlinks not just word forms—strings of letters—but specific senses of words. As a result, words that are found in close proximity to one another in the network are semantically disambiguated.
  • Second, WordNet labels the semantic relations among words, whereas the groupings of words in a thesaurus does not follow any explicit pattern other than meaning similarity.

# First, you're going to need to import wordnet:
from nltk.corpus import wordnet
 
# Then, we're going to use the term "program" to find synsets like so:
syns = wordnet.synsets("program")
 
# An example of a synset:
print(syns[0].name())
 
# Just the word:
print(syns[0].lemmas()[0].name())
 
# Definition of that first synset:
print(syns[0].definition())
 
# Examples of the word in use in sentences:
print(syns[0].examples())

The output will look like:
plan.n.01
plan
a series of steps to be carried out or goals to be accomplished
[‘they drew up a six-step plan’, ‘they discussed plans for a new bond issue’]

Next, how might we discern synonyms and antonyms to a word? The lemmas will be synonyms, and then you can use .antonyms to find the antonyms to the lemmas. As such, we can populate some lists like:

import nltk
from nltk.corpus import wordnet
synonyms = []
antonyms = []
 
for syn in wordnet.synsets("good"):
    for l in syn.lemmas():
        synonyms.append(l.name())
        if l.antonyms():
            antonyms.append(l.antonyms()[0].name())
 
print(set(synonyms))
print(set(antonyms))

The output will be two sets of synonyms and antonyms
{‘beneficial’, ‘just’, ‘upright’, ‘thoroughly’, ‘in_force’, ‘well’, ‘skilful’, ‘skillful’, ‘sound’, ‘unspoiled’, ‘expert’, ‘proficient’, ‘in_effect’, ‘honorable’, ‘adept’, ‘secure’, ‘commodity’, ‘estimable’, ‘soundly’, ‘right’, ‘respectable’, ‘good’, ‘serious’, ‘ripe’, ‘salutary’, ‘dear’, ‘practiced’, ‘goodness’, ‘safe’, ‘effective’, ‘unspoilt’, ‘dependable’, ‘undecomposed’, ‘honest’, ‘full’, ‘near’, ‘trade_good’} {‘evil’, ‘evilness’, ‘bad’, ‘badness’, ‘ill’}

Now , let’s compare the similarity index of any two words

import nltk
from nltk.corpus import wordnet
# Let's compare the noun of "ship" and "boat:"
 
w1 = wordnet.synset('run.v.01') # v here denotes the tag verb
w2 = wordnet.synset('sprint.v.01')
print(w1.wup_similarity(w2))

Output:
0.857142857143

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('boat.n.01') # n denotes noun
print(w1.wup_similarity(w2))

Output:
0.9090909090909091