Natural Language Toolkit – Tutorial 10

WordNet

WordNet is a lexical database of English, created at Princeton University, and is available through NLTK as one of its corpora. The data has to be downloaded once, for example with nltk.download("wordnet").

Through NLTK, WordNet can be used to look up word meanings, synonyms, antonyms, example usages and more.

The example below looks up the synsets (sets of synonymous word senses) for the word ‘program’ and reads a few attributes of the first one:

from nltk.corpus import wordnet

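# every synset (sense) recorded for "program"
syns = wordnet.synsets("program")
print(syns)

# the first synset
print(syns[0])

# the synset's name, in word.pos.nn form
print(syns[0].name())

# just the word: the name of the synset's first lemma
print(syns[0].lemmas()[0].name())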

# definition
print(syns[0].definition())

# examples
print(syns[0].examples())

So WordNet has a number of synsets for ‘program’:

[Synset('plan.n.01'),
Synset('program.n.02'),
Synset('broadcast.n.02'),
Synset('platform.n.02'),
Synset('program.n.05'),
Synset('course_of_study.n.01'),
Synset('program.n.07'),
Synset('program.n.08'),
Synset('program.v.01'),
Synset('program.v.02')]

Each one corresponds to a different meaning of ‘program’.

Once a synset is selected, its lemmas, definition and examples can be accessed.
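
The names in that list follow a word.pos.nn pattern: the lemma, a one-letter part-of-speech tag and a two-digit sense number. As a minimal sketch (parse_synset_name and POS_LABELS are illustrative names of my own, not part of NLTK), a name can be split into those parts with plain Python:

```python
# Hypothetical helper, not an NLTK function: split a synset name such as
# "program.n.02" into its lemma, part of speech and sense number.
POS_LABELS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}


def parse_synset_name(name):
    # Split from the right so a lemma that itself contains dots stays whole.
    lemma, pos, sense = name.rsplit(".", 2)
    # Fall back to the raw tag if it is not in the table above.
    return lemma, POS_LABELS.get(pos, pos), int(sense)


for name in ["plan.n.01", "program.v.02", "course_of_study.n.01"]:
    print(parse_synset_name(name))
```

With NLTK itself there is no need to parse the string: a synset's name() and pos() methods return the same information.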

Synonyms & Antonyms

Synsets can also be used to collect synonyms and antonyms for a given word. The example below does this for “good”:

from nltk.corpus import wordnet

synonyms = []  # lemma names found in any synset of "good"
antonyms = []  # names of the antonyms of those lemmas

for syn in wordnet.synsets("good"):
    for lemma in syn.lemmas():
        synonyms.append(lemma.name())
        if lemma.antonyms():
            antonyms.append(lemma.antonyms()[0].name())

print("Synonyms", set(synonyms))
print("Antonyms", set(antonyms))

This gives the following output (the order of items in a set can differ from run to run):

Synonyms {'respectable', 'salutary', 'effective', 'dependable', 'thoroughly', 'beneficial', 'unspoilt', 'secure', 'good', 'skilful', 'in_effect', 'well', 'dear', 'undecomposed', 'serious', 'soundly', 'goodness', 'practiced', 'honorable', 'safe', 'expert', 'in_force', 'honest', 'sound', 'full', 'trade_good', 'skillful', 'ripe', 'upright', 'just', 'right', 'estimable', 'near', 'adept', 'unspoiled', 'proficient', 'commodity'}

Antonyms {'evilness', 'evil', 'badness', 'ill', 'bad'}
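
The loop above keeps only the first antonym of each lemma and removes duplicates at print time. A minimal sketch of a variant that collects straight into sets and keeps every antonym is shown here. The name collect_synonyms_antonyms is an illustrative helper of my own, not an NLTK function; it only assumes objects that offer lemmas(), name() and antonyms() the way WordNet synsets and lemmas do.

```python
def collect_synonyms_antonyms(synsets):
    """Return (synonyms, antonyms) as sorted lists of lemma names.

    `synsets` is any iterable of synset-like objects, for example the
    list returned by wordnet.synsets("good").
    """
    synonyms, antonyms = set(), set()
    for syn in synsets:
        for lemma in syn.lemmas():
            synonyms.add(lemma.name())
            # keep every antonym of the lemma, not only the first one
            for antonym in lemma.antonyms():
                antonyms.add(antonym.name())
    # sorting makes the result stable from one run to the next
    return sorted(synonyms), sorted(antonyms)
```

It would be called with the result of a WordNet lookup:

synonyms, antonyms = collect_synonyms_antonyms(wordnet.synsets("good"))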

Semantic similarities

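WordNet can also be used to measure how similar two word senses are, for example with the Wu and Palmer method. It scores a pair of synsets by how deep their closest shared ancestor sits in WordNet's hypernym ("is a kind of") hierarchy, so a higher score means the two senses are more closely related.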

from nltk.corpus import wordnet

w1 = wordnet.synset("ship.n.01")  # word.(n)oun.(1)st sense
w2 = wordnet.synset("boat.n.01")
print(w1.wup_similarity(w2))  # applying the Wu & Palmer method

w1 = wordnet.synset("ship.n.01")  # word.(n)oun.(1)st sense
w2 = wordnet.synset("car.n.01")
print(w1.wup_similarity(w2))  # applying the Wu & Palmer method

w1 = wordnet.synset("ship.n.01")  # word.(n)oun.(1)st sense
w2 = wordnet.synset("cat.n.01")
print(w1.wup_similarity(w2))  # applying the Wu & Palmer method
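
To see roughly where such scores come from, here is a minimal sketch of the Wu and Palmer formula, 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest ancestor the two nodes share. It runs on a tiny hand-made hierarchy that I invented for illustration. The hierarchy, the function names and the resulting numbers are assumptions and do not match real WordNet, whose hierarchy is much deeper and which NLTK handles with extra care (for example, a synset can have several hypernym paths).

```python
# Toy hypernym tree (child -> parent). Invented for illustration only;
# this is NOT the real WordNet hierarchy.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "vessel": "vehicle",
    "ship": "vessel",
    "boat": "vessel",
    "car": "vehicle",
    "animal": "entity",
    "cat": "animal",
}


def path_to_root(node):
    """Return the node followed by all of its ancestors up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    # The root counts as depth 1, so the score can never be zero.
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Return the deepest node that is an ancestor of (or equal to) both."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward, so the first hit is deepest
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


def toy_wup(a, b):
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


for other in ["boat", "car", "cat"]:
    print("ship", other, round(toy_wup("ship", other), 3))
```

On this toy tree the scores fall in the order boat, car, cat, because ship and boat share a deep ancestor (vessel) while ship and cat only meet at the root.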

G-Mac