Lemmatizing
Lemmatizing is very similar to stemming. The key difference is that lemmatizing returns a real dictionary word (the lemma), whereas a stem may not be a word at all.
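# Requires the WordNet corpus: run nltk.download("wordnet") once beforehand.
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
# With no pos argument, each word is treated as a noun.
print(lemmatizer.lemmatize("cats"))
print(lemmatizer.lemmatize("cacti"))
print(lemmatizer.lemmatize("geese"))
print(lemmatizer.lemmatize("rocks"))
print(lemmatizer.lemmatize("python"))
# pos="a" treats the word as an adjective.
print(lemmatizer.lemmatize("better", pos="a"))
print(lemmatizer.lemmatize("best", pos="a"))
# The same word looked up as a noun (default) and as a verb.
print(lemmatizer.lemmatize("run"))
print(lemmatizer.lemmatize("run", pos="v"))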
This gives the output:
galiquis@raspberrypi:$ python3 ./nltk_tutorial8.py
cat
cactus
goose
rock
python
good
best
run
run
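Some points to note:
- lemmatize() takes an optional part-of-speech (POS) parameter, pos, so:
  - pos="a" (or 'a') looks the word up as an adjective
  - pos="v" (or 'v') looks the word up as a verb
  - the default (no pos given) treats the word as a noun
- Lemmatizing is generally more powerful than stemming, because it can map irregular forms such as "cacti", "geese" and "better" to real words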
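To make the two ideas above more concrete (a lemma is looked up per part of speech, while a stem is just a chopped string), here is a small toy sketch in plain Python. It is not how NLTK's WordNetLemmatizer works internally. The TOY_LEMMAS table and the toy_lemmatize and crude_stem functions are made up for illustration, and the suffix rule is far cruder than a real stemmer.

```python
# Toy illustration only: NOT NLTK's implementation.
# The lookup table is hand-written and hypothetical.
TOY_LEMMAS = {
    ("cats", "n"): "cat",
    ("geese", "n"): "goose",
    ("better", "a"): "good",
    ("running", "v"): "run",
}


def toy_lemmatize(word, pos="n"):
    """Look up (word, pos); return the word unchanged if there is no entry."""
    return TOY_LEMMAS.get((word, pos), word)


def crude_stem(word):
    """Chop a few common suffixes; the result need not be a real word."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


print(toy_lemmatize("better"))            # no noun entry in the toy table, so unchanged
print(toy_lemmatize("better", pos="a"))   # adjective entry found
print(crude_stem("running"))              # suffix chopped off
print(toy_lemmatize("running", pos="v"))  # verb entry found
```

The point of the sketch is only that the pos value is part of the lookup key, so the same word can give different results depending on the part of speech you pass, and that falling back to the noun default can leave a word untouched.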