nlp18. Stemmer in Python NLTK

We use NLTK's PorterStemmer to stem a list of words.

PorterStemmer is a class, as its capitalized name suggests (the usual Python convention), so we first have to create an object and then call a method on it. Since the only method we need is stem, we do not store the object itself; we keep a reference to its stem method instead.
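To make the "keep the method, not the object" idea concrete, here is a minimal sketch. ToyStemmer is a hypothetical stand-in class written only for this illustration (it is not part of NLTK, and its rule is far cruder than the Porter algorithm); the point is just that both ways of calling the method give the same result.

```python
class ToyStemmer(object):
    """Hypothetical stand-in for a stemmer class; not part of NLTK."""

    def stem(self, word):
        # Crude rule for illustration only: drop one trailing "s".
        return word[:-1] if word.endswith("s") else word


# Option 1: store the object, then call its method.
stemmer = ToyStemmer()
first = stemmer.stem("cats")

# Option 2: store only the bound method; it still carries its object along.
stem = ToyStemmer().stem
second = stem("cats")

print(first, second)
```

The second form saves a little typing inside a loop, which is presumably why the script uses it.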


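# nlp18.py
from __future__ import print_function  # keeps print() working on Python 2

from nltk.tokenize import word_tokenize  # needs NLTK's Punkt tokenizer data (see nltk.download)
from nltk.stem import PorterStemmer

text = """
cats catlike cat stemmer stemming stemmed stem
fishing fished fisher fish argue argued argues
arguing argument arguments
"""

# Keep only the bound stem method; the PorterStemmer object itself is not stored.
PS = PorterStemmer().stem

# Print each token next to its stem, right-aligned in 10-character columns.
for a in word_tokenize(text):
    print('%10s --> %10s' % (a, PS(a)))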

# cats --> cat
# catlike --> catlik
# cat --> cat
# stemmer --> stemmer
# stemming --> stem
# stemmed --> stem
# stem --> stem
# fishing --> fish
# fished --> fish
# fisher --> fisher
# fish --> fish
# argue --> argu
# argued --> argu
# argues --> argu
# arguing --> argu
# argument --> argument
# arguments --> argument
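
The output shows that a stem need not be a dictionary word (catlik, argu) and that related words do not always end up with the same stem (argument is kept apart from argue). One way to see this at a glance is to group the words by their stem. The following is a small sketch, not part of the original script: it reuses the same word list, and it splits on whitespace instead of calling word_tokenize so that no tokenizer data is required.

```python
from collections import defaultdict

from nltk.stem import PorterStemmer

words = """
cats catlike cat stemmer stemming stemmed stem
fishing fished fisher fish argue argued argues
arguing argument arguments
""".split()  # plain whitespace split, so no tokenizer data is needed

stem = PorterStemmer().stem

# Map each stem to the words that reduce to it, in input order.
groups = defaultdict(list)
for word in words:
    groups[stem(word)].append(word)

for key, members in groups.items():
    print('%10s <-- %s' % (key, ', '.join(members)))
```

Each printed line lists one stem followed by every input word that was reduced to it.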
