We can access a specific text within a corpus by using a fileid.
The whole inaugural corpus contains 145735 words, that is, len(inaugural.words()) is 145735. By passing a fileid to the words method, however, we can select a single text.
For example, len(inaugural.words('1789-Washington.txt')) is 1538. We can call the fileids method of inaugural, or of whatever corpus we happen to be using, to get a list of the text names.
The code below prints the first 20 words of each of the first five inaugural addresses; the trailing comments show the output for the first one.
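from __future__ import print_function, division
from nltk.corpus import inaugural

A = inaugural.fileids()   # list of text names, e.g. '1789-Washington.txt'
s = 2*' '                 # separator printed between words
for a in A[:5]:
    B = inaugural.words(a)   # words of one text: pass the fileid a, not the list A
    for b in B[:20]:
        print(b, end=s)
    print()   # start a new line before the next text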
# Fellow - Citizens of the Senate and of
# the House of Representatives : Among the
# vicissitudes incident to life no
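
To compare text lengths across a corpus, a small helper along the following lines may be useful. This is only a minimal sketch: the name words_per_file and its limit parameter are hypothetical and not part of NLTK, and the helper assumes only that the corpus object provides fileids() and words(fileid) as described above.

```python
def words_per_file(corpus, limit=None):
    """Return a list of (fileid, word_count) pairs for a corpus reader.

    Assumption: `corpus` exposes fileids() and words(fileid), as
    nltk.corpus.inaugural does. `limit` (hypothetical parameter)
    caps how many files are counted; None means all of them.
    """
    fileids = corpus.fileids()
    if limit is not None:
        fileids = fileids[:limit]
    # len() of the word sequence gives the number of tokens in that text
    return [(fileid, len(corpus.words(fileid))) for fileid in fileids]
```

With NLTK and the inaugural corpus installed, words_per_file(inaugural, limit=3) would return the first three fileids paired with their word counts. The entry for '1789-Washington.txt' should match the 1538 words reported above.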