Fundamentals of NLP: Preprocessing Text Using NLTK & SpaCy
Natural Language Processing | Intermediate
- 13 videos | 1h 56m 47s
- Includes Assessment
- Earns a Badge
Tokenization, stemming, and lemmatization are essential natural language processing (NLP) tasks. Tokenization breaks text into smaller units (tokens), usually words or sentences, so that it can be analyzed. Stemming reduces words to a common base form by stripping prefixes or suffixes, which keeps the representation simple but can produce forms that are not real words. In contrast, lemmatization takes grammatical context into account to map each word to its base or dictionary form. You will begin this course by tokenizing text using the Natural Language Toolkit (NLTK) and SpaCy. You will then remove stopwords, common words such as "a" and "the" that add little meaning to text. Next, you'll explore the WordNet lexical database, which contains information about the semantic relationships between words. You'll use synsets to view similar words and explore hypernyms, hyponyms, meronyms, and holonyms. Finally, you'll compare stemming and lemmatization: you will explore both processes with NLTK and perform lemmatization using SpaCy.
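To make the difference between these steps concrete, here is a minimal, dependency-free sketch in plain Python. It is not how NLTK or SpaCy implement any of these steps: the stopword list, the suffix rules, and the lemma table below are tiny hypothetical stand-ins chosen for illustration, whereas the real libraries ship much larger stopword lists, established stemming algorithms, and lexical resources such as WordNet.

```python
import re

# Tiny illustrative stopword list (an assumption for this sketch;
# real libraries provide far larger, language-specific lists).
STOPWORDS = {"a", "an", "the", "is", "are", "were", "in", "on", "of", "and"}

# Tiny hypothetical lemma dictionary. Real lemmatizers consult a lexical
# database and part-of-speech information instead of a hand-written table.
LEMMA_TABLE = {
    "were": "be",
    "mice": "mouse",
    "running": "run",
    "studies": "study",
}


def tokenize(text):
    """Split text into lowercase word tokens using a simple regex."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def naive_stem(token):
    """Strip the first matching suffix; a crude rule, not a real stemmer."""
    for suffix in ("ing", "ies", "es", "s", "ed"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def lookup_lemma(token):
    """Return the dictionary form if known, otherwise the token unchanged."""
    return LEMMA_TABLE.get(token, token)


text = "The mice were running in the studies"
tokens = tokenize(text)
content = remove_stopwords(tokens)
stems = [naive_stem(t) for t in content]
lemmas = [lookup_lemma(t) for t in content]

print(tokens)   # ['the', 'mice', 'were', 'running', 'in', 'the', 'studies']
print(content)  # ['mice', 'running', 'studies']
print(stems)    # ['mice', 'runn', 'stud']
print(lemmas)   # ['mouse', 'run', 'study']
```

The suffix-stripping rule yields fragments such as "runn" and "stud" that are not dictionary words, while the lookup maps each token to a dictionary form, including irregular ones like "mice" to "mouse". That contrast is the trade-off the course examines with NLTK and SpaCy.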
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Perform tokenization with NLTK
- Perform tokenization with SpaCy
- Remove stopwords using NLTK
- Remove stopwords using SpaCy
- Explore WordNet synsets
- Compute similarity of words
- Explore types of words in WordNet
- Perform stemming with NLTK
- Perform lemmatization with NLTK
- Perform lemmatization with SpaCy
- Perform parts-of-speech (POS) tagging and named entity recognition (NER)
- Summarize the key concepts covered in this course
IN THIS COURSE
- 2m 16s: In this video, we will discover the key concepts covered in this course.
- 12m 49s: Find out how to perform tokenization with NLTK.
- 12m 56s: In this video, you will learn how to perform tokenization with SpaCy.
- 10m 45s: During this video, you will discover how to remove stopwords using NLTK.
- 5m 37s: Learn how to remove stopwords using SpaCy.
- 5m 55s: In this video, we will explore WordNet synsets.
- 10m 18s: Discover how to compute similarity of words.
- 12m 26s: In this video, find out how to explore types of words in WordNet.
- 9m 51s: Learn how to perform stemming with NLTK.
- 8m 21s: In this video, you will discover how to perform lemmatization with NLTK.
- 12m 18s: Find out how to perform lemmatization with SpaCy.
- 9m 53s: Discover how to perform parts-of-speech (POS) tagging and named entity recognition (NER).
- 3m 23s: In this video, we will summarize the key concepts covered in this course.
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft gives you the opportunity to earn a digital badge upon successful completion of some of our courses. Badges can be shared on any social network or business platform.
Digital badges are yours to keep, forever.