Tuesday, April 7, 2020

Algorithms and techniques used in NLP

In this post, I explain two techniques used in natural language processing (NLP):
1. Bag of Words
2. Tokenization

1. Bag of Words
Bag of words is a method for extracting features from text documents. These features can then be used to train machine learning algorithms. The method builds a vocabulary of all the unique words that occur across the documents in the training set. The approach is simple and flexible, and it can be used in many ways to extract features from documents.



Bag of words is a representation of text that describes the occurrence of words within a document. It involves two things.

1. A vocabulary of known words.
2. A measure of the presence of known words.
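
The two parts above can be sketched in a few lines of Python. This is only a minimal illustration: the function names `build_vocabulary` and `vectorize`, the sample sentences, and the choice of raw word counts as the "measure of presence" are assumptions made for this example, and words are split on whitespace only.

```python
from collections import Counter

def build_vocabulary(documents):
    """Collect every unique lowercase word across all documents (sorted for a stable order)."""
    words = set()
    for doc in documents:
        words.update(doc.lower().split())
    return sorted(words)

def vectorize(document, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical sample documents, used only for illustration.
documents = ["the cat sat", "the dog sat on the mat"]

vocabulary = build_vocabulary(documents)
vectors = [vectorize(doc, vocabulary) for doc in documents]

print(vocabulary)  # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)     # [[1, 0, 0, 0, 1, 1], [0, 1, 1, 1, 1, 2]]
```

Each document becomes a fixed-length vector with one position per vocabulary word, which is the form most machine learning algorithms expect. Word order is discarded, which is why the representation is called a "bag".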

2. Tokenization

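In NLP, tokenization is the process of splitting text into smaller units called tokens, such as words, subwords, or punctuation marks, so that later steps (for example, building a bag of words) have discrete items to work with.

The same term has a different meaning in data security: replacing sensitive data with unique identification symbols that retain all the essential information about the data without compromising its security. A tokenization system of that kind must be secured and validated using security best practices applicable to sensitive data protection, secure storage, audit, authentication, and authorization. It is often used in credit card processing.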
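
For text processing (the NLP sense of the word, as opposed to the data-security sense), tokenization means splitting a string into tokens. Here is a minimal sketch, assuming a simple regular-expression rule; the `tokenize` function and its pattern are illustrative choices for this post, not a standard, and real tokenizers handle many more cases.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens and standalone punctuation tokens."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", text.lower())

tokens = tokenize("Hello, world! NLP is fun.")
print(tokens)  # ['hello', ',', 'world', '!', 'nlp', 'is', 'fun', '.']
```

The resulting list of tokens is the kind of input a bag-of-words model counts.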



