Stanford Tokenizer


Stanford Tokenizer splits text into tokens and, as an ancillary capability, uses that tokenization to split text into sentences. Its PTBTokenizer class mainly targets formal English writing rather than SMS-speak.

Stanford Tokenizer Reviews

Showing 2 Stanford Tokenizer reviews
Stanford Tokenizer review by G2 User
Validated Reviewer

"favorite tokenizer"

What do you like best?

I have been using Stanford Tokenizer for six years and I love it. It's easy to integrate with any application and can recognize special characters such as "," and "$". It can also remove tokens that match a given regex, and it offers a variety of configuration options to suit the user's requirements.
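
The review does not name the option used for regex-based removal. As a rough illustration of the idea only, the following Python sketch drops every token that matches a pattern in its entirety. The function name `drop_matching_tokens` and the sample tokens are hypothetical; this is a post-processing stand-in, not the tokenizer's own API.

```python
import re

def drop_matching_tokens(tokens, pattern):
    """Return the tokens that do NOT match `pattern` in their entirety.

    Hypothetical helper for illustration; not part of Stanford Tokenizer.
    """
    compiled = re.compile(pattern)
    return [tok for tok in tokens if compiled.fullmatch(tok) is None]

# Sample token list (made up for this example).
tokens = ["Price", ":", "$", "5", ",", "approx", "."]

# Drop tokens made up only of characters that are neither word characters nor "$".
print(drop_matching_tokens(tokens, r"[^\w$]+"))
# prints ['Price', '$', '5', 'approx']
```

Whole-token matching is used here so that a pattern for punctuation does not accidentally remove words that merely contain punctuation.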

What do you dislike?

It converts brackets to other symbols, e.g. -LCB-, -LRB-, -RCB-, -RRB-, which sometimes requires extra processing later.
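
The extra processing mentioned here can be as small as a lookup table. The sketch below, in Python, maps the Penn Treebank style bracket tokens back to literal brackets after tokenization. The helper name `restore_brackets` is hypothetical, and the square-bracket entries (-LSB-, -RSB-) are an addition based on the Penn Treebank convention rather than on the review.

```python
# Penn Treebank style escape tokens and the literal brackets they stand for.
PTB_BRACKETS = {
    "-LRB-": "(",
    "-RRB-": ")",
    "-LSB-": "[",  # assumption: square brackets follow the same convention
    "-RSB-": "]",
    "-LCB-": "{",
    "-RCB-": "}",
}

def restore_brackets(tokens):
    """Return a new token list with bracket escapes replaced by literal brackets.

    Hypothetical post-processing helper; tokens that are not escapes pass through unchanged.
    """
    return [PTB_BRACKETS.get(tok, tok) for tok in tokens]

# Sample tokenizer output (made up for this example).
tokens = ["f", "-LRB-", "x", "-RRB-", "-LCB-", "y", "-RCB-"]
print(restore_brackets(tokens))
# prints ['f', '(', 'x', ')', '{', 'y', '}']
```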

What problems are you solving with the product? What benefits have you realized?

NLP related problems.

Stanford Tokenizer review by G2 User
Validated Reviewer

"The simplest tokenizer to implement for NLP problems"

What do you like best?

It is easy to use and implement, and it works effectively in most cases. It has an open source license and a straightforward algorithm.

What do you dislike?

There are more powerful tools out there, such as spaCy, which use deep learning techniques to identify more information, like context in a sentence.

What problems are you solving with the product? What benefits have you realized?

I tokenize OCR data to pre-process it before passing it to machine learning models. It works fast and is accurate enough for real-time applications.

There are not enough reviews of Stanford Tokenizer for G2 to provide buying insight. Below are some alternatives with more reviews:

NVivo
NVivo is software that supports qualitative and mixed methods research. It's designed to help organize, analyze and find insights in unstructured, or qualitative data like: interviews, open-ended survey responses, articles, social media and web content.
SnatchBot
SnatchBot is a bot builder platform designed to add multi-channel messaging to any system.
Google Cloud Translation API
Google Translate API is a tool that provides a programmatic interface for translating an arbitrary string into any supported language.
Microsoft Bing Spell Check API
Microsoft Bing Spell Check API is a tool that helps users correct spelling errors, recognize the difference among names, brand names, and slang, and understand homophones as they type.
FuzzyWuzzy
FuzzyWuzzy is a fuzzy string matching library for Python that uses Levenshtein distance to calculate the differences between sequences.
Amazon Comprehend
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. Amazon Comprehend identifies the language of the text; extracts key phrases, places, people, brands, or events; understands how positive or negative the text is; and automatically organizes a collection of text files by topic.
IBM Watson Tone Analyzer
IBM Watson Tone Analyzer is a service that uses linguistic analysis to detect three types of tones from text: emotion, social tendencies, and language style. Emotions identified include anger, fear, joy, sadness, and disgust. Identified social tendencies include the Big Five personality traits used by some psychologists: openness, conscientiousness, extroversion, agreeableness, and emotional range. Identified language styles include confident, analytical, and tentative.
spaCy
spaCy is a Python NLP library that helps users get their work out of papers and into production.
OpenNLP
Apache OpenNLP is a machine learning based toolkit for processing natural language text. It supports common NLP tasks such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution, which are usually required to build more advanced text processing services. It also includes maximum entropy and perceptron based machine learning.
Microsoft Language Understanding Intelligent Service (LUIS)
Microsoft Language Understanding Intelligent Service (LUIS) is a service that enables users to quickly deploy an HTTP endpoint that takes the sentences sent to it and interprets them in terms of the intention they convey and the key entities present. Its web interface lets users custom design a set of intentions and entities relevant to an application and guides them through the process of building a language understanding system.
* We monitor all Stanford Tokenizer reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. Validated reviews require the user to submit a screenshot of the product containing their user ID, in order to verify a user is an actual user of the product.