What is Named Entity Recognition?

Named Entity Recognition (NER) is an information extraction technique within Natural Language Processing (NLP). It locates entities in unstructured or semi-structured text. These entities can range from something general, such as a person, to something very specific, such as a biomedical term. For example, NER can recognize that “pancreatic cancer” is a disease and that “5-FU” is an intervention. NER plays an important role in enabling machines to understand text. To understand NER properly, we will first explain what NLP is. After that, we will explain how NER works and highlight new developments in the field.

What is Natural Language Processing?

NLP enables machines to understand language and aims to make interaction between humans and machines seamless. The earliest signs of NLP can be traced to the 1950s, when Alan Turing proposed that one criterion for measuring the intelligence of a machine is its ability to hold a “natural” conversation with a human. Since then, NLP has expanded to processing extremely large natural language data sets, with the goal of finding new ways of understanding and creating natural language.

NLP can be broken down into three tasks:

Syntax – Understanding the structure and rules of words and sentences
Semantics – Extracting the meaning of words and sentences and the relationships between them
Speech – Recognizing and segmenting spoken words and converting them to text

Besides the understanding, extraction, and recognition of text and speech, there is a fourth process called Natural Language Generation (NLG). It differs from the others in that it creates text and speech, whereas most NLP technologies focus on processing them.

Figure: NER highlights and classifies a Wikipedia text about the development of the company Apple.

How does Named Entity Recognition work?

Now that we have explained NLP, we can describe how Named Entity Recognition works. NER plays a major role in the semantic part of NLP, which extracts the meaning of words, sentences, and their relationships. Basic NER processes unstructured and semi-structured texts by identifying and locating entities. For example, instead of identifying “Steve” and “Jobs” as different entities, NER understands that “Steve Jobs” is a single entity. More developed NER processes can also classify the entities they identify. In this case, NER not only identifies “Steve Jobs” but also classifies it as a person. Below we describe the two most popular NER methods.
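To make the “Steve Jobs” example concrete, here is a minimal sketch in Python of how per-word labels can be merged into whole entities. It assumes the widely used BIO tagging convention (B- begins an entity, I- continues it, O is outside any entity), which this article does not itself prescribe. The function name bio_to_entities and the hand-written tags are hypothetical; in practice the tags would come from an NER model.

```python
from typing import List, Optional, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO tags into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity currently being built, if there is one.
        nonlocal current_tokens, current_type
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags end the current entity; stray "I-" tags are ignored.
            flush()
    flush()
    return entities


# Hand-written tags standing in for the output of an NER model.
tokens = ["Steve", "Jobs", "co-founded", "Apple", "in", "1976"]
tags = ["B-PERSON", "I-PERSON", "O", "B-ORG", "O", "O"]
entities = bio_to_entities(tokens, tags)
print(entities)  # [('Steve Jobs', 'PERSON'), ('Apple', 'ORG')]
```

“Steve” and “Jobs” come out as one entity of type PERSON, which covers both steps described above: locating the entity and classifying it.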

Ontology-based NER

In the past, NER relied strongly on a knowledge base. This knowledge base is called an ontology: a collection of data sets containing words, terms, and their interrelations. Depending on the level of detail of the ontology, the result of NER can be very broad or topic-specific. Wikipedia, for example, would need a very high-level ontology to capture and structure all of its data. In contrast, a life-science-specific company like Innoplexus would need a far more detailed ontology because of the complexity of biomedical terms. Ontology-based NER is a knowledge-based approach. It excels at recognizing known terms and concepts in unstructured or semi-structured texts, but it depends heavily on updates. Without them, it cannot keep up with the ever-growing body of publicly available knowledge.
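A rough way to picture ontology-based NER is a lookup of known terms in the text. The Python sketch below is a simplification under stated assumptions: the three-entry TOY_ONTOLOGY and the function ontology_ner are invented for illustration, and a real ontology would also hold synonyms and relations between terms. At each position it prefers the longest matching term, so “pancreatic cancer” wins over “cancer”.

```python
from typing import Dict, List, Tuple

# Toy ontology (hypothetical): lower-cased term -> entity class.
TOY_ONTOLOGY: Dict[str, str] = {
    "pancreatic cancer": "DISEASE",
    "cancer": "DISEASE",
    "5-fu": "INTERVENTION",
}

PUNCTUATION = ".,;:!?"


def ontology_ner(text: str, ontology: Dict[str, str]) -> List[Tuple[str, str]]:
    """Find ontology terms in text, preferring the longest match at each position."""
    tokens = text.split()
    max_len = max((len(term.split()) for term in ontology), default=0)
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n]).strip(PUNCTUATION)
            label = ontology.get(candidate.lower())
            if label is not None:
                found.append((candidate, label))
                matched = n
                break
        # Skip past a matched phrase, otherwise move on by one word.
        i += matched if matched else 1
    return found


text = "Patients with pancreatic cancer received 5-FU, but gemcitabine was not listed."
result = ontology_ner(text, TOY_ONTOLOGY)
print(result)  # [('pancreatic cancer', 'DISEASE'), ('5-FU', 'INTERVENTION')]
```

The drug gemcitabine is missed because it is not in the toy ontology, which mirrors the dependence on updates described above.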

Deep Learning NER

Deep Learning NER is more precise than its predecessor because it is able to cluster words. This is due to a technique called word embedding, which captures the semantic and syntactic relationships between words. Another advantage is the learning itself. Deep learning can recognize terms and concepts that are not present in an ontology because it is trained on the way various concepts are used in written life science language. It learns automatically and can analyze topic-specific as well as high-level words. This makes deep learning NER applicable to a variety of tasks. Researchers, for example, can use their time more efficiently because deep learning does most of the repetitive work, leaving them free to focus on research. Several deep learning methods for NER are currently available, but because the field is competitive and developing quickly, it is difficult to pinpoint the best one. If you are interested in a deeper understanding of Deep Learning NER in the clinical field, we recommend reading this article.
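The clustering effect of word embeddings can be sketched with a few lines of Python. This is not a deep learning model: the three-dimensional vectors in TOY_EMBEDDINGS are hand-made and invented purely for illustration, whereas real embeddings are learned from large text collections and have far more dimensions. The sketch only shows the underlying idea that words used in similar ways end up close together, measured here with cosine similarity.

```python
import math
from typing import Dict, List

# Hand-made vectors (invented for illustration, not learned from data).
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "gemcitabine": [0.9, 0.1, 0.0],
    "5-FU": [0.8, 0.2, 0.1],
    "carcinoma": [0.1, 0.9, 0.2],
    "cancer": [0.2, 0.8, 0.1],
}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbor(word: str, embeddings: Dict[str, List[float]]) -> str:
    """Return the other word whose vector is most similar to the given word's."""
    query = embeddings[word]
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(query, embeddings[w]))


print(nearest_neighbor("gemcitabine", TOY_EMBEDDINGS))  # 5-FU
print(nearest_neighbor("carcinoma", TOY_EMBEDDINGS))    # cancer
```

In this toy setup a term that is missing from an ontology can still be placed next to known terms of the same kind, which is the intuition behind recognizing terms an ontology does not contain.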

Conclusion

NER plays a key role by identifying and classifying entities in a text. It is the first step in enabling machines to understand what would otherwise be an unstructured sequence of words. Nevertheless, it is still a long way from understanding text like a human. This becomes particularly clear in the details: as we saw, NER recognizes “Steve Jobs” as a person. However, NER cannot differentiate between all the people called “Steve Jobs”. The next step is to assign a unique identity to an entity, which is done by linking and normalizing. Find out more about Entity Normalization in our upcoming blog post.

on September 23, 2020
