Update: NLP – Natural Language Processing

NLP applications are becoming increasingly interesting for companies that want to process large volumes of written or spoken text automatically.

What is Natural Language Processing, apart from a voice assistant talking to me?

Natural Language Processing (NLP) is often mentioned alongside a related buzzword and a related IT discipline. Both overlap with NLP, but neither is a synonym for it. The buzzword is “AI” (artificial intelligence); the discipline is “machine learning”. Machine learning is a statistical approach to “learning” patterns from so-called training data. Statistical approaches are indeed widespread in NLP, but they are not its only mainstay.

by Sarah Holschneider

Certain brand names spring to mind when we think of “language” and “computer”: “Alexa”, “Google Home” or “Siri”. All of these voice assistants use NLP. However, the rapidly growing field of natural language processing applications can do much more.

NLP is a joint branch of linguistics and computer science that deals with the interaction between computers and natural language. “Natural” language means the language spoken by humans, in contrast to programming or machine languages.

Between the late 1980s and the mid-1990s, research on natural language processing focused primarily on machine learning. This intensive concentration of research and application development on the topic is probably why many people equate the artificial generation of knowledge with machine learning.

Voice assistants are popular AI devices. They process natural spoken language.

What defines machine learning 
Machine learning algorithms are fundamentally able to derive rules automatically by analyzing corpora (huge collections of text) and to learn from typical examples.
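As a rough illustration of learning from typical examples, the following minimal sketch counts how often each word carries each label in a tiny labeled corpus and keeps the most frequent label as its "rule". The corpus, the tag names and the function name `learn_most_frequent_tags` are hypothetical and chosen only for this example; real systems use far larger corpora and more capable models.

```python
from collections import Counter, defaultdict


def learn_most_frequent_tags(tagged_corpus):
    """Derive one rule per word: the tag it carries most often in the examples."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word.lower()][tag] += 1
    # Keep only the most common tag for each word.
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


# Hypothetical toy corpus of (word, part-of-speech tag) pairs.
corpus = [
    ("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("new", "ADJ"),
    ("a", "DET"), ("book", "NOUN"), ("helps", "VERB"),
    ("please", "INTJ"), ("book", "VERB"), ("a", "DET"), ("table", "NOUN"),
]

rules = learn_most_frequent_tags(corpus)
print(rules["book"])  # "book" is tagged NOUN in 2 of 3 examples
```

The learned rule for "book" follows the majority of examples and ignores the rarer verb reading, which shows in miniature how a statistical approach favors the most common cases.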

What at first glance looks like a simple way to solve text-related tasks automatically often requires numerous preprocessing steps by specialists with linguistic knowledge. In some cases, native speakers can already help (provided they are highly motivated). In other cases, specialist knowledge of language dependencies and computational linguistics is required.

Systems based on self-learning algorithms have many advantages over manually written rules. They draw on the possibilities of machine learning. Because they focus on the most common cases, they don’t get caught up in rules for exceptions.

Sarah Holschneider has been working on the development of NLP solutions for L-One Systems for three years, and since March 2020 as head of the NLP department.

For the statistics to be reliable, however, the data set must be large enough to yield statistically significant results. Rule-based systems are often less scalable, but they can close the gap when only small amounts of data are available. In some cases, namely when a fixed set of possible cases has to be solved, a rule-based system can even work better.

Would you like to translate the text on a PowerPoint template? With a simple string.replace() function, you could save the cost of investing in a statistical approach to machine translation.
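A minimal sketch of that idea in Python, assuming the template contains only a small, fixed set of known phrases. The phrase table `TEMPLATE_TRANSLATIONS` and the function `translate_template` are hypothetical names invented for this example, and reading or writing the actual PowerPoint file is left out.

```python
# Hypothetical phrase table: every text that can occur on the template
# is listed once with its translation.
TEMPLATE_TRANSLATIONS = {
    "Titel hier eingeben": "Enter title here",
    "Untertitel hier eingeben": "Enter subtitle here",
    "Vielen Dank": "Thank you",
}


def translate_template(text, translations=TEMPLATE_TRANSLATIONS):
    """Translate template text by plain string replacement."""
    # Replace longer phrases first so a short phrase cannot
    # overwrite part of a longer one.
    for source in sorted(translations, key=len, reverse=True):
        text = text.replace(source, translations[source])
    return text


slide_text = "Titel hier eingeben\nUntertitel hier eingeben"
print(translate_template(slide_text))
```

This only works because the set of possible inputs is closed. Any phrase missing from the table passes through untranslated, which is the point at which a statistical approach would start to pay off.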

»Defining each particular application as specifically as possible is more than half the battle. Many of these use cases are reminiscent of a grammar lesson. At L-One Systems, we therefore involve our linguists throughout the entire development process.«

You certainly wouldn’t program a calculator without knowing something about algebra either. Why should you leave the development of NLP applications to a team with no linguistic expertise?

More about our NLP projects and L-One Systems