Natural Language Processing in Textual Information Retrieval and Related Topics
"Natural Language Processing" (NLP) as a discipline has been developing for many years. It was formed in 1960 as a sub-field of Artificial Intelligence and Linguistics, with the aim of studying problems in the automatic generation and understanding of natural language.
At first its methods were widely accepted and successful. However, when they were applied outside controlled environments and with a generic vocabulary, many problems arose, among them polysemy and synonymy.
In recent years contributions to this field have improved substantially, allowing for the processing of huge amounts of textual information with an acceptable level of efficacy. An example of this is the application of these techniques as an essential component in web search engines, in automated translation tools or in summary generators [Baeza-Yates, 2004].
This article reviews the main characteristics of natural language processing techniques, focusing on their application in information retrieval and related topics. Specifically, the second section studies the different problems in automatic natural language processing; the third describes the key NLP methodologies applied in information retrieval; and the fourth outlines several fields of research related to information retrieval and natural language processing. Finally, we present the conclusions and an annexe (Annexe 1) showing some particular aspects of NLP in Spanish.
Copyright

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0).