Transforming texts to maps: Geovisualizing topics in texts
Metadata
Title
Transforming texts to maps: Geovisualizing topics in texts
Author
Tutor/Supervisor; University. Department
Ramos Romero, Francisco; Universitat Jaume I. Departament de Llenguatges i Sistemes Informàtics
Publication date
2018-03
Publisher
Universitat Jaume I
Abstract
Unstructured textual data is one of the most dominant forms of communication.
Especially after the adoption of Web 2.0, there has been a massive surge in the rate of
generation of unstructured textual data. While a large amount of information is
intuitively better for proper decision-making, it also means that it becomes virtually
impossible to manually process, discover and extract useful information from textual
data. Several supervised and unsupervised techniques in text mining have been
developed to classify, cluster and extract information from texts. While text data
mining provides insight into the contents of the texts, these techniques do not provide
insight into the location component of the texts. In simple terms, text data mining
addresses “What is the text about?” but fails to answer “Where is the text about?”
Since textual data contain a large amount of geographic content (estimated at about 80%),
it can reasonably be argued that answering “Where is the text about?” adds significant
insight about the texts. In this study, a collection of news articles from the year 2017
was analyzed using topic modelling, an unsupervised text mining technique. Topics
were discovered from the text collection using the Latent Dirichlet Allocation method, a
popular topic modelling technique. Each topic is a probability distribution over words that
corresponds to one of the concepts covered in the text. Spatial locations were extracted
from text documents by geoparsing them. Topics were geovisualized as interactive
maps according to the probability of each spatial location word that contributed to
the corresponding topic. This is analogous to thematic mapping in a Geographical
Information System. Coordinates obtained from geoparsed words provide the basis for
georeferencing the topics, while the probability of such location words corresponding
to the particular topics provides the attribute value for thematic mapping. An interactive
geovisualization of choropleth maps at the country level was constructed using the
Leaflet visualization library. A comparative analysis between the maps and
corresponding topics was made to see if the maps provided spatial context to the topics.
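The thematic-mapping step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy topic probabilities, the gazetteer lookup that stands in for a geoparser, and the choice to sum location-word probabilities per country are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy topic: word -> probability within the topic,
# in the shape an LDA model would produce (values are made up).
TOPIC_WORD_PROBS = {
    "election": 0.30,
    "vote": 0.25,
    "paris": 0.20,
    "france": 0.15,
    "berlin": 0.10,
}

# Hypothetical gazetteer standing in for a geoparser:
# place name -> ISO 3166-1 alpha-2 country code.
GAZETTEER = {
    "paris": "FR",
    "france": "FR",
    "berlin": "DE",
}


def country_scores(topic_word_probs, gazetteer):
    """Sum the topic probability of every location word, grouped by country.

    Summation is an assumed aggregation rule; the result is the attribute
    value a country-level choropleth map would display for this topic.
    """
    scores = defaultdict(float)
    for word, prob in topic_word_probs.items():
        country = gazetteer.get(word.lower())
        if country is not None:  # words that are not places are skipped
            scores[country] += prob
    return dict(scores)


scores = country_scores(TOPIC_WORD_PROBS, GAZETTEER)
```

Each key in `scores` is a country code and each value is the share of the topic's probability mass carried by place names in that country, which is the kind of attribute a choropleth layer would be shaded by.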
Keywords / Subjects
Erasmus Mundus University Master's Degree in Geospatial Technologies | text mining | topic modelling | geoparsing | natural language processing | geovisualization | spatial context
Description
Master's thesis for the Erasmus Mundus University Master's Degree in Geospatial Technologies. Code: SIW013. Academic year: 2018/2019
Document type
info:eu-repo/semantics/masterThesis
Access rights
info:eu-repo/semantics/openAccess