Show simple item record

dc.contributor.author: Thapa, Mahesh
dc.contributor.other: Ramos Romero, Francisco
dc.contributor.other: Universitat Jaume I. Departament de Llenguatges i Sistemes Informàtics
dc.date.accessioned: 2019-12-03T10:36:12Z
dc.date.available: 2019-12-03T10:36:12Z
dc.date.issued: 2018-03
dc.identifier.uri: http://hdl.handle.net/10234/185292
dc.description: Final Master's Thesis, Erasmus Mundus University Master's Degree in Geospatial Technologies. Code: SIW013. Academic year: 2018/2019 [ca_CA]
dc.description.abstract: Unstructured textual data is one of the most dominant forms of communication. Especially after the adoption of Web 2.0, there has been a massive surge in the rate at which unstructured textual data is generated. While more information intuitively supports better decision-making, the sheer volume makes it virtually impossible to manually process, discover and extract useful information from textual data. Several supervised and unsupervised text mining techniques have been developed to classify, cluster and extract information from texts. While text data mining provides insight into the contents of texts, these techniques do not provide insight into their location component. In simple terms, text data mining addresses “What is the text about?” but fails to answer “Where is the text about?” Since a large share of textual data has geographic content (estimated at about 80%), it can reasonably be argued that answering “Where is the text about?” adds significant insight about the texts. In this study, a collection of news articles from the year 2017 was analyzed using topic modelling, an unsupervised text mining technique. Topics were discovered from the text collection using the Latent Dirichlet Allocation (LDA) method, a popular topic modelling technique. A topic is a probability distribution over words that corresponds to one of the concepts covered in the text. Spatial locations were extracted from the text documents by geoparsing them. Topics were geovisualized as interactive maps according to the probability with which each location word contributed to the corresponding topic. This is analogous to thematic mapping in Geographic Information Systems: coordinates obtained from geoparsed words provide the basis for georeferencing the topics, while the probability of each location word within a particular topic provides the attribute value for thematic mapping.
An interactive geovisualization of choropleth maps at the country level was constructed using the Leaflet visualization library. A comparative analysis between the maps and the corresponding topics was made to see whether the maps provided spatial context to the topics. [ca_CA]
dc.format.extent: 61 p. [ca_CA]
dc.format.mimetype: application/pdf [ca_CA]
dc.language.iso: eng [ca_CA]
dc.publisher: Universitat Jaume I [ca_CA]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Màster Universitari Erasmus Mundus en Tecnologia Geoespacial [ca_CA]
dc.subject: Erasmus Mundus University Master's Degree in Geospatial Technologies [ca_CA]
dc.subject: Máster Universitario Erasmus Mundus en Tecnología Geoespacial [ca_CA]
dc.subject: text mining [ca_CA]
dc.subject: topic modelling [ca_CA]
dc.subject: geoparsing [ca_CA]
dc.subject: natural language processing [ca_CA]
dc.subject: geovisualization [ca_CA]
dc.subject: spatial context [ca_CA]
dc.title: Transforming texts to maps: Geovisualizing topics in texts [ca_CA]
dc.type: info:eu-repo/semantics/masterThesis [ca_CA]
dc.educationLevel: Postgraduate studies [ca_CA]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [ca_CA]
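
The abstract describes using the probability of location words within a topic as the attribute value for a country-level choropleth map. The following is a minimal Python sketch of that aggregation step only, not the thesis's actual implementation. The gazetteer, the function name `country_weights`, the sample probabilities, and the choice to sum probabilities per country are all assumptions made for illustration; a real pipeline would obtain the word probabilities from a fitted LDA model and resolve place names with a geoparser.

```python
from collections import defaultdict

# Hypothetical gazetteer: location word -> ISO 3166-1 alpha-3 country code.
# A real geoparser would resolve place names to coordinates instead.
GAZETTEER = {
    "paris": "FRA",
    "france": "FRA",
    "kathmandu": "NPL",
    "nepal": "NPL",
}


def country_weights(topic_word_probs, gazetteer=GAZETTEER):
    """Sum the probabilities of a topic's location words per country.

    Summation is an assumed aggregation rule, chosen here for simplicity.
    Words that are not in the gazetteer are ignored.
    """
    weights = defaultdict(float)
    for word, prob in topic_word_probs.items():
        code = gazetteer.get(word.lower())
        if code is not None:
            weights[code] += prob
    return dict(weights)


# Made-up word probabilities for a single topic (illustration only).
topic = {
    "earthquake": 0.05,
    "nepal": 0.04,
    "kathmandu": 0.02,
    "aid": 0.03,
    "france": 0.01,
}

weights = country_weights(topic)
print(weights)
```

Each resulting country weight could then be joined to country polygons and shaded in a choropleth layer, one map per topic.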


Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International