Detección automática de tweets no relevantes en streams guiados por consulta

García Pérez, Víctor

dc.contributor.author	García Pérez, Víctor
dc.contributor.other	Berlanga Llavori, Rafael
dc.contributor.other	Universitat Jaume I. Departament de Llenguatges i Sistemes Informàtics
dc.date.accessioned	2019-01-29T10:56:34Z
dc.date.available	2019-01-29T10:56:34Z
dc.date.issued	2018-10
dc.identifier.uri	http://hdl.handle.net/10234/180348
dc.description	Master's thesis for the Màster Universitari en Sistemes Intel·ligents (2013 curriculum). Code: SIE043. Academic year 2017-2018	ca_CA
dc.description.abstract	Since the early 1990s, when social networks emerged, the number of users and the amount of information shared and published on them have grown exponentially. This work focuses on the social network Twitter, which had 330 million users at the beginning of 2018. The goal is to predict which of the tweets retrieved through a domain query are relevant and which are irrelevant for a subsequent analysis phase. To this end, a literature review was first carried out to establish the state of the art on similar topics. Second, a semi-manual method was developed to label the dataset, assigning each tweet to one of two classes: relevant or irrelevant. A statistical analysis of the data was then carried out to find a suitable automatic classification method according to the selected evaluation metrics. All experiments were carried out with the data mining and text processing libraries available for Python.	ca_CA
dc.description.abstract	Desde principio de los años 90 cuando surgieron las redes sociales, el número de usuarios y la cantidad de información compartida y publicada en ellas ha experimentado un crecimiento exponencial. En este trabajo nos centraremos en la red social Twitter, que contaba a principios de 2018 con 330 millones de usuarios. El objetivo de este trabajo es conseguir predecir cuáles de todos los tweets recogidos a través de una consulta de dominio son relevantes o irrelevantes para una fase de análisis posterior. Para ello, en primer lugar, se ha realizado un barrido bibliográfico para consultar el estado del arte en temas similares. En segundo lugar, se ha elaborado un método semi-manual para realizar el etiquetado del dataset donde se han identificado los tweets en función de la clase a la que pertenecen, relevantes o irrelevantes. Después se ha realizado un análisis estadístico de los datos para buscar un método de clasificación adecuado según las métricas de evaluación seleccionadas. Todos los experimentos han sido realizados con la ayuda de las librerías de minería de datos y tratamiento de texto disponibles para Python.	ca_CA
dc.format.extent	55 p.	ca_CA
dc.format.mimetype	application/pdf	ca_CA
dc.language.iso	spa	ca_CA
dc.publisher	Universitat Jaume I	ca_CA
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Màster Universitari en Sistemes Intel·ligents	ca_CA
dc.subject	Máster Universitario en Sistemas Inteligentes	ca_CA
dc.subject	Master's Degree in Intelligent Systems	ca_CA
dc.subject	social network analysis	ca_CA
dc.subject	automatic classification	ca_CA
dc.subject	text mining	ca_CA
dc.title	Detección automática de tweets no relevantes en streams guiados por consulta	ca_CA
dc.type	info:eu-repo/semantics/masterThesis	ca_CA
dc.educationLevel	Postgraduate studies	ca_CA
dc.rights.accessRights	info:eu-repo/semantics/openAccess	ca_CA

Files in this item

Name: license_rdf
Size: 1.194Kb
Format: application/rdf+xml

Name: Memoria_TFM_VictorGarciaPerez_.pdf
Size: 1.803Mb
Format: PDF

This item appears in the following collection(s)

TFM: Màster Universitari en Sistemes Intel·ligents [94]
SIU043

Except where otherwise noted, this item's license is described as: Attribution-NonCommercial-NoDerivatives 4.0 International