INTERNET OF THINGS DATA VISUALIZATION FOR BUSINESS INTELLIGENCE

This study contributes to the research on Internet of Things data visualization for business intelligence processes, an area of growing interest to scholars, by conducting a systematic review of the literature. A total of 237 articles published over the past 11 years were obtained and compared. This made it possible to identify the top contributing and most influential authors, countries, publishers, institutions, papers and research findings, together with the challenges facing current research. Based on these results, this work provides a thorough insight into the field by proposing four research categories (Technology infrastructure, Case examples, Final-user experience, and Big Data tools), together with the development of these research streams over time and their future research directions.


INTRODUCTION
Business Intelligence (BI) is a kind of information system for gathering, manipulating, storing and analysing raw data and transforming it into useful information for managers, enabling them to make better and faster decisions and discover new business prospects (Al-Eisawi et al., 2020; Liang and Liu, 2018). BI systems complement corporate operational information systems such as Enterprise Resource Planning (ERP), Supply Chain Management (SCM) and Customer Relationship Management (CRM), and offer organizations great potential to improve organizational efficiency (Wang and Byrd, 2017).
A BI system is composed of three components (Laursen and Thorlund, 2010): a technological component, which includes a broad range of analytical software for diverse organizational provisions (Loon, 2019); a human component, consisting of system developers and managers with a high level of analytical skills; and the business process, which underlies the transformation of information into knowledge. The evolution of these three components has allowed three generations of BI systems to be established.
The first generation of BI appeared in the 1990s. It was characterized by highly formatted reports developed by IT personnel using proprietary BI tools embedded into companies' desktop or client/server applications, and communicated by proprietary application programming interfaces (APIs). The second generation of BI started in the 2000s. Using data warehouses and dashboard-building tools, analysts and business users were able to access large amounts of structured and unstructured data and could create intuitive drag-and-drop reports by themselves (Innovative enterprise, 2020). Data were usually extracted from corporate operational computer systems and then transformed and loaded into an enterprise data warehouse and centralized metadata repository, using extraction, transformation, migration and loading (ETML) tools, such as Apache Spark. The next step was to cluster them in Data Marts. On-line analytical processing (OLAP) tools, such as OBIEE (Oracle Business Intelligence Enterprise Edition) and IBM Cognos, were used to calculate and display indicators in user interfaces such as dashboards and spreadsheets (Chalmeta and Grangel, 2005). Today, the third generation of BI is characterized by (1) organizations' awareness of the possibilities of BI to create a competitive advantage and therefore of the need to develop a Data-Driven Business Strategy where data are used to create more value, to keep costs down, to drive additional sales, to engage customers more fully, and to improve process efficiency; and (2) the exponentially increasing volume of multi-structured data sets that companies have to analyse in an efficient manner, Internet of Things (IoT) sensors and web 2.0 tools being the major contributors (Patel and Sharma, 2020).
An integral part of today's generation of BI systems is IoT data visualization for business analytics processes. IoT provides massive amounts of data emitted from multiple connected sensors that, once gathered, analysed and displayed, can be used by managers to support their decision-making. Data visualization is essential in this process, since it allows managers to gain a proper understanding of the underlying patterns and results obtained by analysis algorithms (Lavalle, 2019). Data visualization in BI must consider the business objectives and the evolving needs of users, taking into account high-level semantics, reasoning about unstructured and structured data, and providing simplified access and a better understanding of data (Aufaure, 2013).
However, although IoT data visualization enhances the productivity and efficiency of the business, different authors have shown that numerous challenges and factors affect the use of IoT data visualization for business intelligence processes and must be identified and studied (Kumar, 2020). These include the volume and variety of data, the heterogeneity of devices, the need for scalable and efficient storage infrastructures, data security and privacy, the reliability and validity of marketing segmentation, inexperienced users who do not know what type of information they want to extract from the data or which type of visualization would be best, or who misinterpret the results (Gray et al., 2017), and financial management issues.
To date, no literature review has examined IoT data visualization for BI processes. To address this gap, this paper synthesizes the body of knowledge on IoT data visualization for BI processes and establishes research categories that bring together research conducted on the basis of relevant common points. In particular, this study poses the following research questions (RQs):
RQ1: Which are the top contributing authors, countries, papers, institutions and publishers in the field of IoT data visualization for BI processes?
RQ2: Is it possible to define research categories on the basis of relevant common points?
RQ3: What are the future research necessities in the field of IoT data visualization for BI processes?
To answer the above research questions, this paper (1) carries out a systematic review of the literature on IoT data visualization for BI processes, since it is an efficient research method that allows a precise evaluation of the information published to date (Levy and Ellis, 2006); (2) provides a thorough insight into the field by using bibliometric analysis techniques to evaluate 237 published articles, and to identify top contributing authors, papers, countries, publishers and institutions related to the field; (3) identifies and proposes four established and emerging research categories that would encourage scholars to expand research on IoT data visualization for BI processes; and (4) identifies the future research necessities in each research category. This paper is organized as follows: Section 2 describes the software tools and the research methodology used to perform the bibliographical analyses. Section 3 offers the findings of the bibliographical analyses. Finally, Section 4 discusses the findings as well as the conclusions, research limitations and future work.

RESEARCH METHODOLOGY
To answer the above-mentioned research questions, a semi-structured literature review was selected because it makes it possible to understand the state of knowledge, to identify the historical evolution and to develop a research agenda (Snyder, 2019).
To provide insights into the main topics, the machine learning algorithm Latent Dirichlet Allocation (LDA) was applied for topic modelling instead of other topic modelling methods such as Latent Semantic Analysis (LSA) or Probabilistic Latent Semantic Analysis (PLSA). LDA was selected because it is the simplest and most popular of these methods (Wallach, 2006), it can be applied to different kinds of problems, and it can be used to quickly identify thematic clusters in large documents (Maier et al., 2018). LDA works by assuming that each document is a probability distribution over topics and each topic is a probability distribution over words. The idea is that documents are "represented as random mixtures over latent topics, where each topic is characterized by a distribution over words" (Blei, 2003). LDA is based on three concepts: the corpus (the text collection), the document (one item within the corpus), and the terms (the words within a document). The aim of the LDA algorithm is then to infer topics from recurring patterns of word occurrence in the documents. Topics are heuristically located on an intermediate level between the corpus and the documents and can be imagined as content-related categories, or clusters (Pirola et al., 2020).
Five steps were carried out to apply LDA (figure 1). The first concerned the data sources, which in this case were Scopus and Web of Science. These two databases are the main sources of bibliographic citations used for bibliometric analyses, mainly because they are the only ones that combine a rigorous selection process with wide interdisciplinary coverage, which makes them significantly stronger than the other databases (Martínez-López et al., 2018). Articles and reviews were selected by identifying those that had certain keywords in the title, in the keywords section or in the abstract (Table 1). The keywords used were organized in three sets: IoT, Data visualization and BI, with all the possible synonyms and terms related to them. The search was conducted in October 2020 and the results were limited to papers in English published in journals and at conferences from 2009 onwards. The fields of research were also limited to the following areas: Business, Management and Accounting; Computer Science; Economics, Econometrics and Finance; and Engineering.
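The way the three keyword sets are combined (synonyms joined by OR within a set, the sets joined by AND) can be illustrated with a minimal sketch. The function names are hypothetical and the keyword lists below are abbreviated for illustration; the complete lists are those given in Table 1.

```python
def or_group(terms):
    """Quote each term, join the terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*keyword_sets):
    """OR within each keyword set, AND across the sets."""
    return " AND ".join(or_group(s) for s in keyword_sets)


# Abbreviated keyword sets (illustrative only; see Table 1 for the full lists).
iot_terms = ["IoT", "Internet of Things", "Industry 4.0"]
vis_terms = ["Data visualization", "information visualization"]
bi_terms = ["Business Intelligence", "Business Analytics", "decision making"]

query = build_query(iot_terms, vis_terms, bi_terms)
print(query)
```

A paper matches such a query only if its title, keywords or abstract contains at least one term from each of the three sets.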

Keywords Content Period Document Category
Scopus: ("IoT" OR "Internet-of-Things" OR "Internet of Things" OR "Internet of Everything" OR "Industrial Internet" OR "smart product" OR "smart object" OR "IIoT" OR "Industrial Internet of Things" OR "Industrial IoT" OR "I4.0" OR "I 4.0" OR "Industry 4.0" OR "Fourth Industrial Revolution") AND ("data analysis and visualization" OR "Data visualization" OR "information visualization" OR "Visualization Technique" OR "visual exploration of patterns" OR "graphic representation" OR "visual depiction of data" OR "graphic portrayal") AND ("BI" OR "Business Intelligence" OR "Business Analytics" OR "competitive intelligence" OR "BI&A" OR "enterprise strategy" OR "company strategy" OR "firm strategy" OR "managers" OR "decision making" OR "performance systems" OR "balanced scorecard" OR "enterprise knowledge" OR "company knowledge" OR "firm knowledge")

To generate meaningful topic models, the data in the corpus were cleaned. Different methods were used to do so: removing punctuation and the so-called stop words, tokenization, making bigrams and trigrams, and lemmatizing to convert words to their meaningful base form.
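The cleaning steps listed above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the pipeline actually used in this study: the stop-word list is a tiny hypothetical sample, `naive_lemma` is a crude placeholder for a real lemmatizer, and the bigram rule (merge adjacent pairs seen at least `min_count` times) is a simplification of statistical phrase detection.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; a real run would use a fuller list).
STOP_WORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "is", "are", "on"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only, which drops punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_lemma(token):
    """Placeholder for a real lemmatizer: only strips a simple plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith(("ss", "is", "us")):
        return token[:-1]
    return token


def merge_bigrams(docs, min_count=2):
    """Join adjacent token pairs that occur at least min_count times in the corpus."""
    pair_counts = Counter(p for doc in docs for p in zip(doc, doc[1:]))
    frequent = {p for p, c in pair_counts.items() if c >= min_count}
    merged_docs = []
    for doc in docs:
        out, i = [], 0
        while i < len(doc):
            if i + 1 < len(doc) and (doc[i], doc[i + 1]) in frequent:
                out.append(doc[i] + "_" + doc[i + 1])
                i += 2
            else:
                out.append(doc[i])
                i += 1
        merged_docs.append(out)
    return merged_docs


def preprocess(raw_docs):
    """Tokenize, remove stop words, lemmatize (naively) and merge frequent bigrams."""
    docs = [[naive_lemma(t) for t in remove_stop_words(tokenize(d))] for d in raw_docs]
    return merge_bigrams(docs)


sample = [
    "Data visualization for the Internet of Things.",
    "Dashboards support data visualization in business intelligence.",
]
cleaned = preprocess(sample)
print(cleaned)
```

Applying `merge_bigrams` a second time to its own output would let an already merged pair join a neighbouring token, which is one simple way to obtain three-word phrases.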

Model selection.
Once the corpus had been cleaned, the LDA algorithm developed by Blei and Hoffman (2010) was implemented for topic modelling. This algorithm is based on on-line stochastic optimization, and learns the Dirichlet hyperparameter α (document-topic density) directly from the data. The Dirichlet hyperparameter β (word-topic density) was fixed at the value of 1/K, as proposed by the topic model library that was used: gensim (Rehurek and Sojka, 2010). The number of topics K was set taking into consideration the topic coherence metric.
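To make the notion of topic coherence concrete, the following sketch computes one common variant, the UMass coherence, from document co-occurrence counts. Which coherence variant was used in this study is not stated, so the choice of UMass here is an assumption made purely for illustration, and the function names are hypothetical. For a topic with top words w_1, ..., w_M, the score sums log((D(w_m, w_l) + 1) / D(w_l)) over all pairs with l < m, where D counts the documents containing the given word(s).

```python
import math


def umass_coherence(top_words, docs):
    """UMass coherence of one topic; docs is a list of token lists."""
    doc_sets = [set(d) for d in docs]

    def doc_freq(*words):
        # Number of documents that contain all the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # skip words that never occur in the corpus
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / d_l)
    return score


def mean_coherence(topics, docs):
    """Average coherence over all topics of a model (each topic is a list of top words)."""
    return sum(umass_coherence(t, docs) for t in topics) / len(topics)


# Toy corpus, made up for demonstration only.
toy_docs = [
    ["iot", "sensor", "cloud"],
    ["iot", "sensor"],
    ["dashboard", "chart"],
    ["dashboard", "chart", "user"],
]
coherent = umass_coherence(["iot", "sensor"], toy_docs)
mixed = umass_coherence(["iot", "chart"], toy_docs)
print(coherent > mixed)
```

Words that tend to appear in the same documents yield a higher (less negative) score, which is why a model whose topics score higher on average is preferred when choosing K.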

Topic validity and labelling.
To ensure the validity and interpretability of the results, a content analysis was carried out independently by the authors with the objective of checking whether the topics depict relevant research issues and if a topic should be discarded or whether two or more topics should be merged.Once the topics had been validated, a label was assigned to each of them.

Presentation of findings.
After the identification, validation and labelling of topics, the analysis tools available at Web of Science and Scopus were used to perform a bibliometric analysis using descriptive statistics. Finally, a content analysis of the papers belonging to each of the topics was carried out to identify the main research conducted, the main conclusions, and the future research directions in all the research categories.

Initial Results
The initial search was conducted in October 2020 and resulted in 230 journal papers and 194 conference papers. The initial set of papers was filtered further by reading the titles and abstracts to check coherence with the research questions and eliminate duplications. This filtering led to the final set of 126 journal papers and 111 conference papers, giving a total of 237 papers.
Figure 2 shows the evolution in the number of publications. It has increased considerably in the last six years, with a marked surge in 2019 and 2020.
Figure 2. Trend in the generation of articles
A bibliometric analysis was carried out on these 237 papers. It made it possible to identify the top contributing authors, countries, papers, institutions and publishers in the field of IoT data visualization for BI processes (see appendix). The findings show that (1) there is a small number of experts in the field; (2) the distribution by countries reveals the leadership of China, followed by the United States, India and Germany; (3) the most cited papers are focused on Big Data; (4) no institution stands out significantly in terms of the number of publications; and (5) three sources stand out over the others: ACM International Conference Proceeding Series with 14 publications, followed by Advances in Intelligent Systems and Computing and Procedia CIRP, both with 11 publications.

Research categories
Eight LDA models were calculated, the number of topics K varying between 3 and 10. From these, the model with 6 topics was selected because it was the one with the highest coherence value. Then, the validity and the interpretability of the topics were assessed and, as agreed by the authors of this paper, two pairs of topics were each merged into a single topic, since the papers in each pair discussed similar issues. The four resulting topics are:
Topic 0, Technology infrastructure. This focuses on the technology architecture/infrastructure of the system and proposes available analysis methods, tools and technologies.
Topic 1, Case examples. This focuses on analysing and proposing solutions (ad-hoc or general) for real problems in different areas of enterprise/industry.
Topic 2, Final-user experience. This topic focuses on the final-user experience.
Topic 3, Big Data tools. This topic focuses on how Big Data tools/methods can be used in this field. Big Data appears to be important enough in this field to require a topic of its own.
Table 2 reports the four validated topics identified along with the top 10 words and the number of papers.
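The model selection described above (one model per value of K from 3 to 10, keeping the one with the highest coherence) amounts to a simple search, sketched below. `score_for_k` stands for "train an LDA model with k topics and return its coherence"; it is a hypothetical callable, and the dummy scoring function is made up solely so that the sketch runs and does not reproduce the coherence values obtained in this study.

```python
def select_num_topics(score_for_k, k_values=range(3, 11)):
    """Return (best_k, scores), where best_k maximises the coherence score."""
    scores = {k: score_for_k(k) for k in k_values}
    best_k = max(scores, key=scores.get)
    return best_k, scores


def dummy_score(k):
    """Made-up stand-in for model training and scoring; peaks at k = 6 by construction."""
    return -abs(k - 6)


best_k, scores = select_num_topics(dummy_score)
print(best_k, len(scores))
```

The automatic choice only fixes the starting point: the subsequent merging of topics remains a manual judgement made by the authors during the content analysis.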

State of the art in each category
In the following, the main results achieved in each category are shown (figure 3). Considering the 53 papers on the topic "Technology infrastructure", which are primarily focused on the transformation towards Industry 4.0 by means of Information Technology, two major research lines can be inferred: one involves papers that use modern technologies to manage energy consumption and to achieve more sophisticated monitoring methods, and the other comprises papers that seek to enhance optimization and productivity in businesses.
In the first research line, several papers paid special attention to monitoring buildings and construction sites. Song et al. (2018) developed an IoT platform that is able to monitor and refine the air conditioning operation habits in a house. In their development, they employed a deep belief network algorithm to analyse the data and recognize consumption patterns, a cloud server to collect the data, and visualization techniques to display power consumption. This resulted in a better user experience and a reduction in energy consumption. Zach et al. (2015) presented a scalable approach to monitoring buildings and processing the data by means of data pre-processing algorithms and virtual data points. A strength of their work is that the software interfaces are independent of the hardware and can be supported simultaneously, which allows batch processing for various applications. Another interesting study was that carried out by Jin et al. (2020), who attempted to eliminate some of the problems on a construction site, including the detection and location of errors and the identification of intruders, by developing an intrusion monitoring system based on IoT. To develop their system, they took advantage of five key components, including radio-frequency identification (RFID) triggers, safety hardhats, a backend cloud server that would be contacted by a smartphone application, and a web-based management platform. Another monitoring system, proposed by Zhang et al. (2019), was capable of recognizing non-hardhat use, since wearing a hardhat is considered a crucial safety measure on construction sites. In their IoT-based proposal, they took advantage of sensors, RFID triggers, smartphones, a web-based application for data visualization, and a cloud server that would be used for storing and retrieving the collected data. In terms of consumption management, Assad et al. (2019) introduced a framework, based on the use and incorporation of virtual models, to predict the key performance indicators for energy consumption in manufacturing systems before they are set up. To this end, they used the VueOne virtual engineering (VE) tool with the aim of achieving the most energy-efficient production systems. Donnal et al. (2016) argued that, in order to monitor, measure and control power consumption, high-performance low-cost computers should be employed to make local analysis possible. They proposed the construction of an energy box around these computers as a non-intrusive load monitor (NILM) in order to provide the end-user with visualization of power consumption and the generation of custom reports by means of an integrated scripting engine.
Other papers in the same research line focused their state-of-the-art monitoring approaches on other areas. Gao et al. (2018), Kurniawan et al. (2018) and Pachayappan et al. (2020) proposed IoT-based management systems for farmlands that used multiple sensors to gather monitoring data on weather, soil and plant conditions, such as soil humidity, pH and the wetness of leaves, with the aim of ensuring a satisfactory planting environment, improving the economic return and achieving better control over the state of the planted crops in real time. The studies by Gao et al. (2018) and Kurniawan et al. (2018) were published in the same year and were broadly similar, differing mainly in the hardware they used: the former employed RFID and ZigBee, and the latter implemented Wemos D1 mini and Arduino. Another paper by Alam et al. (2017) presented an integration of the IoT with augmented and virtual reality technologies in order to monitor and maintain the system and ensure the safety of the personnel in an extreme environment. Their emphasis was on using mobile computing equipment on the workers' side and processing real-time data. While the last four papers mentioned aimed to deal with real-time data, Tan et al. (2017) managed to perform data mining and to analyse and visualize historical data on the quality of the manufactured products in order to discover and monitor the performance trend of the business. Additionally, Benedetto et al. (2020) attempted to solve some consumption management problems by deploying sensors on the production line.
The second research line is mostly concerned with enhancing optimization and productivity in businesses. In an attempt to optimize various areas of smart sustainable cities such as freight logistics and citizens' transportation, Beneicke et al. (2020) suggested using analytical tools, namely hybrid simulation-optimization and machine learning algorithms, to analyse the data in order to enhance citizens' insights and cognition. Apart from that, in order to optimize production and productivity in the manufacturing lines of factories, Jinushi et al. (2016) suggested a comprehensive system based on IoT that covers the collection, aggregation and visualization of data, with the intention of lessening the workers' responsibility and supporting decision-making. Yu et al. (2018) exploited information technology to propose a BIM-based smart management model for a construction site in order to increase productivity and efficiency. They deployed IoT, cloud computing and Big Data analytics to attend to the issues related to data analysis and storage, augmented and virtual reality, digital processing to preprocess the components before transporting them to the construction site, and three-dimensional scanning to detect probable errors in measurements. To make optimal decisions, Goti-Elordi et al. (2017) proposed the application of a Business Intelligence tool to manage Big Data in the food industry. Finally, in order to have a productive business, the discover-innovate-predict-perform-sustain (DIPPS) model based on the analysis of data was introduced by Rane and Mishra (2018).
In addition to Topic 1 papers, Table B1 in appendix B also includes proposals from papers that belong to other topics and contain some kind of application example. These proposals include the topic number of their paper in brackets. They have been included in Table B1 so that the table contains all the real applications of IoT data visualization for BI processes that have been carried out so far, gathering this information in one place for practitioners.
The second most addressed topic in the pool of papers is "Final-user experience", with 61 papers, which are primarily focused on data visualization techniques and their application.These papers addressed three noticeable categories: 1) those that made an effort to facilitate human-computer interactions through visualization and simulation and to improve user experience, 2) those intended to assist users in management and monitoring by carrying out intelligence in buildings, factories, health, etc., and 3) those that attempted to assist managers with intelligent decision-making.
Within the first research line, there are several examples of papers that are mainly focused on human-computer interactions using visualization methods. Yun et al. (2020) attempted to support decision-makers in industry by proposing a novel visual human-computer interaction decision-making system based on data mining techniques. In order to evaluate the performance of the proposed method, the authors applied various data mining algorithms and assigned different values to the key parameters, and then showed that their method was robust and effective. Additionally, Shao et al. (2019) introduced an IoT-Avatar architectural framework based on mixed reality for human-computer interactions in an IoT system. Their system was adaptive, flexible and engaging and also included a method for two-way communication between the IoT system and the representation of the virtual avatar character to deal with the bandwidth deficiency within IoT system communication. Moreover, Pfeffer et al. (2015) advocated using interactive surfaces as a means to facilitate collaborative work, as well as virtual and augmented reality to help in problem-solving by representing a product virtually. They compared possible future technologies deployable in manufacturing plants with the current methods to demonstrate how plant control is going to change. They argued that the integration of data collected from various steps in a product life cycle may help the internal and external stakeholders share the information in a collaborative manner. Finally, Rubart et al. (2017) presented an interactive BI digital boardroom that would enhance user experience and promote the level of interaction between analysts and planners. They made use of a multi-display environment along with multi-touch and multi-user interaction approaches to display data visualizations.
The second research line pivots on management and monitoring by taking advantage of visualization methods in various business sectors. As regards smart factories, Gu and Gao (2020) proposed a visual particle system for digital twins that is capable of predicting and monitoring industrial production processes and of reducing production costs. Their system addresses the simulation limitations of current digital twin systems and can simulate more complex objects such as gases and fluids. In addition to this, in order to prioritize end-user attention in high-volume fast data streams, Abuzaid et al. (2018) [...] (2020) developed an IoT platform with the help of sensors that send the data to the cloud, where the data are then analysed via activity recognition algorithms. Finally, the results would be visualized through a web-based system and necessary alerts or notifications would be issued for doctors or a user's family. Marques and Pitarma (2020) presented an approach based on IoT to monitor the environmental noise in a building, since serious health issues can originate from noise pollution. They then visualized the collected data through web software to assist decision-makers with taking appropriate measures. Additionally, Yang et al. (2013) introduced a system called VisOSA to monitor patients suffering from a chronic disorder called Obstructive Sleep Apnea (OSA). VisOSA is a web-based application that allows patients to assess their health condition and physician staff to monitor their patients either individually or in a group. Finally, to develop a health monitoring system, Elouni et al. (2020) proposed integrating a Remote Health Monitoring System (RHMS) that uses multi-agent technology with machine learning approaches to deal with the temporal aspect of real-time health data, to extract knowledge from collected data, and to predict the patients' state.
Concerning the third research line in this topic, almost all the papers in our pool mentioned supporting decision-makers in some way, though only some of them were devoted specifically to providing support for decision-making. Ltifi et al. (2020) and Alves et al. (2020) managed to combine data mining techniques with data visualizations to enhance decision-making. Ltifi et al. (2020) claimed that valuable patterns can be extracted from data by employing Decision Support Systems (DSS) based on data mining. To move from these patterns to knowledge, they proposed a generic approach to help decision-makers take advantage of this knowledge, by using a common visual analytics process. They applied their proposal to a medical case in the Intensive Care Unit and demonstrated its feasibility. Additionally, Teong et al. (2018) aimed to discover a way to enable decision-makers not only to explore data but also to gain deeper insights from visualized data. They showed that interactive visualization can achieve this and demonstrated the effectiveness of their proposal by applying it to an airline to predict flight delays. Moreover, Hingant et al. (2018) offered an enhanced intelligence system called HYBINT that would assist decision-makers with watching over their most important instruments. They supplied their system with cyber and physical heterogeneous data, and then the output of the analysed data would be represented through visualization techniques in a single visualization space. They applied their work to a real environment and showed that it can enhance situational awareness. Finally, an interactive visual analytics approach called PlanningVis was proposed by Sun et al. (2020), which exploits production Big Data to modify and optimize production planning according to real-time data and enhance decision-making. Through their system, exploration and comparison of production plans were possible at three levels of detail, namely, plan overview, product view and production detail view. The authors demonstrated the effectiveness and usability of their proposal through two case studies.
Topic 3, Big Data tools.
The topic "Big Data tools" is one of the most addressed topics in the academic literature (32% of all papers analysed). It is possible to classify the papers into two main sub-categories: 1) papers that introduce a comprehensive framework of Big Data analysis that mostly covers from data acquisition to knowledge acquisition; and 2) papers with a more focused view that discuss the application of visualization techniques on Big Data. In both categories, the authors are mostly seeking to achieve intelligence, and sometimes they manage to integrate other technologies such as cloud computing, machine learning, augmented reality and virtual reality to overcome deficiencies like dealing with heterogeneous data, real-time data, rapid change of data and the storage of huge amounts of data.
Within the first research line, the papers mostly pivot around two business sections, namely smart factories and smart cities. As far as smart factories are concerned, Jung et al. (2019) suggested a Big Data analysis framework in which data is collected from three different stages: the distribution stage in which products are distributed, the customer usage stage in which products are used by consumers, and the A/S stage in which products are repaired by repair shops. Data is then analysed and visualized and the analysis output is subsequently handed over to the companies to assist them with improving efficiency at each stage. In addition to that, Campos et al. (2017) investigated the characteristics of data and Big Data and highlighted how manufacturers may implement Big Data analytics and technologies, such as data mining, as well as data visualization techniques in their organizations in order to convert the data into information and manage their assets. Moreover, Yu et al. (2020) [...] (2017) also designed and implemented a visual analytic system to investigate tremendous amounts of data generated in the assembly lines of factories. Their system would assist with monitoring the performance of assembly lines in real time and the investigation of historical data to uncover anomalies and deficiencies and to aid with finding the reasons for them. Moreover, an interactive visual analytics system was designed by Wu et al. (2018) to allow monitoring of the equipment in a factory in the process industry and thus avoid unplanned downtime and unnecessary routine maintenance. They deployed advanced analytical algorithms and intuitive visualization designs to provide a semi-supervised approach to monitor the condition of the equipment. Finally, in order to evaluate the behaviour of the Advanced Driver Assistance System (ADAS), Priyadarshini et al. (2019) proposed an interactive GUI to perform the analysis and visualization of the tremendous amount of data collected from sensors and vehicles in the automotive industry.
With regard to cities, Bouloukakis et al. (2019) proposed an interactive data visualization framework for smart cities to move from static IoT data visualization to interactive IoT data visualization. To accomplish this framework, they took advantage of advanced user interaction techniques and Virtual Reality and overcame difficulties with data complexity and heterogeneity. Moreover, Lock et al. (2019) developed an application that applies Augmented Reality to visualize real-time and historical Big Data on city transportation. It would help with assessing transportation performance, and the results would be beneficial for a wide range of people, namely decision-makers, city planners and citizens. Furthermore, Bornschlegl et al. (2018) introduced an approach to analyse heterogeneous car-to-cloud data through comprehensive visualization so that anomalies can be detected.
Concerning agricultural industries, Wu et al. (2018) proposed a multi-dimensional information visual analysis approach for market sales Big Data which employs a density-based clustering algorithm that would discard excessive data and keep only the effective information, aiming at providing the agricultural industries with a clear vision of market status and trends, and assisting with smart decision-making. Regarding the shop floor, Qian et al. (2019) proposed a versatile architecture to perform 3D visualization on the shop floor Big Data collected through sensors and IoT technology in real time, with the aim of showing the real-time state of the shop floor comprehensively, thus enhancing production efficiency and reducing production costs. Aside from these, Kang et al. (2018) proposed methods for visualization and spatial-temporal analysis of Big Earth data based on Keyhole Markup Language (KML) and Cesium, a JavaScript library for creating 3D virtual Earth and 2D maps in a web browser; these methods would help reveal the correlation between dimensions and periodic trends within the Big Earth data. Finally, Alves et al. (2020) combined data mining techniques with data visualizations to enhance decision-making. They argued that this combination would enable the decision-makers to interact with the system and adjust the data analysis according to their needs.

New Research Agenda
Technology infrastructure: Research on this topic involves approaches for integrating technologies and making them operate together properly (Singh et al., 2019), which also raises the need for standardization and improved interoperability (Darwish et al., 2019). For example, there is a need to find approaches to integrate IoT, Industrial Information Integration Engineering (IIIE), 5G and Blockchain (Chen, 2020). Concerning IoT, a consensus on IoT standards, as well as a clear functional view of how the different components of IoT operate together, is required (Singh et al., 2019). Additionally, within an IoT environment, interoperability between various parts such as networking, devices, syntax, semantics and platform should be enhanced through a protocol gateway service (Singh et al., 2019). In terms of big data, the deployment of programming languages such as Python and R in big data analytics should be explored (Munawar et al., 2020). Aside from these, fog computing was introduced in order to eliminate the latency caused by transferring data between the cloud and the application in latency-sensitive applications such as healthcare. However, it also brought about other requirements that need to be addressed, such as the business model, security, privacy and scalability (Fei et al., 2019).

Case examples: Although a huge amount of research has been conducted in this field, there is a need to integrate technologies like IoT, Artificial Intelligence, etc. into different business areas in order to achieve a comprehensive transition from traditional to modern, smart businesses. As far as buildings are concerned, more research should be carried out on the integration of Artificial Intelligence (AI) with Building Information Modelling (BIM) in order to make the BIM modelling process less time-consuming (Yin et al., 2019). Moreover, the use of Cloud BIM in off-site construction is required in order to enable the stakeholders to share the project data (Yin et al., 2019). Additionally, special attention must be paid to enhancing the exploitation of power consumption datasets by using machine learning algorithms that can contribute to reducing energy consumption and shifting to a more sustainable and energy-efficient environment, for example by using generative adversarial networks (GANs) to improve the quality of collected data and deep learning models to identify consumption anomalies (Himeur et al., 2020). Also, in order to monitor energy consumption, deploying more cost-effective hardware to transmit and process data is important, as is the use of IoT sensors and smart meters to help achieve data accuracy and support real-time data collection and analysis. While end-users' power consumption data can be used to extract usage patterns, privacy must be protected. As for other areas, a promising research direction is the deployment of big data by means of drones, UAVs, and satellites to assist with the prediction of natural disasters like floods, bushfires, etc. in order to take the necessary measures in advance (Munawar et al., 2020). Moreover, to monitor pollution, mobile crowdsensing can be used to collect the related big data (Zappatore et al., 2019). Finally, more emphasis should be placed on the hardware and algorithms required to employ Big Data throughout the whole lifecycle of a product (Li et al., 2015).
Final-user experience: Visualization is one of the technologies that has emerged to directly assist users in various areas and is considered a hot area of research (Eldin et al., 2020), although there are still challenges that need to be addressed. To name but a few, visualization techniques should be equipped with context awareness to be able to visualize the data according to the situation. Also, the reasons underlying the recommendations delivered by a visualization system should be transparent to the users. Additionally, more efforts should be focused on integrating virtual reality techniques into visualization systems in order to make the system interactive by adding the third dimension (Eldin et al., 2020). Aside from these, regarding energy consumption, effective visualization techniques can be employed to make end-users aware of their consumption behaviour (Himeur et al., 2020). Regarding the healthcare system, custom visualization and dashboard panels with more details would effectively help to enhance patient-physician interactions (Fedushko & Ustyianovych, 2020).
Big Data tools: Big Data technologies have facilitated business digitalization, although more efforts are necessary in many related areas. To name just a few instances, firstly, the growth of heterogeneous big data raises challenges such as the storage overhead on servers, and to overcome this challenge many scholars have suggested using distributed storage like cloud computing (Khare & Totaro, 2019; Roman et al., 2013; Zhong et al., 2016; Fei et al., 2019). Despite its benefits, distributed storage also gives rise to other challenges, such as the need to take care of data integrity, accountability, availability and authenticity (Karim et al., 2020), and above all security and privacy concerns (Roman et al., 2013). Other than that, the lack of efficient algorithms for querying big data in a cloud environment has led to scalability issues and delayed responses. The second promising research direction is to branch Big Data out into various business fields, such as the tourism industry, to address the problem of over-tourism and to increase overall productivity (Mrsic et al., 2020), and more importantly healthcare (De Mauro et al., 2019), for example by helping to design predictive systems for the early detection of diseases (Darwish et al., 2019). The third research direction concerns approaches to performing effective and thorough analysis and mining of collected data (Zhou et al., 2020), which can be achieved in various ways, such as by using the power of machine learning (Zhou et al., 2020) and deep learning along with data fusion mechanisms (Himeur et al., 2020). Finally, an increasing amount of attention has recently been paid to the big data collected through mobile crowdsensing, although more research is required to reveal the power of crowdsourced smartphone-based measurements. Furthermore, obtaining consistent and reliable results from smartphone sensors is still a major concern (Alavi & Buttlar, 2019).

CONCLUSION
Although IoT data visualization for BI processes enhances the productivity and efficiency of the business, different authors have shown that there are numerous challenges and factors that must be identified and studied in this field.
To advance in this line of knowledge, this paper has carried out a bibliographical analysis of the literature on IoT data visualization for BI processes published since 2009. A sample of 237 papers was analysed in order to identify the evolution over time of the number of articles included on the list, the evolution of the number of citations generated by these articles, the number of articles published by author, the number of articles published by country, the number of articles published by institution, the content of the 10 most cited articles on the list, the number of articles published per journal, the indicators of relevance, impact and prestige of the 10 journals with the most articles published on the list, and the established and emerging research categories on the topic as well as their future research needs.
The work presented in this paper contributes to the literature on IoT data visualization for BI processes, as it extends the existing bibliographical reviews: (1) it considers IoT, data visualization and Business Intelligence together, since to date none of the existing reviews had considered these three research areas together; (2) it extends the period of the systematic review to 2020; (3) it has a greater coverage of information sources since it uses both the Scopus and Web of Science databases jointly; (4) it identifies the main authors, countries and institutions that contribute to the field of IoT data visualization for BI processes, using statistical analysis and bibliometric analysis techniques to obtain and compare the most influential works (response to RQ1); (5) through topic modelling using LDA, it identifies and proposes four research categories: Technology infrastructure, Case examples, Final-user experience, and Big Data tools (response to RQ2); and (6) it identifies future research needs in the field of IoT data visualization for BI processes (response to RQ3).
The bibliographical analysis has confirmed the initial hypothesis that an analysis of current research could facilitate the advancement of future research in this field. The main conclusion is that the area of study requires more research and a higher number of annual publications. It is also necessary to improve the relevance of the research carried out, something that could be achieved by publishing in journals of greater impact. Finally, the number of papers published in each category is quite balanced. The category with the fewest papers is Case examples. Therefore, research conducted in this category should be improved, since it is crucial to transfer the knowledge generated by academics to practitioners, thereby allowing the advances in the other categories to be implemented in real enterprises. In this regard, it is notable that not all corporate functions that can take advantage of IoT data visualization for BI, such as marketing, purchasing, sales, etc., have case examples. The same is true of business sectors. This is an important gap in the current research on this category.
Finally, it is important to highlight the limitations of the study. This research was limited mainly by: (1) the biases introduced by studying only two bibliographical databases, Web of Science and Scopus. There was also a language bias, since these databases mostly include articles written in English and the search was conducted only in English. Other databases could be used to improve and compare the results; (2) the choice of a series of specific keywords, which introduced another bias. Other keywords could have been used and might have yielded different results; (3) the use of a bibliometric analysis based on LDA. Other methods, such as citation network analysis, might be used for such an analysis; and finally, (4) the classification of the literature into four research clusters. Other methods may result in other classifications.

Table A4. Articles with the most citations

Big Data in product lifecycle management (Li et al. 2015). Category: 1. Total citations: 257. The document reviews the different aspects of Big Data as well as product lifecycle management and answers questions about the feasibility and benefits of the application of Big Data techniques in manufacturing.

The evolution and future of manufacturing: A review (Esmaeilian et al. 2016). Category: 3. A survey of the elements in manufacturing systems and the state-of-the-art, as well as the trends in manufacturing, is presented in this document.

(Zhong et al. 2016). Category: 1. Total citations: 113. Using cloud manufacturing, the document presents a visualization approach to deal with an enormous amount of data that is collected from the RFID sensors on the shop floor.

The impact of the hybrid platform of internet of things and cloud computing on healthcare systems: opportunities, challenges, and open problems (Darwish et al. 2019). Category: 0. Total citations: 75. The document mentions that the combination of cloud computing and Internet of Things can bring about significant advantages in various areas within the healthcare system, such as smart hospitals, medicine control and remote medical services, and it reviews the current literature on this combination.

Cognitive assisted living ambient system: a survey (Li et al. 2015). The document presents a review of the information and communication technologies that are used in the field of Ambient Assisted Living in assisting elderly people with their personal and social life.

[...] engagement: Research needs (Appelbaum et al. 2017). Category: 3. The document argues that external auditors should move towards the integration of Big Data analytics in their profession, since client systems have been moving towards the technologies that lead to the production of Big Data, and then it reviews the concerns and opportunities in this regard.

A Cyber-physical System Architecture in Shop Floor for Intelligent Manufacturing (Liu & Jiang 2016). Category: 3. Total citations: 41. The document introduces a comprehensive cyber-physical system architecture for the shop floor which would assist in intelligent manufacturing. It covers configurational and operational details, from data collection and interconnection between the entities to industrial Big Data analysis through their proposed framework, leading to the acquisition of knowledge that would help intelligent decision-making.

The role of Information and Communication Technologies in healthcare: taxonomies, perspectives, and challenges (Aceto et al. 2018). Category: 0. Total citations: 40. The document studies the relation between ICT and healthcare and presents a holistic view of the application of ICT technologies in healthcare.

A5 Sources analysis
In the analysis of the sources, the weight of the three sources with the most publications must be highlighted: ACM International Conference Proceeding Series with 14 publications, followed by Advances in Intelligent Systems and Computing and Procedia CIRP, both with 11 (Table A5). Three impact indicators have been used to assess the relevance of the sources in question: CiteScore, Source Normalized Impact per Paper (SNIP), and SCImago Journal Rank (SJR). CiteScore measures the average number of citations received per document published in the journal. Values are calculated by counting the citations received in a given year by documents published in the three preceding years and dividing by the number of documents published in those three years. SNIP measures the impact of citations in context and is based on the total number of citations in a field of study: a citation carries more weight in fields where citations are less likely to occur. SJR takes into consideration the prestige of the source in which the article is published. It uses an algorithm similar to the one Google uses to rank websites, and it also takes into account the citations of the article. The indicators shown in Table A5 express the degree of impact, relevance and importance of each source.
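As an illustration of the CiteScore calculation described above, the following is a minimal sketch in Python. The function name `citescore` and the figures used in the example are hypothetical and do not come from Table A5; the sketch only follows the three-year window as defined in this section.

```python
def citescore(citations_in_year: int, documents_prior_three_years: int) -> float:
    """CiteScore as defined in this section (a sketch, not an official tool).

    citations_in_year: citations received in the calculation year by documents
        published in the three preceding years.
    documents_prior_three_years: number of documents published in those three years.
    """
    if documents_prior_three_years <= 0:
        raise ValueError("the source must have published at least one document")
    return citations_in_year / documents_prior_three_years


# Hypothetical source: 300 documents published in the three preceding years,
# which received 450 citations in the calculation year.
score = citescore(citations_in_year=450, documents_prior_three_years=300)
print(f"CiteScore: {score:.2f}")
```

With these hypothetical figures, each document published in the three-year window received 1.5 citations on average in the calculation year.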

Health Human Resources All Corporate functions
1-(T2) In order to monitor the health condition of elderly or disabled people in smart homes, Lupión et al. (2020) developed an IoT platform with the help of sensors that send the data to the cloud.The data would then be analysed by means of activity recognition algorithms and finally the results would be visualized through a web-based system and necessary alerts or notifications would be issued for doctors or the user's family.
1- Mousannif et al. (2014) introduced the steps to build Big Data projects in any organization and reviewed the possible platforms and tools to be used in each step.
2- Fedushko & Ustyianovych (2020) developed a software package for automating healthcare functions through the implementation of a complex monitoring system and a data collection mechanism. Their application contained consolidated information and interactive dashboards to see all the information that would help in decision-making.
3- Ristevski et al. (2020) indicated that software platforms should be developed to deal with healthcare Big Data in which special attention must be paid to security and privacy for all the parties, especially the patients. Additionally, this software should be able to classify the analysed data by patients, population, epidemic, clinical symptoms and country to make decision-making more effective.
[...] (2016) managed to deal with the huge amount of data obtained from a transportation system in a smart city and pointed out the challenges they faced during their work, including noisy data, diverse data formats, data modelling and increasing demand for sophisticated visualization support.

Human
2- Li et al. (2017) introduced a cloud-based platform called City Digital Pulse (CDP), which is an end-to-end architecture covering all parts of Big Data analysis that consists of five major components: collection of data through soft sensors, storing the data in a cache database and a main database, analysing the data by algorithms, visualization of the data via a web service, and finally integration of all the components into the cloud.
1-(T3) In order to produce an intelligent factory through the implementation of Industry 4.0, Shafiq et al. (2016) presented a conceptual all-inclusive framework that comprises four stages, namely, real-time data capture from sensors, PLCs, etc., data standardization and formalization, semantic analysis, and real-time visualization of key performance indicators (KPI) through a GUI-dashboard.

Telecommunications
Arora & Rani (2018) proposed a real-time data streaming method of analysis for fraud detection in the data gathered from telecommunication servers, using the Azure framework.

All
Mining Industry
1- Botes et al. (2019) identified four qualities that contribute to intelligent reporting, namely establishing a focus area, data availability, analytics and visualization, and applied them to a case study to evaluate how intelligent its current reporting was and to identify areas for improvement based on data-driven decision-making.

Manufacturing
Human Resources Risk Management

Figure 1. Research methodology

Figure 3. Main results achieved in each category

3- Komamizu et al. (2017) presented a real-time analytical system based on an OLAP system for streaming data in order to analyse the smart city real-time data.
4- Iliashenko et al. (2019) proposed an Intelligent Transport System that allowed for Big Data analysis and on-line decision-making in a transportation system.
5- Rojas et al. (2020) presented a framework called Cities-Board based on model-driven engineering to automate the development of smart city dashboards by means of a graphic domain-specific language (DSL).
6- Gupta et al. (2020) proposed a smart parking system using IoT, cloud-based services and smartphones that enables payment systems along with real-time data analytics.

Table 2. Validated topic model

Topic 1, Case examples
Topic one, Case examples, comprises all the papers that have proposed solutions (ad-hoc or general) for real problems in different business sectors and corporate areas. Different business sectors have been considered: Energy, Health, Building/construction, Smart Cities & transportation, smart factories, Telecommunications, B2C companies/Business Shopping, Geography/Environment & Agriculture, Mining Industry, as well as other proposals suitable for enterprises in any business sector. On the other hand, the corporate functions that have been studied are: Strategic Planning, Manufacturing, Facility Management, Human Resources Management and Risk Management, and there are proposals suitable for any corporate function. Table B1 in Appendix B synthesizes the Topic 1 proposals by business sector and corporate function.
Marques and Pitarma (2020) [...] called MacroBase to work with fast data that classifies and explains fast data to the end-user with the help of a combination of streaming classification and data explanation techniques and would lead to an increase in performance. Moreover, in order to monitor the organoleptic properties of the food products in an Iberian ham manufacturing company, García-Esteban et al. [...] Using cloud computing in data collection, as well as sensors and actuators, they managed to implement their proposal on a farm and decreased the number of workers.

With regard to smart buildings, some authors contributed to the transition from regular buildings to smart buildings by employing visualization along with other technologies. Fraternali et al. (2018) developed a platform in which they visualized the data gathered from smart meters and sensors to display energy consumption patterns in a way that is understandable for a wide range of users and to give them adaptive recommendations regarding energy saving while encouraging them to collaborate. In their proposal, they also took advantage of gamification methods to promote awareness and make durable behaviour changes towards sustainable energy consumption. Additionally, Kazado et al. [...] platform for the visualization of indoor environmental parameters. Furthermore, Ceccarini et al. (2020) took advantage of IoT and demonstrated a case study of a smart campus in which the heterogeneous data gathered from the sensors is visualized and contributes to a more efficient use of the campus premises. To improve current visualization techniques in a building that are deficient in interaction and immersiveness, Carneiro et al. (2019) presented an approach based on IoT by integrating augmented reality (AR) technologies and smart buildings, and provided effective interactive AR visualizations for the occupant to monitor the energy consumption and learn about the interconnection of the building system.

As regards health issues, Lupión et al. (2020) and Marques and Pitarma (2020) worked on health-related concerns in smart homes. To monitor the health condition of elderly or disabled people in smart homes, Lupión et al. [...]

[...] platform called ICatador based on cloud manufacturing which consists of four collaborative agents. In this proposal, they made use of advanced visualization techniques, communication technologies, and Artificial Intelligence techniques, mainly to facilitate quality testing for the professional taster. Furthermore, to facilitate monitoring of the condition of the machines by the engineers in the manufacturing industry, Olivotti and Eilers (2018) proposed a visualization technique that exploits sensor data, with the aim of detecting the reasons underlying anomalies, optimizing maintenance efforts, and increasing the availability of machines. Additionally, Iftikhar et al. (2020) used machine learning techniques to produce a solution for real-time analysis and dynamic visualization of the sensor and ERP data in smart manufacturing companies in order to detect possible future faults. Regarding smart farms, Bojan et al. (2016) presented a generic architecture to deal with the visualization of large-scale time-series data that can be employed in various systems, provided they work with structured data.

[...] (2019) mentioned that Building Information Modelling (BIM) is a virtual representation of a building that displays the exchange, management and communication of data about the building, but it is unable to represent real-time information related to performance. Therefore, they introduced an add-in program to integrate BIM and real-time data collected from the existing building sensor technology, within three approaches, and developed a data [...]

Luo et al. (2019) [...]sive architecture to be used in IoT-based smart factories to assist with fault detection and predictive maintenance. Several technologies such as Apache Spark, OPC Collector, transformation protocols, and encryption methods have been used to produce this manufacturing Big Data ecosystem. Unlike many other papers addressing the same subject, they paid attention to data security issues and guaranteed data security. To accomplish an intelligent factory through the [...]

Zhang et al. (2019) [...]ate and share Big Data collected from satellites and helicopters, also known as Space-Air-Ground Big Data, with the aim of accomplishing accurate and energy-efficient transportation in real time. Zhang et al. (2019) introduced a Distributed Collaborative Urban Traffic Big Data (DCUTBD) system which takes advantage of cloud computing to provide a collaborative platform to share multidimensional traffic data, software and resources among the entities, as well as visualization of complex traffic problems. The authors applied their proposal to a real city to prove its efficiency and feasibility. Furthermore, in order to achieve smart transportation management, Khan et al. (2020) presented an architecture capable of integrating heterogeneous dynamic Big Data gathered from various sources in urban transportation systems. The authors employed data mining and machine learning models and covered all the necessary steps to deal with Big Data, that is, data acquisition, storage, analysis and visualization, together with real-time monitoring and forecasting, with the aim of assisting decision-makers. Aside from these, concerning marine transportation, Soares et al. (2019) proposed an agile and comprehensive framework that acquires heterogeneous data streams from various sources such as maritime and marine sensors within IoT infrastructures, integrates and processes the data employing Semantic Web Technologies, and discovers knowledge to help the ships and reveal unusual events. Finally, Luo et al. (2019) presented a comprehensive Big Data analysis framework that covers data collection, data processing and mining, and data visualization, by integrating Big Data analytics into cyber-physical systems (CPS) to help decision-makers in various areas. They proved its practicality and versatility through two practical cases in power grids and aircraft.

The second research line is mostly concerned with the application of visualization to Big Data collected from various sources. To deal with industrial Big Data, Redondo et al. (2020) aimed to handle tremendous amounts of data in industrial companies to support the continuous monitoring of machines and extract knowledge and patterns through a visualization technique called Hybrid Unsupervised Exploratory Plots (HUEPs), which combines Exploratory Projection Pursuit (EPP) and clustering methods. They applied their proposal to a practical case in the automotive industry sector to test its ability to predict failures. Xu et al. [...]

Table A5. Ten sources with the most published articles and their impact indicators. SNIP: Source Normalized Impact per Paper. SJR: SCImago Journal Rank.

DeRegt et al. (2020) developed a Virtual Reality value chain that makes it clear whether the use of Virtual Reality technologies adds value for key stakeholders.