The link between quality management and innovation performance: a content analysis of survey-based research

Previous contributions have reported contradictory findings about the effects of quality management (QM) on innovation performance. The purpose of this paper is to critically appraise methodological issues in the literature concerning the link between QM and innovation performance, in order to reveal possible differences in the research design which may explain the heterogeneity of the findings. Through a content analysis of peer-reviewed journal articles on this topic from 1994 to 2016, the authors compare the research designs used, and identify the most prevalent practices in conducting survey-based research. In addition, cross-tabulations are used to analyse the relationships between methodological issues and type of findings. The main findings can be summarised as follows: some papers report incomplete information about methodological issues; they focus on the organisational level of analysis and see higher managers as common informants; there is a lack of research which combines surveys with other methods, as well as of longitudinal designs; and the findings obtained may be conditioned by the way QM and innovation performance were measured. The characteristics revealed in this analysis provide a platform to assist scholars in developing future stances in this and similar fields of research.


Introduction
One important area of research in quality management (QM) has been the examination of the extent to which QM practices have an impact on performance (Nair, 2006). Since innovation is a key foundation for sustainable competitive advantage, over recent years many studies have attempted to shed light on the relationship between QM and innovation performance. Some studies (Hoang, Igel, & Laosirihongthong, 2006; Martínez-Costa & Martínez-Lorente, 2008), however, show that there is no consensus about the potential effects of QM on innovation performance. In this vein, Kim, Kumar, and Kumar (2012) and Prajogo and Sohal (2001) present a comprehensive analysis of the arguments and findings in previous contributions that support both a positive and a negative relationship. Kaynak (2003) states that the failure to obtain consistent results about the QM-performance relationship could be due to significant differences in aspects of the research design, such as the operationalisation of QM (single construct vs. multidimensional construct), the kind of performance measured, or the analytical framework used, which do not always allow direct and indirect effects to be identified. Consequently, in the presence of conflicting results, and after more than 20 years of research on the subject, the purpose of this study is to perform a critical appraisal of the methodological questions that arise in survey-based studies on the relationship between QM and innovation performance. More specifically, we refer to the most recurrent issues that pose a problem for the stability and reliability of the research.
Some scholars have reported literature reviews of previous contributions on the QM-innovation relationship (e.g. Manders, de Vries, & Blind, 2016; Riillo, 2014). However, they do not focus on methodological issues, and as a result there is no structured review that analyses the research design used in past publications and could guide future research. Hence, it is important to fill this gap and to pay particularly careful attention to the research design employed in previous contributions. This becomes even more relevant considering that most of the papers on the subject use survey designs with questionnaires to collect data from organisations, which is also the most frequently used method in empirical research in the broader field of production and operations management (POM) (Rungtusanatham, Choi, Hollingworth, Wu, & Forza, 2003; Taylor & Taylor, 2009).
A review of research design issues in previous contributions is justified since some researchers (e.g. Forza, 2002; Malhotra & Grover, 1998), discussing survey research design, state that effective contributions to theory development can only be made if the survey methodology is implemented carefully. These authors suggest that it is useful to review the research practices followed by scholars in conducting their own survey-based research in order to evaluate the methodological rigour of existing survey-based research. For example, Forza (2002) highlights that many articles do not provide sufficient information on how their sample was constructed. As Flynn, Sakakibara, Schroeder, Bates, and Flynn (1990) note, researchers must become more critical readers of the empirical research done by others and also by themselves. Similarly, Taylor and Taylor (2009) remind us that 'reflecting on where we have come from and taking stock of where we currently are in a discipline arguably paves the way for the challenging task of identifying future directions and trends'. Having a wide knowledge of the methodology used in past research would allow a more careful assessment of the true effects of QM on innovation performance and a more comparable analysis of previous findings. Moreover, it would give researchers better control over, and more confidence in, the decisions to be taken in the research design, help prevent problems, and ensure the rigour of the research process when planning new studies.
Hence, the aim of this paper is twofold. The first research objective is to analyse what survey research designs, in terms of sample characteristics, survey administration and data collection issues, measurement of variables or statistical techniques, have been used to study the relationship between QM and innovation performance. The second objective is to explore whether there is any pattern in the research methodology that may influence the diversity of findings on the relationship between QM and innovation performance.
We use content analysis as a systematic technique for reviewing the literature to unveil the extent to which differences in the design of survey-based studies analysing the QM-innovation relationship might explain the heterogeneity of the findings. A content analysis of empirical studies on the QM-innovation performance relationship is warranted and contributes to the literature in several ways. First, it compiles the methodological issues in empirical research on this topic, thus generating reflection and critical analysis about the state of the art. Second, the review of suggestions and the status of the methodologies used will provide researchers with a useful guide for tackling, and ultimately enhancing, research design issues. Third, the output from the content analysis will provide a platform to interpret conflicting findings on the topic under study.
Following this introduction, a review of the controversy regarding the link between QM and innovation performance is reported. After that, the research method followed in the content analysis is presented. We then discuss the findings from the analysis. Finally, we report on some research opportunities derived from the discussion, and present some conclusions and limitations of the research.

The relationship between QM and innovation performance
The literature reports a wide variety of findings on the relationship between QM and innovation performance. As Moreno-Luzon, Gil-Marques, and Valls-Pasola (2013) state, '[while] support for a positive relationship is stronger than for a negative one, conclusive results are yet to appear'. Some papers have found a positive relationship (e.g. Hung, Lien, Fang, & McLean, 2010; Martínez-Costa & Martínez-Lorente, 2008; Prajogo & Hong, 2008). According to this stream of literature, QM could foster innovation because it nurtures a fertile environment and culture that supports innovation by enabling the efficient detection of customer needs, promoting knowledge sharing, training, commitment and participation, and the continuous improvement of work systems (Martínez-Costa & Martínez-Lorente, 2008). As Martínez-Costa and Martínez-Lorente (2008) point out, QM practices are in accordance with the aspects that Pfeifer, Siegler, and Varnhagen (1998) claim are fundamental for innovation: customer orientation, promotion of flexible organisational structures, and fostering autonomy and creativity in employees. Along the same lines, Song and Su (2015) highlight that the enablers of innovation are essentially the same elements as those that characterise QM, such as teamwork, employee involvement, and supplier participation.
However, some other scholars did not find a significant effect (e.g. Singh & Smith, 2004). These papers argue that the standardisation associated with the management of processes can lead to linear thinking and generate only incremental innovation or reduce it to the needs of current customers (Prajogo & Sohal, 2004; Singh & Smith, 2004). Some other scholars even note the possibility of a negative link between process management and innovation (e.g. Benner & Tushman, 2003). As QM promotes the reduction of process variation, and many ideas related to innovation result from variation in organisational processes (Song & Su, 2015), this reduction in variation may actually lead to a reduction in innovation.
Another stream of literature exhibited mixed results depending on the QM practices taken into account (see Table 1). The majority of these papers analyse the QM-innovation relationship considering multiple QM practices, which embrace both soft (e.g. leadership, human resource management practices) and hard elements (e.g. process management, measurement systems).
The analysis of the papers in Table 1 shows that QM practices do not all have the same effect on innovation. Some of these studies emphasise that soft QM practices are critical to achieve full innovation advantages from QM practices. In this regard, Prajogo and Sohal (2004) found that organic and mechanistic practices coexist under the umbrella of QM, but each practice plays a different role in determining performance, with leadership and people management being the ones most closely related to product innovation. In a similar vein, Song and Su (2015) found that practices associated with customer focus and human resource management promote new ideas from customers and employees, and allow employees to learn and have the necessary autonomy to use new techniques to develop new ideas.
However, other papers (e.g. Silva, Gomes, Filipe Lages, & Lopes Pereira, 2014; Zeng, Phan, & Matsui, 2015) did not report a direct relationship between soft QM practices and innovation. In contrast, they point to technical practices as being critical for product innovation, while a QM culture, teamwork, empowerment, and training are necessary as supporting practices. Hence, these authors suggest that soft QM practices are instrumental in enabling more technical QM practices to have an effect on innovation. Kim et al. (2012) reach a similar conclusion, identifying the dominant role of process management in improving innovation performance when supported by a set of interrelated soft and hard QM practices. Finally, there are some papers (e.g. Hoang et al., 2006) which report that, although not all QM practices are related to innovation, both the mechanistic and organic components of QM could support the firm's innovation. All in all, the above analysis evidences the different ways QM and innovation performance have been considered in previous research and the fact that no consensus exists regarding the practices that drive innovation performance within an organisational context defined by a QM initiative. This evidence leads to our research objective regarding the extent to which the survey research design may influence the diversity of findings.

Research methodology
In recent years, content analysis has been used by scholars to perform a systematic review of the literature in order to identify publication trends in several disciplines (Chatha, Butt, & Tariq, 2015). Inspired by the Weber (1990) protocol and taking into account the commonalities in the methodology of content analysis used in previous papers (e.g. Chatha et al., 2015; Duriau, Reger, & Pfarrer, 2007; Gallardo-Gallardo, Nijs, Dries, & Gallo, 2015), we followed the basic phases of data collection, coding, analysis of content, and interpretation of results.

Selection of articles
First, a thorough search was performed to identify papers on QM and innovation, and to compile all the relevant literature that linked the two topics. The search was conducted in four databases (ABI/Inform, Business Source Premier, Scopus, and Web of Knowledge) in May 2016, and was restricted to academic articles that mentioned the terms 'innovation' and 'quality management' (or 'TQM') in their title, abstract or keywords. Only articles in scholarly (peer-reviewed) journals were selected. The researchers read the abstracts from the selected papers, and only studies that reported empirical research on QM and innovation performance using the survey methodology were included in the analysis. We also followed other authors (e.g. Duriau et al., 2007; Shi & Yu, 2013) by checking the reference section of the papers from the search to reveal additional studies. The first study about QM and innovation using a survey methodology was published by Flynn (1994). We therefore selected this year as the starting point of the content-based analysis. The search generated 47 peer-reviewed articles in 33 different journals over the period from 1994 to 2016, which met the inclusion criteria for the content analysis.

Coding
Second, we developed a coding manual, where coding categories were defined based on the most critical issues considered in the survey research methodology, following Forza (2002) and Malhotra and Grover (1998) (see Table 2). Each category was coded according to the information provided in the papers. For instance, for the category 'Kind of relationship between QM and innovation', the following values were assigned: positive in all  (2); mixed results depending on the QM practices (3); mixed results depending on the kind of innovation performance (4); mixed results depending on the QM practices and kind of innovation performance (5); no significant relationship was found (6). Codes were refined after a pilot study to test the coding manual on a sample of 10 selected articles.
Although there is no consensus about the procedure to be used to establish coding reliability (Duriau et al., 2007), it was established through the use of multiple coders. Following the approach used by Gallardo-Gallardo et al. (2015), the 47 articles were divided between the three authors for coding. Each author summarised information about the selected categories for each article in a previously prepared chart, and coded them according to the coding manual. The authors also compared coding experiences during the process in order to discuss ambiguities or discrepancies. As a result, some papers were discussed together to reach a joint decision. This procedure helps reduce the risk of coder bias.

Findings
The content analysis was performed by means of frequency counts and cross-tabulations, as is usual in previous content analyses on management research methodology (Duriau et al., 2007). This section reports and interprets the findings and the trends from the content analysis divided into the main categories considered: (1) descriptive data of the sample articles, (2) methodological issues, and (3) findings on the relationship between QM and innovation performance.
Identifying the leading contributors in the QM-innovation performance relationship is useful in order to better understand and replicate their results. As depicted in Figure 1, in our list of 47 articles, Prajogo is a co-author in 5 of them. It is also of note that, according to Scopus, Prajogo and Sohal (2003) is the most cited article with 185 citations (December 2016).

Journals and academic fields
The 47 articles in our database appeared in 33 different journals, indicating that the study of the QM and innovation relationship is an appropriate subject for a wide range of journals. Specifically, only three journals published four or more articles on this topic: Total Quality Management and Business Excellence (five articles), Technovation (five articles), and International Journal of Quality & Reliability Management (four articles). Following Shi and Yu (2013), we classified the journals according to their academic field (see Table 3).

Country and regional representation
According to Zeng et al. (2015), studies about QM and its influence on innovation are often restricted to a specific region (e.g. Australia, Spain, Singapore). These authors state that using a multi-country sample helps generalise the relationship between these two concepts. Regarding the countries in which data were collected, our analysis shows that Spain was the most prevalent (e.g. Perdomo-Ortiz, González-Benito, & Galende, 2009) (27.66% of all the articles), followed by Portugal (12.77%), Australia (8.51%) and cases in which two countries had been studied together (8.51%). Studies based on data from more than two countries represent 6.38% of our sample. With regard to regional representation (see Table 3), it is striking that most research on the QM-innovation relationship is based on European contexts.
Table 3 shows the kind of industries in which the reviewed studies were based. In particular, we highlight three predominant features: (a) 59.57% of the articles focused on the manufacturing sector (taking into account identified/unidentified and one/several sectors) in comparison with 4.26% of the articles centring on the service industry; (b) only 14.9% of the articles reviewed analysed high-tech industries or innovative fields (e.g. Hung et al., 2010; Perdomo-Ortiz et al., 2009); and (c) nearly 32% of the articles were based on samples with unidentified industries. Another criterion taken into account to define the sample was the companies' quality profile. This criterion could be relevant since, for instance, the implementation of ISO 9001 may represent substantial changes in the organisations, leading to different innovation performance (Manders et al., 2016). In our analysis, 44.68% of the articles were based on samples in which QM had been implemented (e.g. Kim et al., 2012), with ISO certified companies being the most prevalent. The rest of the articles studied both types of firm (QM and non-QM) (17.02%), or they did not report any information about this characteristic (38.30%).
Size of organisations. Regarding the size of the organisations, most of the articles included companies of all sizes (51.06%) (e.g. Prajogo & Sohal, 2003, 2006; Song & Su, 2015). The remaining studies used small and medium (17.02%), medium and large (10.64%), and only large firms (with more than 250 employees) (2.13%), although 19.15% did not report on the companies' size. Notably, none of the studies used a sample made up only of small firms. As Riillo (2014) states, previous studies on the QM-innovation relationship have mainly analysed large firms.

Survey administration and data collection
Informants. Some authors recommend using multiple informants (Forza, 2002; Malhotra & Grover, 1998) to guarantee greater methodological rigour. However, this raises the probability of receiving fewer completed questionnaires, which can affect the results (Forza, 2002), as well as the cost and time involved in obtaining multiple responses from the same organisation (Malhotra & Grover, 1998). Of our 47 articles, only 3 had more than 1 informant, 78.7% (37 articles) had 1 informant, and the remaining articles did not report this information. When there was just one informant, the most frequent position was the general manager (31.91%). Another key respondent was the quality manager (in 10.64% of the articles reviewed). Another group of articles (27% of the studies reviewed) were based on responses from one respondent in different positions.
Data collection methods. In survey research, the main methods used to collect data are interviews and questionnaires (Forza, 2002). We found that the most common method (nearly 60% of the sample) of obtaining data is via email and postal mail. Three other methods, face-to-face interview, secondary data, and mixed procedures, each accounted for 8.51%. The least common data collection method was the telephone survey (2.13%).
Sampling method. Random sampling was the most frequently used (42.55%) (e.g. Prajogo & Sohal, 2003, 2006), followed by studies in which the scope was the whole population (27.66%) (e.g. Perdomo-Ortiz et al., 2009). Notably, in 14.89% of the articles the sampling procedure was not specified. Convenience sampling (e.g. Hoang et al., 2006) or other procedures in which some organisations were excluded if they did not meet specific criteria, such as organisation age or number of employees (e.g. Prajogo & Hong, 2008), represented non-random methods in some studies (14.89%).
Response rate and non-response bias. Response rate (RR) is an important indicator of the success and validity of survey research (Frohlich, 2002; Mellahi & Harris, 2016), and is a factor that peer reviewers take into account because when RRs are low there is a danger that the data collected only represent the best and most successful companies (Frohlich, 2002). RRs under 20% are highly undesirable.
The mean RR found in our review was 37.72%, without taking into account six studies (12.8% of all the papers) in which these data were not specified. More than half of the studies (51.06%) reported an RR below 40%. Only 17.03% of the studies obtained an RR over 60%. The publication years suggest that the problem of low RR persists: of the 21 articles published in the last 6 years, 14 have an RR below 40%, which gives no hint of a positive trend.
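The summary figures above follow from simple arithmetic on the RR reported by each study. The sketch below illustrates that arithmetic under stated assumptions: the function name `summarise_response_rates` and the eight example values are hypothetical and are not data from the review. The mean is taken over the studies that report an RR, whereas the shares are expressed over all studies, which is how the percentages in the text appear to be computed.

```python
from statistics import mean


def summarise_response_rates(rates):
    """Summarise response rates (in %), one entry per study.

    An entry is None when the study does not report its response rate.
    """
    reported = [r for r in rates if r is not None]
    n_all = len(rates)
    return {
        "n_studies": n_all,
        "n_unreported": n_all - len(reported),
        # Mean over the studies that actually report a response rate.
        "mean_rr": mean(reported),
        # Shares are expressed over ALL studies, reported or not.
        "share_below_40": 100 * sum(r < 40 for r in reported) / n_all,
        "share_above_60": 100 * sum(r > 60 for r in reported) / n_all,
    }


# Hypothetical response rates for eight studies (NOT data from the review).
example = [12.5, 18.0, 35.0, 39.5, 52.0, 66.0, 85.0, None]
summary = summarise_response_rates(example)
print(summary)
```

Reporting the number of studies with missing RR alongside the mean, as the function does, makes it clear which denominator each figure uses.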
A cross-tabulation was performed between the RR and the data collection techniques to gauge the efficacy of these techniques for enhancing RR. Table 4 shows that studies based on secondary databases and those which used mixed procedures reported the highest RR, since 50% of secondary database studies obtained an RR between 80% and 100%. Studies which used mixed procedures had an RR between 40% and 80%. Almost 70% of the studies that obtained data through an online questionnaire reached an RR between 0% and 20%. Given these observed frequencies, and in order to go deeper into this cross-tabulation, some statistical options were required. Thus, Fisher's exact test was used, revealing a significant association between RR and data collection tool (p = .005), which means that the pattern of RR in the diverse data collection tools is significantly different. Cramér's V statistic was produced to test the strength of the association (Field, 2009). Its value (0.486) represents a medium-high association between these two variables. Finally, adjusted standardised residuals allow a more accurate interpretation of the meaning of the association (Field, 2009). As can be inferred from the table, the association between RR and the data collection tool is mainly driven by secondary databases (Adj. std. resid. = 4.7), which produce RRs of 80-100%, and the use of mixed procedures, leading to RRs of 40-60% (Adj. std. resid. = 3).
Non-response bias arises as a problem when RRs are low (Boyd & Westfall, 1955; Frohlich, 2002). In some cases, the sample may not be representative because the respondents differ greatly from non-respondents, and therefore the results of the research are not generalisable (Boyd & Westfall, 1955; Flynn et al., 1990; Forza, 2002).
Our results show that the question of non-response bias is not dealt with in 63.8% of the articles reviewed. Only 31.9% of them report treating this bias in some way, either by comparing the responses of early and late respondents (e.g. Kim et al., 2012; Song & Su, 2015) or by comparing objective information such as annual sales (e.g. Perdomo-Ortiz et al., 2009) or billing and number of employees (Ruiz-Moreno, Haro-Domínguez, Tamayo-Torres, & Ortega-Egea, 2016) from respondents and non-respondents.
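To make the two descriptive statistics used in this cross-tabulation concrete, the following sketch computes Cramér's V and the adjusted standardised residuals from a table of counts, using the standard textbook formulas. It is an illustration only: the function name `crosstab_stats` and the counts in `example_table` are hypothetical and do not reproduce Table 4, and the exact test for tables larger than 2 x 2 is not implemented here.

```python
from math import sqrt


def crosstab_stats(table):
    """Return (Cramér's V, adjusted standardised residuals) for a table of counts.

    `table` is a list of rows. Every row and column total must be positive
    and smaller than the grand total, otherwise a division by zero occurs.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    n = sum(row_tot)

    chi2 = 0.0
    residuals = []
    for i in range(n_rows):
        res_row = []
        for j in range(n_cols):
            expected = row_tot[i] * col_tot[j] / n
            diff = table[i][j] - expected
            chi2 += diff ** 2 / expected
            # Adjusted residual: (O - E) / sqrt(E * (1 - row share) * (1 - column share)).
            res_row.append(
                diff / sqrt(expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n))
            )
        residuals.append(res_row)

    # Cramér's V rescales chi-square to the 0-1 range.
    cramers_v = sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))
    return cramers_v, residuals


# Hypothetical counts (NOT Table 4): rows are data collection tools,
# columns are "RR below 40%" and "RR of 40% or more".
example_table = [[9, 3], [1, 5], [1, 3]]
v, resid = crosstab_stats(example_table)
print(round(v, 3), [[round(x, 2) for x in row] for row in resid])
```

Cells whose adjusted residual exceeds roughly 2 in absolute value are conventionally read as the ones driving the association, which is how the residuals of 4.7 and 3 are interpreted in the text.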
Common method bias. Common method variance can occur when data for independent and dependent variables have been collected using the same method or have been provided by the same source (Rungtusanatham et al., 2003). Other sources of common method biases arise from the measurement items themselves, the context of the items within the measurement instrument, and/or the context in which the measures are obtained (Podsakoff, MacKenzie, Lee, & Podsakoff, 2003). Of the 47 articles reviewed, only 7 (14.89%) evaluated the potential for common method bias. Most of these studies conducted Harman's one-factor test to detect the presence of common method bias. We found results similar to those of Rungtusanatham et al. (2003), who, based on a review of 285 survey research articles in operations management, identified only a small percentage (19%) of articles that took this kind of bias into account.
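Harman's one-factor test checks whether a single unrotated factor accounts for most of the variance across all the survey items. The sketch below approximates it with the first principal component of the item correlation matrix, which is an assumption on our part and not necessarily the procedure followed in the reviewed articles. The function names and the example data are hypothetical. A first-factor share above about 50% is a commonly cited warning sign rather than a formal criterion.

```python
from math import sqrt


def correlation_matrix(items):
    """Pearson correlations between items; `items` is a list of equal-length columns.

    Columns must not be constant, otherwise their norm is zero.
    """
    centred = []
    for col in items:
        m = sum(col) / len(col)
        centred.append([x - m for x in col])
    norms = [sqrt(sum(x * x for x in col)) for col in centred]
    p = len(centred)
    return [
        [
            sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
            for j in range(p)
        ]
        for i in range(p)
    ]


def first_factor_share(items, iterations=500):
    """Share of total variance captured by the first principal component."""
    r = correlation_matrix(items)
    p = len(r)
    # Power iteration for the largest eigenvalue; slightly uneven start vector.
    v = [1.0 + 0.01 * k for k in range(p)]
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    rv = [sum(r[i][j] * v[j] for j in range(p)) for i in range(p)]
    largest_eigenvalue = sum(a * b for a, b in zip(v, rv))  # Rayleigh quotient
    # The trace of a correlation matrix equals the number of items.
    return largest_eigenvalue / p


# Hypothetical responses of six informants to four Likert items (NOT real data).
answers = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [2, 3, 4, 1, 5, 2],
    [3, 5, 2, 4, 3, 4],
]
print(round(first_factor_share(answers), 3))
```

Because the test only detects the extreme case in which one factor dominates, a low share does not by itself rule out common method bias.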

Measurement of variables and statistical techniques
Measurement of variables. The articles reviewed used a variety of measurements for both QM and innovation performance. Most of the studies were based on multi-item Likert scales (70.21%), predominantly 5-point Likert scales (46.81%). QM measurements were classified into six different categories according to the QM practices studied (see Table 4), with the most common being the subjective scale with one second-order factor. We also found great variations in the way innovation performance is measured. Table 5 shows that subjective measurements based on multi-item scales are the most common method (72.34%). Almost two-thirds of the studies examine results derived from product and process innovation (36.17%) or from product innovation (27.66%).
Statistical techniques. Structural equation modelling (SEM) is the most widely used method for data analysis (46.81%), followed by regression analysis (21.28%). The remaining techniques show similar percentages of use. The trend in the use of statistical techniques over time shows the prevalence of SEM from 2000 onwards, which may be due to the general trend in using this kind of technique in quantitative analysis. The use of regression and correlations seems to be decreasing. In recent years, ANOVA, MANOVA, MANCOVA, and PLS appear as frequent options in this kind of study, although far behind SEM.

Findings on the relationship between QM and innovation performance
To tackle our second research objective, we looked at the different types of findings on the QM-innovation performance relationship, and classified the articles into groups according to the type of results reported: 38.2% of the studies reviewed (18 articles) reported positive results, while only 2% reported negative results, 12% found no relationship, and the remaining articles presented mixed results depending on QM practices and type of innovation.
We found some similar characteristics only in the group of studies with positive results, which we took as the referent group to compare with the remaining articles. Descriptive analysis based on frequencies indicates that 11 of the 18 studies with positive results share the following features: (a) measurement of QM with a multi-item scale, considering QM as a second-order factor; (b) measurement of innovation with a subjective scale and focused on product innovation, or product and process innovation; and (c) SEM as the statistical procedure.
Furthermore, in order to go deeper into the features of the groups, several cross-tabulations were performed, crossing the type of results (positive vs. other) and the methodological issues previously analysed (informant, RR, etc.). Tables 5 and 6 present cross-tabulations between the categories that reported significant associations.
Firstly, Table 6 shows a cross-tabulation between results of the QM-innovation performance relationship and QM measurement. The p-value (.035) of Fisher's exact test demonstrated a significant association between the QM measurement and the findings obtained in the studies. More specifically, Cramér's V statistic (0.494) represents a medium association between the two variables. As can be seen, when a subjective scale with one second-order factor is used, positive results in the QM-innovation relationship are more prevalent (66.7%). In contrast, 81.8% of the studies that used a subjective scale with different QM dimensions found a relationship other than positive. The adjusted standardised residuals (Adj. std. resid. = 2.7 and −2.7) confirm that the measurement of QM as a subjective scale with one second-order factor is a feature of the studies with positive results.
A second cross-tabulation is presented (Table 7), in which the results for the QM-innovation relationship are crossed with the innovation type. The percentages show that 69.2% of the articles studying product innovation as the innovation type report positive results. In contrast, when different kinds of innovation are analysed, 80% report non-positive results in the QM-innovation relationship. Fisher's exact test confirms a significant association between these two factors (p = .028), whose strength (from Cramér's V) is medium (0.507). The residuals (Adj. std. resid. = 2.7 and −2.7) reveal that measuring product innovation seems to be related to finding a positive QM-innovation relationship.
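For a 2 x 2 cross-tabulation of this kind (positive vs. other results against two measurement categories), the two-sided Fisher's exact test can be computed directly from the hypergeometric distribution. The sketch below is a minimal illustration: the function name `fisher_exact_2x2` is ours, and the counts are hypothetical, chosen only so that the row percentages resemble those quoted above. They are not the cell counts of Table 7, which includes more categories.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    The p-value sums the probabilities of every table with the same margins
    that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer weights keep the "as or less likely" comparison exact.
    tail = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return tail / comb(n, col1)


# Hypothetical counts (NOT Table 7): rows are "product innovation" and
# "other innovation types"; columns are "positive" and "other results".
p_value = fisher_exact_2x2(9, 4, 2, 8)
print(round(p_value, 3))
```

The exact test is preferred over the chi-square approximation here because, with 47 articles spread over several categories, many cells have small expected counts.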

Research opportunities
Although several papers have summarised previous contributions on the relationship between QM and innovation outcomes (e.g. Riillo, 2014), to the best of our knowledge, this is the first attempt to provide an in-depth analysis of the methodology used in such contributions, and its potential relationship with the findings obtained. The findings from the content analysis have uncovered some areas for improvement in the form of research opportunities, which we summarise below. They represent contributions in the form of usable information for researchers conducting survey research.

Incomplete information about methodological issues
In line with Forza (2002), we found that many articles do not offer an adequate description of how their sample was constructed, and fail to provide sufficient information on the resulting sample and on other aspects of the methodology used. For instance, more than 50% of the papers do not report the year in which the fieldwork took place, around 20% provide no information about the size of the sample organisations, and about 15% offer no information on the sampling method. Similarly, the issue of non-response bias is not covered in more than 60% of the articles reviewed, and the potential for common method bias is only evaluated by 15%. Consequently, there is an obvious need for more careful reporting of this kind of information, which is necessary to interpret the results and make comparisons among studies.

Specific sample profile
Some studies select a sample of organisations that are implementing QM as a way to guarantee a certain interest in QM and familiarity with the topics of the research (e.g. Kim et al., 2012), while other researchers draw on a wide sample of multisector (mainly manufacturing) organisations to increase observed variance and make the conclusions more generalisable (Silva et al., 2014). We agree with Riillo's (2014) conclusion that studies of service sectors, data-sharing among researchers, replication of studies, and cross-country comparisons are not usual practices in this field, although they are interesting areas for future research.
Other future lines of research should cover specific sectors such as high-tech companies, where innovation behaviour is paramount, but which represent only 15% of the studies analysed.

Level of analysis and informants
There is a tendency to use a single level of analysis, usually the organisational level. As in other papers in different POM disciplines (e.g. Chatha et al., 2015), our study concludes that researchers usually capture the opinion of higher managers, and few studies consult lower management and employees on their perspectives, which again opens up an avenue for future research. Moreover, hardly any previous studies have focused on a specific department or area. As Prajogo and Hong (2008) caution, analysis of R&D departments provides a narrow picture of QM implications in specific areas primarily responsible for innovation. In addition, our findings show that most of the articles reviewed are based on information provided by a single informant (only three articles had more than one informant). However, methodological articles recommend using multiple respondents since more accurate information is gained than from a single respondent (Forza, 2002). Moreover, to avoid common method bias, it would be advisable to have at least two informants, one working in the area of quality and another in R&D. In sum, researchers should use more than one informant, each an expert in the required information; if only one informant is available, the information obtained should be triangulated with other data, such as secondary data or interviews.

Data collection methods
Interviews and questionnaires are the most widely used data collection methods in survey research (Forza, 2002). We found that the most common data-gathering tools (in nearly 60% of the sample) are email and postal mail, which tallies with the research by Chatha et al. (2015) in the field of manufacturing strategy. A combination of data collection tools may be a useful method of data triangulation. Online questionnaires supplemented by email or telephone contact are recommended to keep down data collection costs. Nevertheless, only 8.51% of the articles reviewed use mixed procedures. Researchers should carefully plan how to carry out the survey research at this point, and decide what data collection tools to use in different stages of the research. This decision should form part of the protocol to be followed in administering questionnaires.

Response rate
Around 50% is a desirable RR in survey processes and a minimum of 20% has been suggested (Flynn et al., 1990;Frohlich, 2002). Some techniques such as sponsorship, incentives like offering a report with the results, anonymity, pre-notice or multiple-mailings and follow-up reminders can enhance participation (Dillman, Smyth, & Christian, 2009;Forza, 2002;Frohlich, 2002;Mellahi & Harris, 2016). Above all, the basic issue is to carefully plan the survey (Flynn et al., 1990) and to use suitable field methods (Boyd & Westfall, 1955) to achieve a high RR. However, our results indicate that higher RRs are only significantly associated with using secondary databases and mixed procedures as data collection methods.
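As an illustration of the thresholds mentioned above, the following minimal sketch computes an RR and compares it with the suggested 20% minimum and the roughly 50% desirable level. The function names, the category labels, and the figures in the usage example are hypothetical and are not taken from any of the reviewed studies; treating exactly 50% as the cut-off for "desirable" is also an assumption, since the source only says "around 50%".

```python
def response_rate(usable_responses, questionnaires_sent):
    """Return the response rate as a fraction between 0 and 1."""
    if questionnaires_sent <= 0:
        raise ValueError("questionnaires_sent must be positive")
    if not 0 <= usable_responses <= questionnaires_sent:
        raise ValueError("usable_responses must lie between 0 and questionnaires_sent")
    return usable_responses / questionnaires_sent


def classify_response_rate(rr, minimum=0.20, desirable=0.50):
    """Label an RR against the suggested minimum and the desirable level.

    The default thresholds follow the 20% and ~50% figures cited in the text;
    the labels themselves are illustrative assumptions.
    """
    if rr < minimum:
        return "below suggested minimum"
    if rr < desirable:
        return "acceptable"
    return "desirable"


# Hypothetical survey: 400 questionnaires sent, 84 usable responses returned.
rr = response_rate(84, 400)
print(f"RR = {rr:.1%}: {classify_response_rate(rr)}")
```

Reporting both the number of questionnaires sent and the number of usable responses, rather than the RR alone, lets readers recompute the figure under their own definition.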

Multiple research methods
Regarding the use of multiple research methods, our findings showed that only a limited number of papers combine survey methodology with other methods, a finding that lends weight to the suggestion by Taylor and Taylor (2009) and Chatha et al. (2015) that mixed methods be used to provide some triangulation in POM research. The combination of quantitative and qualitative designs has the potential to study reality from a more holistic point of view and would make conclusions more generalisable. Further, the use of multiple respondents for the same question, the use of multiple measurement methods (e.g. interviews, objective measures) or multiple methods to assess the variables of interest could be used as forms of triangulation or mixed methods to enhance the quality of the results (Malhotra & Grover, 1998;Forza, 2002;Gupta, Verma, & Victorino, 2006). All these actions could help to reduce common method bias (Rungtusanatham et al., 2003). Despite the benefits of using mixed methods, the researcher should assess the cost in terms of time and effort because their use may not be practical in some cases.

Measurement of variables
Analysis of previous studies revealed a variety of different measurements of both QM and innovation performance, which may contribute to mixed results, as suggested by Kaynak (2003). Multi-item scales are most commonly used, but in some cases authors analyse QM at the dimension level, or at the construct level, considering QM as a second-order factor, which precludes studying the effect of specific practices. With regard to innovation performance, a wide variety of measures are used, ranging from multi-item scales to measure product and process innovation, to objective measures based on number of new products or R&D expenditure. Since our research shows that the measurement of variables may condition the results obtained, when interpreting the findings from previous contributions researchers should take into account the way QM and innovation performance have been measured in order to analyse and compare studies accurately.
It is also advisable to replicate previously used measures in order to make studies comparable. Likewise, it would be useful to conduct some kind of sensitivity analysis measuring QM from different perspectives (see, for instance, Hoang et al., 2006).

Longitudinal studies
A prime feature of our sample of articles is the cross-sectional nature of the fieldwork. Cross-sectional studies also predominate in other disciplines. Chatha et al. (2015), for instance, report that only 4% of studies analysed were longitudinal studies and point to the economics of this kind of study as a possible reason for their scarcity. We agree with Riillo (2014) that future panel data analysis would enable causality to be assessed and would help to reconcile mixed results in the literature.

Contingency variables
Few studies use mediating and moderating variables. As Sousa and Voss (2008) suggest, the contradictory findings about the QM -performance relationship may be due to these practices being context-dependent. These authors advise that in mature operation scenarios, research on management practices such as QM should shift its focus from justifying the value of these practices to understanding the contextual conditions under which they are effective. Hence the use of contextual variables such as organisational structure, environment or organisational climate could be considered in future research. Moreover, in a similar line, Pierce and Aguinis (2013) note that when opposing theoretical proposals predict relationships of opposite sign between two variables, and when empirical studies produce mixed results, it is necessary to take into account the possibility that the relationship between these two variables might be curvilinear (i.e. nonlinear).
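One common way to probe for a curvilinear relationship is to add a squared term to a regression and inspect its sign. The sketch below is only an illustration of that idea, not a procedure used in the reviewed studies: it fits y = b0 + b1*x + b2*x**2 by ordinary least squares using the normal equations, with invented data standing in for a QM score (x) and an innovation score (y). The function names are hypothetical, and a real analysis would also need a significance test of b2, which is omitted here.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2; returns (b0, b1, b2)."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three paired observations")
    # Power sums that make up the normal equations (X'X) b = X'y.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 matrix [X'X | X'y].
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (too few distinct x values)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def turning_point(b1, b2):
    """x value at which the fitted parabola peaks or bottoms out."""
    if b2 == 0:
        raise ValueError("no turning point: the fitted relationship is linear")
    return -b1 / (2 * b2)


# Invented data with an inverted-U shape (hypothetical QM and innovation scores).
qm_score = [0, 1, 2, 3, 4]
innovation_score = [0, 3, 4, 3, 0]
b0, b1, b2 = fit_quadratic(qm_score, innovation_score)
shape = "inverted U" if b2 < 0 else "U-shaped or linear"
print(f"b2 = {b2:.3f} ({shape}); turning point at x = {turning_point(b1, b2):.2f}")
```

A negative b2 is consistent with an inverted-U pattern and a positive b2 with a U-shaped one; a turning point that falls outside the observed range of x suggests the relationship is effectively monotonic in the sample.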

Conclusions and limitations
With regard to the first objective posed in the introduction, this paper synthesises a thorough content analysis-based review of the literature exploring the developments in specific survey research issues used in empirical research on the QM -innovation performance relationship. One initial conclusion allows us to confirm the leading contributions from Prajogo and his colleagues, as well as the interest in this topic in the European context and in operations management and general journals. There was a prevalence of studies in the manufacturing sector that used random sampling, and QM-driven organisations were the target sample. Some recurrent methodological issues can be summarised as follows: research is mainly based on one single informant, either a top manager, quality manager, or managers in different positions; sponsorship is not widespread; email and postal mail questionnaires prevail, with an increase in the use of online questionnaires; RRs could be improved, as about half of the studies report a RR below 40%; a high percentage of papers did not report any analysis of how non-response bias or common method bias were dealt with. We also found considerable heterogeneity in the way QM and innovation performance were measured. Finally, SEM and regression were the most commonly observed statistical techniques.
With regard to the second research objective, a prevailing pattern emerged of papers reporting a positive relationship between QM and innovation performance, which can be characterised by the way QM and innovation performance are measured. Hence, researchers should take into account the way both variables are measured when interpreting findings from previous contributions, and should decide very carefully how to measure the variables.
These conclusions suggest some practical implications for researchers to ensure generalisability and replicability of the results. First, information should be reported about how the sample was designed and the resulting sample (e.g. the year of fieldwork, the size of the sample organisations, sampling method), together with methodological issues (e.g. pretesting the questionnaire, detailing how the data were collected, using multiple informants, analysing non-response bias). In the design of research models, it is suggested that future research lines incorporate a dynamic perspective through the longitudinal analysis of QM practices and their influence on innovation performance.
A multilevel approach, with information from different levels of analysis (employees, organisations, sectors, and countries), could be another opportunity for future research. It poses a methodological challenge that could affect the results and implications of the relationships between QM and innovation. As pointed out by Hitt, Beamish, Jackson, and Mathieu (2007), although most management problems are related to multilevel phenomena, the single level of analysis is still the most commonly used. In order to understand more complex relationships, it becomes essential to use a multilevel perspective, as it makes it possible to better scrutinise the richness of the consequences of management behaviours for individuals, groups, and organisations. It can also help the field to advance.
Some limitations need to be acknowledged. First, this paper has focused on studies using the survey methodology; the analysis of studies that use other methodologies, such as case studies, would improve our understanding of the relationship between QM and innovation performance. Second, we are aware that, by focusing on methodological issues, we have not analysed the particular theoretical frameworks used in the selected papers, nor have we made a theoretical analysis of the main drivers of innovation performance from a QM initiative. Finally, further investigation might extend our knowledge on some of the issues raised in our findings, such as why some countries seem to publish more research on this topic. All these limitations open up avenues for future research.
Notes
1. Fisher's exact test was used due to the small sample size and the fact that more than 20% of the cells have expected frequencies below 5 (Field, 2009).
2. The full list of the 47 articles included in the content analysis can be obtained from the authors upon request.
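The reasoning in Note 1 can be sketched for the simplest case of a 2x2 cross-tabulation. The code below first computes the share of cells with an expected frequency below 5 and then a two-sided Fisher's exact p-value from the hypergeometric distribution. It is a minimal illustration under stated assumptions: the function names are hypothetical, the counts are invented and are not the paper's data, and the paper's own tables may be larger than 2x2, in which case a generalised exact test would be required.

```python
from math import comb


def expected_counts(table):
    """Expected cell frequencies under independence for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def share_of_small_cells(table, threshold=5.0):
    """Fraction of cells whose expected frequency is below the threshold."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table.

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(r1, k) * comb(r2, c1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Invented counts: rows = two ways of measuring QM, columns = type of finding.
table = [[3, 1], [1, 3]]
print(f"cells with expected frequency < 5: {share_of_small_cells(table):.0%}")
print(f"two-sided Fisher exact p = {fisher_exact_2x2(table):.4f}")
```

When more than 20% of the expected frequencies fall below 5, as in this invented table, the chi-square approximation is usually considered unreliable, which is the motivation given in Note 1 for preferring the exact test.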

Disclosure statement
No potential conflict of interest was reported by the authors.