Qualitative interviews: how many are enough?

Critiquing the concept of saturation, Nelson [19] proposes five conceptual depth criteria for grounded theory projects to assess the robustness of the developing theory: theoretical concepts should (a) be supported by a wide range of evidence drawn from the data; (b) be demonstrably part of a network of inter-connected concepts; (c) demonstrate subtlety; (d) resonate with existing literature; and (e) be capable of being successfully submitted to tests of external validity.

Other work has sought to examine practices of sample size reporting and sufficiency assessment across a range of disciplinary fields and research domains, from nutrition [34] and health education [32], to education and the health sciences [22, 27], information systems [30], organisation and workplace studies [33], human-computer interaction [21], and accounting studies [24].

Others have investigated PhD qualitative studies [31] and grounded theory studies [35]. These investigations commonly pinpoint incomplete and imprecise sample size reporting, whilst assessments and justifications of sample size sufficiency are even more sporadic. Sobal [34] examined the sample size of qualitative studies published in the Journal of Nutrition Education over a period of 30 years.

A minority of articles discussed how sample-related limitations (most often concerning the type of sample rather than its size) limited generalizability. A further systematic analysis [32] of health education research over 20 years demonstrated that interview-based studies recruited as few as 2 interviewees. An examination of 83 qualitative interview studies in leading information systems journals [30] indicated little defence of sample sizes on the basis of recommendations by qualitative methodologists, prior relevant work, or the criterion of saturation.

Rather, sample size seemed to correlate with factors such as the journal of publication or the region of study (US vs Europe vs Asia). These results led the authors to call for more rigor in determining and reporting sample size in qualitative information systems research and to recommend optimal sample size ranges for grounded theory studies.

The notion of saturation was also invoked by 11 out of the 51 most highly cited studies that Guetterman [27] reviewed in the fields of education and health sciences, of which six were grounded theory studies, four phenomenological, and one a narrative inquiry. Finally, Dai et al. [24] analysed interview-based articles in accounting. Despite increasing attention to rigor in qualitative research, the present study sought to enrich existing systematic analyses of the customs and practices of sample size reporting and justification by focusing on qualitative research relating to health.

Additionally, this study attempted to expand previous empirical investigations by examining how qualitative sample sizes are characterised and discussed in academic narratives. Qualitative health research is an inter-disciplinary field that, owing to its affiliation with the medical sciences, often faces views and positions reflective of a quantitative ethos. Qualitative health research thus constitutes an emblematic case that may help to unfold underlying philosophical and methodological differences across the scientific community that are crystallised in considerations of sample size.

The present research therefore incorporates a comparative element based on three different disciplines engaging with qualitative health research: medicine, psychology, and sociology. We chose to focus our analysis on single-per-participant-interview designs, not only because this is a popular and widespread methodological choice in qualitative health research, but also because it is the method in which consideration of sample size, defined as the number of interviewees, is particularly salient.

A structured search for articles reporting cross-sectional, interview-based qualitative studies was carried out and eligible reports were systematically reviewed and analysed employing both quantitative and qualitative analytic techniques.

To be eligible for inclusion in the review, an article had to report a cross-sectional study design; longitudinal studies were thus excluded. The method of data collection had to be individual, synchronous qualitative interviews.

Mixed-method studies and articles reporting more than one qualitative method of data collection were also excluded. Figure 1, a PRISMA flow diagram [53], shows the number of articles obtained from the searches and screened, papers assessed for eligibility, and articles included in the review; Additional File 2 provides the full list of included articles and their unique identifying codes. One review author (KV) assessed the eligibility of all papers identified from the searches.

When in doubt, discussions about retaining or excluding articles were held between KV and JB in regular meetings, and decisions were made jointly. A data extraction form was developed (see Additional File 3) recording three areas of information: bibliographic details of the article, the reported sample size, and any justification or discussion of sample size sufficiency.

Relevant text was copied directly from the articles and, when appropriate, comments, notes, and initial thoughts were written down. To examine the kinds of sample size justifications provided by articles, an inductive content analysis [54] was first conducted. On the basis of this analysis, categories expressing qualitatively different sample size justifications were developed.
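As a purely illustrative aside, the sketch below shows one way such extraction records and the derived justification categories could be represented in code; the field names, category label, and example values are assumptions made for the illustration, not the actual form or data used in the review.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractionRecord:
    """One reviewed article, as it might appear on a data extraction form."""
    article_id: str              # unique identifying code, e.g. "BMJ08"
    journal: str                 # "BMJ", "BJHP", or "SHI"
    year: int
    n_interviews: int            # sample size, measured in number of interviews
    justification_given: bool    # did the article justify its sample size?
    justification_category: Optional[str] = None   # e.g. "saturation"
    narrative_excerpt: Optional[str] = None         # text copied verbatim from the article

# Hypothetical record, with invented values used only to show the structure.
example = ExtractionRecord(
    article_id="XYZ01",
    journal="BMJ",
    year=2010,
    n_interviews=21,
    justification_given=True,
    justification_category="saturation",
    narrative_excerpt="Recruitment continued until data saturation was reached...",
)
```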

A thematic analysis [55] was then performed on all scientific narratives that discussed or commented on the sample size of the study. These narratives were evident both in papers that justified their sample size and in those that did not.

To identify these narratives, in addition to the methods sections, the discussion sections of the reviewed articles were also examined and relevant data were extracted and analysed. Table 1 provides basic information about the sample sizes — measured in number of interviews — of the studies reviewed across the three journals.

Figure 2 depicts the number of eligible articles published each year in each journal. The publication of qualitative studies in the BMJ was significantly reduced in later years, which appears to coincide with the launch of BMJ Open, to which qualitative studies were possibly directed. There was no association between the number of interviews (i.e. the sample size) and the provision of a sample size justification.

If an article was published in the BJHP, the odds of providing a sample size justification were roughly four times higher than otherwise; similarly, if published in the BMJ, the odds of a study justifying its sample size were roughly four times higher.
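For readers who want to see the arithmetic behind such statements, the sketch below computes a generic odds ratio from a two-by-two table of counts (justification provided versus not, in one journal versus the others). The counts in the usage example are invented placeholders, not the figures from this review.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table:
                       justification   no justification
        journal X           a                 b
        other journals      c                 d
    """
    if b == 0 or c == 0:
        raise ValueError("Odds ratio is undefined when b or c is zero.")
    return (a * d) / (b * c)

# Hypothetical counts, for illustration only (not the review's data):
# 30 of 50 articles in journal X justified their sample size,
# versus 20 of 80 articles in the other journals.
print(odds_ratio(a=30, b=20, c=20, d=60))  # -> 4.5
```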

The qualitative content analysis of the scientific narratives identified eleven different sample size justifications. These are described below and illustrated with excerpts from relevant articles. By way of summary, the frequency with which each justification was deployed across the three journals is indicated in Table 3.

Saturation was the most commonly invoked principle. For example: Thirty-three women were approached to take part in the interview study. Twenty-seven agreed and 21 (aged 21 to 64, median 40) were interviewed before data saturation was reached (one tape failure meant that 20 interviews were available for analysis). Two articles reported pre-determining their sample size with a view to achieving data saturation (BMJ08, see extract in section In line with existing research; BMJ15, see extract in section Pragmatic considerations), without further specifying whether this was achieved.

One article (BMJ18) cited a reference to support its position on saturation: Recruitment continued until data saturation was reached, defined as the point at which no new themes emerged. Two studies argued that they achieved thematic saturation (BJHP28, see extract in section Sample size guidelines; BJHP31), and one article (BJHP30), explicitly concerned with theory development and deploying theoretical sampling, claimed both theoretical and data saturation.

The final sample size was determined by thematic saturation, the point at which new data appears to no longer contribute to the findings due to repetition of themes and comments by participants (Morse). At this point, data generation was terminated. BJHP17 referred descriptively to a state of achieved saturation without specifically using the term. Saturation of coding, but not saturation of themes, was claimed to have been reached by one BJHP article. Two articles explicitly stated that they did not achieve saturation, instead claiming a level of theme completeness (BJHP27) or the replication of themes (BJHP53) as grounds for the sufficiency of their sample size.

Furthermore, data collection ceased on pragmatic grounds rather than at the point when saturation was reached. Despite this, although nuances within sub-themes were still emerging towards the end of data analysis, the themes themselves were being replicated, indicating a level of completeness. Finally, one BJHP article criticised and explicitly renounced the notion of data saturation, claiming that, on the contrary, the criterion of theoretical sufficiency determined its sample size: According to the original Grounded Theory texts, data collection should continue until there are no new discoveries.

For this study, it was decided that theoretical sufficiency would guide recruitment, rather than looking for data saturation. Ten out of the 20 BJHP articles that employed the argument of saturation used one or more citations relating to this principle. Three articles described a state of achieved saturation without using the term or specifying what sort of saturation they had achieved. One article reported: Recruitment and analysis ceased once theoretical saturation was reached in the categories described below (Lincoln and Guba). Another article (SHI) stated that thematic saturation was anticipated with its sample size. Finally, one SHI article (see extract in section Further sampling to check findings consistency) argued that it achieved saturation of discursive patterns.

Seven of the 19 SHI articles cited references to support their position on saturation (see Additional File 4 for the full list of citations used by articles across the three journals to support their position on saturation). Overall, it is clear that the concept of saturation encompassed a wide range of variants, expressed in terms such as saturation, data saturation, thematic saturation, theoretical saturation, category saturation, saturation of coding, saturation of discursive themes, and theme completeness.

It is noteworthy, however, that although these various claims were sometimes supported with references to the literature, they were not evidenced in relation to the study at hand.
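One way a saturation claim could be evidenced for the study at hand, rather than merely asserted, is to report how many previously unseen codes each successive interview contributes and to show that this count levels off. The sketch below is a minimal illustration of that idea, assuming transcripts have already been coded into sets of code labels; the stopping threshold of two consecutive interviews with no new codes, and all the example data, are arbitrary choices for the illustration, not a rule drawn from the reviewed articles.

```python
from typing import Iterable, List, Optional, Set

def new_codes_per_interview(coded_interviews: Iterable[Set[str]]) -> List[int]:
    """Return how many previously unseen codes each successive interview adds."""
    seen: Set[str] = set()
    counts: List[int] = []
    for codes in coded_interviews:
        counts.append(len(codes - seen))
        seen |= codes
    return counts

def saturation_point(counts: List[int], run_length: int = 2) -> Optional[int]:
    """1-based index of the interview at which `run_length` consecutive
    interviews have contributed no new codes, or None if never reached."""
    streak = 0
    for i, c in enumerate(counts, start=1):
        streak = streak + 1 if c == 0 else 0
        if streak >= run_length:
            return i
    return None

# Hypothetical coding data, for illustration only.
interviews = [
    {"access", "stigma", "cost"},
    {"stigma", "family support"},
    {"cost", "waiting times"},
    {"stigma", "cost"},
    {"family support", "access"},
]
counts = new_codes_per_interview(interviews)
print(counts, saturation_point(counts))  # -> [3, 1, 1, 0, 0] 5
```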

The determination of sample size on the basis of pragmatic considerations was the second most frequently invoked argument. In the BMJ, one article (BMJ15) appealed to pragmatic reasons, relating to time constraints and the difficulty of accessing certain study populations, to justify the determination of its sample size.

We set a target of seven to ten caregivers per site because of time constraints and the anticipated difficulty of accessing caregivers at some home-based care services. This gave a target sample of at least 75 patients and 35 to 50 caregivers overall. We had aimed to continue interviewing until we had reached saturation, a point whereby further data collection would yield no further themes.

In practice, the number of individuals volunteering to participate dictated when recruitment into the study ceased (15 young people, 15 parents). Nonetheless, by the last few interviews, significant repetition of concepts was occurring, suggesting ample sampling. Finally, three SHI articles explained their sample size with reference to practical aspects: time constraints and project manageability (SHI56), limited availability of respondents and project resources (SHI), and time constraints (SHI). For example: The size of the sample was largely determined by the availability of respondents and resources to complete the study.

Its composition reflected, as far as practicable, our interest in how contextual factors (for example, gender relations and ethnicity) mediated the illness experience. The qualities of the analysis, such as the depth and duration of interviews, the richness of the data, and the complexity of the analytic task, constituted another sample size justification. For example: We stopped recruitment when we reached 30 to 35 interviews, owing to the depth and duration of interviews, richness of data, and complexity of the analytical task.

Meeting sampling requirements was a further justification. Achieving maximum variation sampling in terms of specific interviewee characteristics determined and explained the sample size of two BMJ studies (BMJ02; BMJ16, see extract in section Meet research design requirements). For example: Recruitment continued until sampling frame requirements were met for diversity in age, sex, ethnicity, frequency of attendance, and health status.

Regarding the SHI articles, two papers explained their numbers on the basis of their sampling strategy (SHI, see extract in section Saturation; SHI23), whilst sampling requirements that would help attain sample heterogeneity in terms of a particular characteristic of interest were cited by one paper (SHI). For example: Of the fifty interviews conducted, thirty were translated from Spanish into English.

These thirty, from which we draw our findings, were chosen for translation based on heterogeneity in depressive symptomology and educational attainment. Finally, the pre-determination of sample size on the basis of sampling requirements was stated by one article, though this was not used to justify the number of interviews (SHI). Reference to sample size guidelines in the methodological literature constituted another justification. For example: Sample size guidelines suggested a range between 20 and 30 interviews to be adequate (Creswell). Interviewer and note taker agreed that thematic saturation, the point at which no new concepts emerge from subsequent interviews (Patton), was achieved following completion of 20 interviews.

Interviewing continued until we deemed data saturation to have been reached (the point at which no new themes were emerging). Researchers have proposed 30 as an approximate or working number of interviews at which one could expect to be reaching theoretical saturation when using a semi-structured interview approach (Morse), although this can vary depending on the heterogeneity of respondents interviewed and the complexity of the issues explored.

The sample sizes of published literature in the area of the subject matter under investigation were also used to justify sample size. For example: We drew participants from a list of prisoners who were scheduled for release each week, sampling them until we reached the target of 35 cases, with a view to achieving data saturation within the scope of the study and sufficient follow-up interviews, and in line with recent studies [8-10].

Similarly, BJHP38 (see extract in section Qualities of the analysis) claimed that its sample size was within the range of sample sizes of published studies that use its analytic approach. BMJ21 (see extract in section Qualities of the analysis) and SHI32 referred to the richness, detailed nature, and volume of data collected. For example: Although there were more potential interviewees from those contacted by postcode selection, it was decided to stop recruitment after the 10th interview and focus on analysis of this sample.

The material collected was considerable and, given the focused nature of the study, extremely detailed. Determining the sample size so that it is in line with, and serves the requirements of, the research design was a further justification. For example: We aimed for diverse, maximum variation samples [20] totalling 80 respondents from different social backgrounds and ethnic groups and those bereaved due to different types of suicide and traumatic death. Another study reported: A sample of eight participants was deemed appropriate because of the exploratory nature of this research and the focus on identifying underlying ideas about the topic.

Finally, one SHI article argued that once it had achieved saturation of discursive patterns, further sampling was decided upon and conducted to check the consistency of the findings: Within each of the age-stratified groups, interviews were randomly sampled until saturation of discursive patterns was achieved.

This resulted in a sample of 67 interviews. Once this sample had been analysed, one further interview from each age-stratified group was randomly chosen to check for consistency of the findings.
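The selection step of the procedure described above (one additional, not-yet-analysed interview drawn at random from each age stratum as a consistency check) can be sketched as follows; the stratum labels and interview identifiers are invented for illustration, and the analysis itself is outside the scope of the sketch.

```python
import random
from typing import Dict, List, Set

def draw_consistency_checks(
    pool: Dict[str, List[str]],   # interview IDs available in each age stratum
    analysed: Set[str],           # IDs already analysed up to saturation
    rng: random.Random,
) -> Dict[str, str]:
    """Randomly pick one not-yet-analysed interview per stratum to check findings."""
    checks: Dict[str, str] = {}
    for stratum, ids in pool.items():
        remaining = [i for i in ids if i not in analysed]
        if remaining:
            checks[stratum] = rng.choice(remaining)
    return checks

# Hypothetical data, for illustration only.
pool = {
    "18-30": ["A01", "A02", "A03"],
    "31-50": ["B01", "B02", "B03"],
    "51+":   ["C01", "C02", "C03"],
}
analysed = {"A01", "A02", "B01", "C01", "C03"}
print(draw_consistency_checks(pool, analysed, random.Random(0)))
# e.g. {'18-30': 'A03', '31-50': 'B02' (or 'B03'), '51+': 'C02'}
```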

This analysis resulted in two overarching thematic areas: the first concerned variation in the characterisation of sample size sufficiency, and the second related to the perceived threats deriving from sample size insufficiency. Articles that characterised their sample size as insufficient typically presented this as a study limitation, in formulations such as: The current study has a number of limitations. The study has two principal limitations.

The first of these relates to the small number of respondents who took part in the study. It seemed that the imagined audience, perhaps a reviewer or reader, was one inclined to hold the tenets of quantitative research, and certainly one to whom it was important to signal recognition that small samples were likely to be problematic.

Very occasionally, the articulation of the small sample size as a limitation was explicitly aligned against an espoused positivist framework and quantitative research. For example: This study has some limitations. Firstly, the incidents sample represents a small number of the total number of serious incidents that occur every year. Our lack of epidemiological knowledge about healthcare incidents, however, means that determining an appropriate sample size continues to be difficult.

Our numbers are small because negotiating access to social networks was slow and labour intensive, but our methods generated exceptionally rich data. This study could be criticised for using a small and unrepresentative sample. Given that older adults have been ignored in the research concerning suntanning, fair-skinned older adults are the most likely to experience skin cancer, and women privilege appearance over health when it comes to sunbathing practices, our study offers depth and richness of data in a demographic group much in need of research attention.

Only four articles expressed some degree of confidence that their achieved sample size was sufficient. For example, SHI, in line with the justification of thematic saturation that it offered, expressed trust in its sample size sufficiency despite the poor response rate.

Similarly, BJHP04, which did not provide a sample size justification, argued that it targeted a larger sample size in order to eventually recruit a sufficient number of interviewees, due to anticipated low response rate.

Twenty-three people with type I diabetes from the target population took part. The relatively low response rate was anticipated, due to the busy lifestyles of young people in the age range, the geographical constraints, and the time required to participate in a semi-structured interview, so a larger target sample allowed a sufficient number of participants to be recruited.
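The planning arithmetic behind such over-recruitment is straightforward: if a study needs a given number of completed interviews and expects only a fraction of invitees to take part, the invitation pool is the target divided by the expected response rate, rounded up. The sketch below illustrates this; the figures in the example are invented and are not taken from the study quoted above.

```python
import math

def invitations_needed(target_interviews: int, expected_response_rate: float) -> int:
    """How many people to invite to end up with roughly `target_interviews`
    completed interviews, given an expected response rate between 0 and 1."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    return math.ceil(target_interviews / expected_response_rate)

# Hypothetical planning figures, for illustration only:
# aiming for 20 interviews and expecting about a 25% response rate.
print(invitations_needed(20, 0.25))  # -> 80
```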

Nevertheless, claims of sample size sufficiency were sometimes undermined when they were juxtaposed with an acknowledgement that a larger sample size would be more scientifically productive. Although our sample size was sufficient for this exploratory study, a more diverse sample including participants with lower socioeconomic status and more ethnic variation would be informative.

A larger sample could also ensure inclusion of a more representative range of apps operating on a wider range of platforms. The type of generalisation aspired to (BJHP48) was not, however, further specified. One article, for example, stated: This study used rich data provided by a relatively large sample of expert informants on an important but under-researched topic.

This study had a large diverse sample, recruited through a range of locations and used in-depth interviews which enhance the richness and generalizability of the results. For example, BJHP32 below provides a rationale for how an IPA study can accommodate a large sample size and how this was indeed suitable for the purposes of the particular research. To strengthen the explanation for choosing a non-normative sample size, previous IPA research citing a similar sample size approach is used as a precedent.

Small-scale IPA studies allow in-depth analysis which would not be possible with larger samples (Smith et al.). Although IPA generally involves intense scrutiny of a small number of transcripts, it was decided to recruit a larger, diverse sample, as this is the first qualitative study of this population in the United Kingdom as far as we know and we wanted to gain an overview.

However, the emphasis changes from an in-depth individualistic analysis to one in which common themes from shared experiences of a group of people can be elicited and used to understand the network of relationships between themes that emerge from the interviews. This large-scale format of IPA has been used by other researchers in the field of false-positive research.

Baillie, Smith, Hewison, and Mason conducted an IPA study, with 24 participants, of ultrasound screening for chromosomal abnormality; they found that this larger number of participants enabled them to produce a more refined and cohesive account. As shown above, the majority of articles that commented on their sample size simultaneously characterised it as small and problematic. Other features of the sample, often some kind of compositional particularity, were also linked to limited potential for generalisation.

It must be noted that samples are small and, whilst in both groups the majority of those women eligible participated, generalizability cannot be assumed. In particular, patients were only recruited from secondary care services, where COFP diagnoses are typically confirmed. The sample is therefore unlikely to represent the full spectrum of patients, particularly those who are not referred to, or who have been discharged from, dental services. Interestingly, only a minority of articles alluded to, or invoked, a type of generalisation that is aligned with qualitative research, that is, idiographic generalisation.

One article (SHI) clearly contrasted nomothetic, statistical generalisation with idiographic generalisation, arguing that the lack of statistical generalizability does not nullify the ability of qualitative research to be relevant beyond the sample studied: Further, these data do not need to be statistically generalisable for us to draw inferences that may advance medicalisation analyses (Charmaz). These data may be seen as an opportunity to generate further hypotheses and are a unique application of the medicalisation framework.

Although a small-scale qualitative study related to school counselling, this analysis can be usefully regarded as a case study of the successful utilisation of mental health-related resources by adolescents.

As many of the issues explored are of relevance to mental health stigma more generally, it may also provide insights into adult engagement in services. It shows how a sociological analysis, which uses positioning theory to examine how people negotiate, partially accept and simultaneously resist stigmatisation in relation to mental health concerns, can contribute to an elucidation of the social processes and narrative constructions which may maintain as well as bridge the mental health service gap.

Baker, S. E., & Edwards, R. (2012). How many qualitative interviews is enough? Expert voices and early career reflections on sampling and cases in qualitative research. National Centre for Research Methods Review Paper. Guest, G., Bunce, A., & Johnson, L. (2006). How many interviews are enough? An experiment with data saturation and variability. Field Methods, 18(1), 59-82. Mason, M. (2010). Sample size and saturation in PhD studies using qualitative interviews. Forum Qualitative Sozialforschung, 11(3).

My 2 cents… I agree with Greg. Keep an eye out for a blog post discussing how the level of sample homogeneity (and other factors) might affect thematic saturation, and therefore sample size! Very interesting. Very useful. If I understood the conclusions correctly, it does go back to traditional rule-of-thumb approaches, i.e., sample size targets set in advance.

The catch, of course, is that one should be carefully reviewing scripts as data are collected. I like the idea that it confirms one may make advance decisions on set targets. Thanks, Jane!

Wonderful overview of the literature, thank you so much for this! However, I always take these numbers with a pinch of salt; saturation will obviously vary greatly with topic area and research question. Yes, absolutely, Daniel! Thanks for your comment and link. Stay tuned for the next post in this series, which will address some of the factors that affect saturation, to help identify whether a small or large pinch of salt should go into saturation-based qualitative sample size calculations!

Thanks for this post — really useful. One question I have is, did any of these studies consider the interviewing skills of the interviewer? Or mention what training the interviewers had in advance of doing the semi-structured interviews?

Hi Mia, great questions! As an aside from the sampling discussion, interviewer training really is key to generating good qualitative data — and not just training in terms of interviewing skill, but also in making sure that everyone has a common understanding of the research objectives.

Some people have a more natural affinity for interviewing than others, but providing your interviewers with fairly immediate feedback helps. May I ask: if you were conducting surveys via email, what would be an acceptable number of completed surveys to aim for?

Thanks for this illuminating post. I have now collected data from 2 different states: in the first I conducted 13 interviews and 3 FGDs, and in the second 8 interviews and 1 FGD. When I started collecting data in the second state, I reached data saturation much more quickly, and the FGDs were not yielding much data that differed from the first state.

I am now wondering if I have taken the right approach. Were the recommended numbers of interviews and FGDs meant per round of data collection? Hi Abisola, yes, it sounds like you interpreted the recommendations correctly: those sample sizes are per sub-population of interest. In your case, I would have considered the two states as two sub-populations, as you did. Emily and Greg, this is brilliant!

Just what I needed today, and described in such a simple and fun way.


