Part 2: Methodology
4 Sampling
4.1 Sampling frame
The sample was extracted by ISD Scotland from their August extract from the Community Health Index (CHI) register of patients currently registered with a Scottish GP. People born after 20th October 1991 were excluded. Patients registered with 2c special practices were also excluded. (These 11 practices are very small board-run practices that specialise in particular patient groups, e.g. practices for homeless people.)
4.2 Sample size calculation
Sampling was done within GP practice lists, to aim for sufficient responses to achieve a reasonably reliable result for each practice. The reliability of the result depends on the number of questionnaires returned, and also the variability of the responses.
The sample size that was calculated for each practice was based on the minimum number of responses that would be required to achieve an estimate of a percentage that has a 95% confidence interval with width +/- 7 percentage points, sampled from a finite population. This level of acceptable uncertainty was agreed with stakeholders.
The formula for the minimum sample required (M) is

$$M = \frac{B}{1 + \frac{B - 1}{N}}$$

Where:
N is the number of patients in a practice on the sampling frame
$B = z^2 p(1-p)/c^2$
p = the proportion answering in a certain way (assume 0.5 as this gives maximum variability)
z is 1.96 for a 95% confidence interval (from standard normal distribution)
c is the maximum acceptable half-width of the confidence interval, in this case 0.07 (+/- 7 percentage points).
Based on the above, B = 196.
The table below shows examples of this minimum number of responses required (M) based on the assumptions above for some example practice population sizes.
| Practice population (N) | 200 | 500 | 1,000 | 2,000 | 5,000 | 10,000 | 20,000 |
|---|---|---|---|---|---|---|---|
| Minimum sample required (M) | 99 | 141 | 164 | 179 | 189 | 192 | 194 |
| Percentage of population | 50% | 28% | 16% | 9% | 4% | 2% | 1% |
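The table values can be checked with a short calculation. The sketch below assumes the finite population form M = B / (1 + (B - 1)/N), with B computed from z, p and c as defined above, and assumes rounding to the nearest whole number, since that is what reproduces the published figures. The function name `min_responses` is illustrative only.

```python
def min_responses(n, p=0.5, z=1.96, c=0.07):
    """Minimum responses for a +/- c interval from a practice list of size n.

    Assumed form: M = B / (1 + (B - 1) / n), where B = z^2 * p * (1 - p) / c^2.
    """
    b = z ** 2 * p * (1 - p) / c ** 2
    return b / (1 + (b - 1) / n)


populations = [200, 500, 1000, 2000, 5000, 10000, 20000]

# Rounding to the nearest integer is an assumption that matches the table.
table = {n: round(min_responses(n)) for n in populations}

for n, m in table.items():
    print(f"N = {n:>6}: M = {m}")
```

As N grows, M approaches B (196), which is why very large practices need barely more responses than medium-sized ones.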
In practice, if the underlying proportion is actually higher, or lower, than 0.5, then these numbers of responses would give narrower confidence intervals (or fewer responses would be required for the same accuracy).
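To illustrate this point, the sketch below computes the half-width of a 95% confidence interval for a given number of responses, using the standard normal approximation with a finite population correction. This is an assumed form, chosen because it is the inverse of the sample size calculation described above; the function name `ci_half_width` and the example practice of 1,000 patients are illustrative.

```python
import math


def ci_half_width(p, m, n, z=1.96):
    """Half-width of the confidence interval for proportion p.

    m is the number of responses and n the practice list size.
    (n - m) / (n - 1) is the finite population correction.
    """
    fpc = (n - m) / (n - 1)
    return z * math.sqrt(p * (1 - p) / m * fpc)


# A practice of 1,000 patients with the minimum 164 responses:
worst_case = ci_half_width(0.5, 164, 1000)  # close to the 0.07 target
skewed = ci_half_width(0.8, 164, 1000)      # narrower, roughly 0.056

print(f"p = 0.5: +/- {worst_case:.3f}")
print(f"p = 0.8: +/- {skewed:.3f}")
```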
The above minimum sample required (M) was increased to account for 2 factors:
- Recipients of the questionnaire may not have contacted their GP practice in the last year. It was assumed that 80% of registered patients would have contacted their GP in the last year. This is approximately the figure that arises from the ISD Practice Team Information (www.isdscotland.org/pti), and was also found in the survey of what is important to GP patients carried out for the Better Together programme (http://www.scotland.gov.uk/Publications/2008/11/24110311/0).
- Allowance for people not returning the questionnaire. A response rate of 44% for most practices was assumed, based on a similar survey previously carried out in England. However, there is evidence that people in more deprived areas are less likely to respond to surveys, so the sample size was increased more for practices with more than half of their patients living in Scotland's most deprived areas. A patient was counted as living in a deprived area if they resided in one of the 15% most deprived data zones as defined by the SIMD 2006. For these practices, a response rate of 30% (instead of 44%) was assumed.
Therefore the sample size (S) for each practice was calculated from

$$S = \frac{M}{0.8 \times R}$$

where R is 0.3 for practices defined as deprived, and 0.44 for all others. This means that an effective response rate of 35% (0.8 × 0.44) was assumed for the majority of practices, with a rate of 24% (0.8 × 0.3) assumed for deprived practices.
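The adjustment can be sketched as follows, assuming S = M / (0.8 × R), which is consistent with the effective response rates of 35% and 24% quoted above. Rounding up to a whole number of questionnaires is an assumption, as the rounding rule is not stated, and the function name `questionnaires_to_send` is illustrative.

```python
import math

CONTACT_RATE = 0.8  # assumed share of patients who contacted their GP in the last year


def questionnaires_to_send(m, deprived=False):
    """Inflate the minimum number of responses m to a mail-out size S.

    R is 0.30 for practices defined as deprived and 0.44 otherwise.
    Rounding up is an assumption; the source does not state the rule.
    """
    response_rate = 0.30 if deprived else 0.44
    return math.ceil(m / (CONTACT_RATE * response_rate))


# Example: a practice of 1,000 patients needs M = 164 responses.
print(questionnaires_to_send(164))                 # non-deprived practice
print(questionnaires_to_send(164, deprived=True))  # deprived practice
```

For small practices the result can exceed the number of patients on the list; how such cases were handled is not described here, so the sketch leaves S uncapped.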
4.3 Patient Selection
A random sample of the required size (S) from each practice was taken from the sampling frame by ISD Scotland, using the R function "sample".
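For readers unfamiliar with R, the selection step is a simple random sample without replacement. The sketch below shows an analogous draw in Python using `random.sample`; it is not the code used by ISD Scotland. The patient identifiers, the function name `draw_practice_sample` and the cap at the list size are all assumptions made for illustration.

```python
import random


def draw_practice_sample(patient_ids, s, seed=None):
    """Draw a simple random sample of up to s patients without replacement.

    Capping s at the list size is an assumption for small practices.
    """
    rng = random.Random(seed)
    k = min(s, len(patient_ids))
    return rng.sample(patient_ids, k)


# Hypothetical practice list of 20 patient identifiers.
practice_list = [f"patient-{i:03d}" for i in range(1, 21)]
selected = draw_practice_sample(practice_list, 5, seed=1)
print(selected)
```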
The R software is free software and we acknowledge the following source:
R Development Core Team (2007). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.
Further references for this methodology are:
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Ripley, B. D. (1987) Stochastic Simulation. Wiley.