Scotland's People: Results from the 2001/2002 Scottish Household Survey (Volume 8: Technical Report)

4.4 Achieved sample profile and weighting issues

Two types of weighting are potentially necessary with a random probability sample of this kind. The first type is intrinsic to the survey design and is necessary to adjust for unequal probabilities of selection for individuals, households or other units of analysis. The second type is extrinsic to the survey design but may be necessary to counteract the effects of non-response bias.

Weighting for analysis based on household data

The weights for analysis of household data have two main elements. Firstly, it is necessary to 'weight up' those local authorities which were under-sampled and 'weight down' those which were over-sampled (this is a weight of the first type mentioned above, which adjusts for unequal probabilities of selection). Secondly, the weight addresses any disproportionality introduced by differential response rates by local authority within quarters. The combination of these two elements is shown in Table 4-5. (The weights for some local authorities vary between one quarter and the next because the number of achieved interviews fluctuates between quarters - see Section 4.1). The final sample profile across the two years should, therefore, correctly reflect the distribution of households across Scotland's local authority areas, as estimated by the Scottish Executive 5.

Weights are calculated for each local authority each quarter and based on the quarter in which the interviews were undertaken (regardless of when the address was sampled). This reflects the need for the data to be nationally representative in each quarter and should allow any published findings to be reproduced by selecting the relevant quarter's data. In practice, however, it may not be possible to reproduce exactly some of the results from earlier publications if the data for that quarter were subsequently changed (e.g. to correct errors that were identified later).
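
As a rough illustration of how the two elements can combine into one factor per local authority and quarter, the sketch below divides each area's share of all households by its share of the interviews achieved in that quarter. This ratio form is an assumption, since the report does not spell out the calculation, and the area names and counts are hypothetical rather than SHS figures.

```python
def household_weights(households, interviews):
    """Return one weight per area: population share / achieved-sample share.

    Assumed form of the calculation, not taken from the report.
    """
    total_households = sum(households.values())
    total_interviews = sum(interviews.values())
    weights = {}
    for area, n_households in households.items():
        population_share = n_households / total_households
        sample_share = interviews[area] / total_interviews
        weights[area] = population_share / sample_share
    return weights


# Hypothetical household estimates and achieved interviews for one quarter.
households = {"Large city": 270_000, "Mid-sized area": 60_000, "Island area": 9_000}
interviews = {"Large city": 450, "Mid-sized area": 150, "Island area": 80}

weights = household_weights(households, interviews)

# Sum of the weights over all achieved interviews.
weighted_total = sum(weights[area] * interviews[area] for area in interviews)
```

Because each factor is a ratio of shares, the weighted and unweighted sample sizes are equal within the quarter, and an area that is over-sampled relative to its number of households receives a factor below one.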

Table 4-5 Weights to account for 'under'/'over' sampling and differences in response rates by local authority by quarter: SHS 2001/2002

| Local authority | 2001 Q1 | 2001 Q2 | 2001 Q3 | 2001 Q4 | 2002 Q1 | 2002 Q2 | 2002 Q3 | 2002 Q4 |
|---|---|---|---|---|---|---|---|---|
| Aberdeen City | 1.171 | 1.094 | 1.001 | 1.039 | 1.098 | 0.934 | 0.981 | 1.402 |
| Aberdeenshire | 1.216 | 1.042 | 1.023 | 0.982 | 1.106 | 0.924 | 0.758 | 1.175 |
| Angus | 0.975 | 1.099 | 1.068 | 0.988 | 0.844 | 0.817 | 1.087 | 1.066 |
| Argyll and Bute | 1.017 | 0.871 | 0.956 | 0.747 | 0.820 | 0.899 | 1.075 | 1.274 |
| Clackmannanshire | 0.544 | 0.461 | 0.522 | 0.550 | 0.481 | 0.654 | 0.503 | 0.514 |
| Dumfries and Galloway | 0.946 | 1.313 | 1.120 | 1.004 | 0.975 | 0.938 | 0.993 | 1.455 |
| Dundee City | 1.102 | 1.093 | 1.063 | 1.261 | 1.085 | 1.105 | 0.943 | 1.135 |
| East Ayrshire | 0.943 | 0.790 | 0.907 | 1.340 | 1.078 | 1.035 | 0.873 | 1.290 |
| East Dunbartonshire | 1.123 | 1.124 | 1.037 | 0.677 | 0.824 | 0.834 | 0.923 | 1.390 |
| East Lothian | 0.914 | 0.934 | 0.853 | 0.937 | 0.836 | 0.843 | 0.786 | 0.920 |
| East Renfrewshire | 1.062 | 0.898 | 0.864 | 0.706 | 1.009 | 1.047 | 1.012 | 0.777 |
| Edinburgh City | 1.123 | 1.085 | 1.293 | 1.158 | 1.311 | 1.165 | 1.174 | 1.369 |
| Eilean Siar | 0.285 | 0.223 | 0.232 | 0.233 | 0.242 | 0.536 | 0.261 | 0.202 |
| Falkirk | 1.120 | 1.225 | 1.072 | 0.963 | 1.325 | 1.030 | 0.948 | 1.009 |
| Fife | 1.217 | 1.105 | 1.144 | 1.138 | 1.056 | 1.009 | 1.148 | 0.859 |
| Glasgow City | 1.087 | 1.161 | 1.202 | 1.228 | 1.310 | 1.336 | 1.158 | 1.041 |
| Highland | 1.012 | 1.130 | 1.011 | 1.016 | 0.926 | 0.876 | 1.313 | 0.989 |
| Inverclyde | 0.957 | 0.958 | 0.893 | 0.925 | 1.134 | 1.056 | 1.056 | 1.155 |
| Midlothian | 0.736 | 0.651 | 0.656 | 0.864 | 0.729 | 0.608 | 0.616 | 0.887 |
| Moray | 0.794 | 0.807 | 0.782 | 0.869 | 0.686 | 0.779 | 1.035 | 0.798 |
| North Ayrshire | 1.084 | 0.938 | 1.280 | 0.904 | 1.211 | 1.088 | 0.986 | 1.194 |
| North Lanarkshire | 1.120 | 1.051 | 1.139 | 1.143 | 1.133 | 1.091 | 1.175 | 1.070 |
| Orkney | 0.195 | 0.248 | 0.175 | 0.216 | 0.171 | 0.160 | 0.165 | 0.135 |
| Perth and Kinross | 1.045 | 0.970 | 1.241 | 1.254 | 0.955 | 1.178 | 0.835 | 1.169 |
| Renfrewshire | 1.162 | 1.195 | 1.320 | 0.876 | 1.063 | 1.454 | 1.228 | 0.868 |
| Scottish Borders | 1.193 | 1.121 | 0.904 | 0.832 | 0.934 | 0.782 | 0.947 | 1.411 |
| Shetland | 0.189 | 0.283 | 0.194 | 0.221 | 0.188 | 0.152 | 0.206 | 0.221 |
| South Ayrshire | 0.997 | 1.057 | 0.901 | 1.085 | 0.954 | 0.891 | 0.978 | 1.098 |
| South Lanarkshire | 0.937 | 1.019 | 1.198 | 1.259 | 1.000 | 1.141 | 1.285 | 0.978 |
| Stirling | 1.007 | 0.921 | 0.588 | 0.805 | 0.992 | 0.698 | 0.515 | 1.018 |
| West Dunbartonshire | 0.956 | 1.062 | 0.870 | 1.414 | 1.898 | 1.072 | 1.193 | 0.727 |
| West Lothian | 0.983 | 1.115 | 1.044 | 1.274 | 1.056 | 1.164 | 1.312 | 1.019 |

No other weight is applied across all cases in order to compensate/adjust for the unequal probabilities of selection. Strictly speaking, however, a corrective weight should be applied in those cases in which the Multiple Occupancy Indicator (MOI) on the PAF is found to be inaccurate. The reason for this is that a property-type bias might otherwise be introduced. For example, if tenement properties were consistently found to contain multiple dwellings when the MOI used in the selection of addresses for the sample had indicated that they contained just one, each achieved interview at such an address should be given a weight proportional to the actual number of dwellings, to compensate for the reduced probability of selection for each dwelling at that address. All properties within that local authority area should then be weighted back down slightly in order that the actual and weighted sample sizes remain the same.
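
The corrective weight described above was not applied in practice, but its logic can be sketched as follows. The example assumes that the MOI indicated a single dwelling at every sampled address, so the raw weight is simply the number of dwellings actually found; the function name and the ten-interview local authority are hypothetical.

```python
def moi_corrected_weights(dwellings_found):
    """Weight each interview by the dwellings found at its address, then
    rescale so that the weights sum to the number of interviews."""
    raw = [float(n) for n in dwellings_found]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical local authority: ten interviews, one of them at an address
# listed as a single dwelling that actually contained three.
dwellings_found = [1, 1, 1, 1, 1, 1, 1, 1, 1, 3]
weights = moi_corrected_weights(dwellings_found)
```

The interview at the three-dwelling address ends up with three times the weight of the others, while the rescaling pulls every weight down slightly so that the actual and weighted sample sizes stay equal.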

In practice, the MOI was found to be inaccurate in 2.6% of cases. The impact of weighting to correct for these would have been negligible so it was decided not to weight by the MOI in order to avoid additional complexity in the weighting scheme for the survey. This issue is reviewed on an annual basis.

Similarly, in theory an additional weight should be applied in cases where a dwelling contains more than one household, only one of which is interviewed, in order to adjust for the lower probability of selection for each of the households in that dwelling. In practice, however, as only a very small number of dwellings were found to contain more than one household, the use of such a weight would have made very little difference to the overall results, and it was therefore felt that it was not worthwhile introducing further complication to the weighting calculations.

Weighting for analysis based on individual (random adult) data

Using the Postcode Address File produces a sample of households, so for analysis of individual level data it is also necessary to weight the responses of the 'random adult' by the number of adults resident in the household who were eligible for interview 6. The reason for this is that individuals living in larger households have a lower probability of selection.

As a result of this, one would expect the unweighted profile of 'random adult' respondents to be skewed towards those sections of the population most likely to live in households with fewer adults (older people and older females in particular) and away from those likely to live in households with larger numbers of adults (younger people). Once the data are weighted by the number of eligible adults in the household, however, one should see the profile correct itself significantly. In most surveys of this kind, however, some under-representation of younger people and males, and over-representation of older people and females, is likely to remain because of the effects of non-response bias. Depending on the extent of the remaining skew, it may be necessary to adopt further corrective measures - an issue considered in detail in the following section.
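
A small sketch may help to show why weighting by the number of eligible adults shifts the profile towards younger people. The ten respondents and their household sizes below are hypothetical, chosen so that younger respondents tend to live with more adults; the function name is likewise an illustration only.

```python
from collections import defaultdict


def age_profile(respondents, weighted):
    """Percentage of respondents in each age group, optionally weighted by
    the number of eligible adults in the respondent's household."""
    totals = defaultdict(float)
    for age_group, adults_in_household in respondents:
        totals[age_group] += adults_in_household if weighted else 1
    grand_total = sum(totals.values())
    return {group: 100 * value / grand_total for group, value in totals.items()}


# Hypothetical random adults: (age group, eligible adults in household).
respondents = [
    ("16-24", 3), ("16-24", 4),
    ("25-59", 2), ("25-59", 2), ("25-59", 1), ("25-59", 3),
    ("60+", 1), ("60+", 1), ("60+", 2), ("60+", 1),
]

unweighted = age_profile(respondents, weighted=False)
weighted = age_profile(respondents, weighted=True)
```

In this made-up sample the share of 16-24 year olds rises once the weight is applied and the share of those aged 60 and over falls, which is the direction of correction the text describes.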

Analysis of data based on the 'random adult' also requires a corrective weight to take account of differences between the number of such interviews completed in each local authority area and the actual adult population of such areas. Like the element of the household data weight which adjusts for differences in fieldwork outcomes by local authority, this is intended not to compensate for unequal probabilities of selection but to ensure that the final profile of 'individual' data correctly reflects the relative populations of the different local authority areas once variations in fieldwork outcomes have been assessed. This is not identical to the weight described for analysis of household data, since variation in response rates for the second part of the interview may have produced a slightly different distribution from that of 'highest income householder' interviews. The weights required for each local authority (which are then multiplied by the number of adults in the household to create the weight for each case) are summarised below.

Table 4-6 Average weighting factors to adjust for under and over-sampling and differences in random adult response rates by local authority

| Local authority | 2001 Q1 | 2001 Q2 | 2001 Q3 | 2001 Q4 | 2002 Q1 | 2002 Q2 | 2002 Q3 | 2002 Q4 |
|---|---|---|---|---|---|---|---|---|
| Aberdeen City | 1.100 | 1.033 | 0.953 | 0.979 | 1.021 | 0.835 | 0.940 | 1.365 |
| Aberdeenshire | 1.122 | 0.998 | 0.943 | 0.916 | 1.038 | 0.850 | 0.697 | 1.112 |
| Angus | 0.922 | 1.025 | 0.987 | 0.927 | 0.828 | 0.751 | 1.007 | 0.988 |
| Argyll and Bute | 0.970 | 0.796 | 0.877 | 0.726 | 0.763 | 0.843 | 1.019 | 1.158 |
| Clackmannanshire | 0.493 | 0.449 | 0.483 | 0.506 | 0.456 | 0.601 | 0.466 | 0.502 |
| Dumfries and Galloway | 0.904 | 1.208 | 1.057 | 0.949 | 0.926 | 0.872 | 0.899 | 1.361 |
| Dundee City | 1.086 | 1.034 | 0.980 | 1.171 | 1.025 | 1.018 | 0.867 | 1.028 |
| East Ayrshire | 0.878 | 0.747 | 0.840 | 1.289 | 1.021 | 0.912 | 0.836 | 1.235 |
| East Dunbartonshire | 1.057 | 1.055 | 0.955 | 0.654 | 0.761 | 0.836 | 0.850 | 1.221 |
| East Lothian | 0.875 | 0.903 | 0.793 | 0.874 | 0.784 | 0.763 | 0.755 | 0.838 |
| East Renfrewshire | 1.025 | 0.850 | 0.794 | 0.682 | 0.936 | 0.966 | 0.882 | 0.749 |
| Edinburgh City | 1.058 | 1.016 | 1.213 | 1.084 | 1.249 | 1.082 | 1.101 | 1.246 |
| Eilean Siar | 0.271 | 0.210 | 0.219 | 0.214 | 0.226 | 0.494 | 0.256 | 0.178 |
| Falkirk | 1.042 | 1.164 | 1.037 | 0.887 | 1.240 | 0.965 | 0.888 | 0.949 |
| Fife | 1.126 | 1.075 | 1.069 | 1.071 | 1.012 | 0.961 | 1.039 | 0.780 |
| Glasgow City | 1.058 | 1.047 | 1.136 | 1.147 | 1.259 | 1.254 | 1.058 | 0.956 |
| Highland | 0.940 | 1.093 | 0.937 | 0.953 | 0.870 | 0.810 | 1.199 | 0.940 |
| Inverclyde | 0.898 | 0.918 | 0.833 | 0.867 | 1.071 | 0.967 | 1.013 | 1.039 |
| Midlothian | 0.739 | 0.597 | 0.622 | 0.794 | 0.682 | 0.550 | 0.572 | 0.842 |
| Moray | 0.764 | 0.753 | 0.741 | 0.813 | 0.642 | 0.692 | 0.961 | 0.775 |
| North Ayrshire | 1.020 | 0.902 | 1.216 | 0.823 | 1.144 | 1.040 | 0.833 | 1.111 |
| North Lanarkshire | 1.042 | 0.993 | 1.099 | 1.053 | 1.082 | 0.983 | 1.120 | 0.990 |
| Orkney | 0.187 | 0.233 | 0.164 | 0.199 | 0.160 | 0.144 | 0.150 | 0.129 |
| Perth and Kinross | 0.994 | 0.935 | 1.141 | 1.176 | 0.905 | 1.105 | 0.739 | 1.108 |
| Renfrewshire | 1.105 | 1.155 | 1.255 | 0.788 | 0.965 | 1.414 | 1.089 | 0.796 |
| Scottish Borders | 1.096 | 1.047 | 0.867 | 0.784 | 0.933 | 0.723 | 0.883 | 1.280 |
| Shetland | 0.177 | 0.267 | 0.182 | 0.206 | 0.174 | 0.140 | 0.191 | 0.205 |
| South Ayrshire | 0.955 | 0.981 | 0.832 | 1.017 | 0.884 | 0.815 | 0.916 | 1.039 |
| South Lanarkshire | 0.896 | 0.957 | 1.115 | 1.173 | 0.953 | 1.066 | 1.141 | 0.936 |
| Stirling | 0.968 | 0.858 | 0.552 | 0.755 | 0.959 | 0.652 | 0.486 | 0.910 |
| West Dunbartonshire | 0.901 | 1.016 | 0.834 | 1.353 | 1.749 | 1.040 | 1.110 | 0.691 |
| West Lothian | 0.918 | 1.046 | 0.973 | 1.187 | 0.968 | 1.052 | 1.239 | 0.975 |
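
The multiplication described before Table 4-6 can be sketched as follows, using three of the 2001 Q1 factors from the table. The function name and the four respondents are hypothetical, and the report does not say whether any further rescaling is applied after the multiplication, so none is shown.

```python
# Local authority factors for 2001 Q1, copied from Table 4-6.
LA_FACTOR_2001_Q1 = {"Glasgow City": 1.058, "Orkney": 0.187, "Stirling": 0.968}


def random_adult_weight(local_authority, eligible_adults):
    """Case weight: local authority factor times eligible adults in household."""
    return LA_FACTOR_2001_Q1[local_authority] * eligible_adults


# Hypothetical respondents interviewed in 2001 Q1:
# (local authority, eligible adults in household).
cases = [("Glasgow City", 1), ("Glasgow City", 3), ("Orkney", 2), ("Stirling", 2)]
case_weights = [random_adult_weight(la, adults) for la, adults in cases]
```

A respondent from a three-adult household in Glasgow City therefore carries three times the weight of a respondent living alone in the same authority in the same quarter.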

Weighting for analysis based on the 'random schoolchild'

There is one further weighting factor needed to adjust for unequal probabilities of selection, relating to the information collected about a 'random schoolchild'. For this information to represent correctly the population of schoolchildren resident within households, it should be weighted by the total number of eligible schoolchildren resident within each household. If not, it will proportionately over-represent the characteristics and experiences of 'only' children and under-represent those of children from larger families. The weight for the random schoolchild case is created by combining this weighting and the relevant local authority weight.
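
In symbols, and assuming that the two components are simply multiplied (the report says only that they are combined, and does not say which of the local authority weights is the relevant one), the weight for a random schoolchild case could be written as below. The symbols are introduced here for illustration and do not appear in the report.

```latex
\[
  w^{\text{child}}_{h} = w^{\text{LA}}_{a,q} \times c_{h}
\]
% w^{LA}_{a,q}: weight for local authority a in quarter q
% c_h:          number of eligible schoolchildren resident in household h
```

Under this form, a child selected from a household with three eligible schoolchildren counts three times as much as an 'only' child in the same local authority and quarter.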

Weighting for analysis based on the Travel Diary

Examination of the SHS data suggests that significantly fewer interviews take place on Fridays, Saturdays and Sundays than on other days of the week. As differences in the proportions of adults interviewed on each day of the week will affect the Travel Diary data's representativeness of travel patterns for the week as a whole, it was decided to introduce a weight to compensate for this. This simply 'up-weights' interviews carried out on days of the week on which fewer than one-seventh of all interviews have taken place and 'down-weights' those carried out on days on which more than one-seventh of all interviews have been completed.

It is also apparent that the distribution of interviews by the day of the week differs for certain sub-sections of the adult population. For example, disproportionately more adults in full-time employment are interviewed at the weekend (due to their greater availability then), thus yielding an inaccurate picture of the travel patterns of those in full-time employment. The Travel Diary weighting factor is therefore refined to compensate for this.
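
The report does not give the formula for the Travel Diary factor. One simple form consistent with the description is to divide one-seventh by each day's share of interviews, and to refine it by repeating the calculation within each subgroup. The sketch below assumes that form; the function names and the interview counts are hypothetical.

```python
from collections import Counter

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_of_week_weights(interview_days):
    """Weight for each day: (1/7) divided by that day's share of interviews.

    Days with no interviews are skipped to avoid dividing by zero.
    """
    counts = Counter(interview_days)
    total = len(interview_days)
    return {day: (1 / 7) / (counts[day] / total) for day in DAYS if counts[day]}


def day_weights_by_group(interviews):
    """Apply the same calculation separately within each subgroup.

    `interviews` is a list of (group, day) pairs.
    """
    by_group = {}
    for group, day in interviews:
        by_group.setdefault(group, []).append(day)
    return {group: day_of_week_weights(days) for group, days in by_group.items()}


# Hypothetical interview days, with fewer interviews from Friday to Sunday.
interview_days = (
    ["Mon"] * 20 + ["Tue"] * 20 + ["Wed"] * 20 + ["Thu"] * 20
    + ["Fri"] * 10 + ["Sat"] * 6 + ["Sun"] * 4
)
weights = day_of_week_weights(interview_days)

# Share of the weighted sample that was interviewed on a Sunday.
weighted_share_sunday = weights["Sun"] * 4 / sum(weights[d] for d in interview_days)
```

With these made-up counts, Sunday interviews are up-weighted and Monday interviews down-weighted, so that each day accounts for one-seventh of the weighted sample. The resulting factor would then be multiplied by the existing 'random adult' weight.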

The weight created for any analysis of the Travel Diary combines the above weighting factors and the existing 'random adult' weights. Further information about the Travel Diary, including a comparison to the National Travel Survey, is available in the Travel Diary User Guide 7.

4.5 Data quality and comparisons with external sources

We turn now to the issue of whether additional post-survey weighting is required to address any residual bias in the sample profile (arising, for example, from differential patterns of non-response across sections of the sampled population).

Age and sex profile of the 'random adult' sample

We saw earlier that the unweighted sample automatically under-represents those living in multi-adult households, since they have a smaller chance of selection for interview. As Table 4-7 shows, therefore, the weighting to equalise probabilities of selection has a significant effect on the profile of the 'random adult' sample. The data shown have been weighted both by the number of adults resident in the household and by the local authority weight described in the previous section. These two weights tend to act in the same direction, since those larger local authority areas which are 'weighted up' also tend to be ones with a higher average household size.

Table 4-7 Comparison of weighted and unweighted age and sex profile of 2001/2002 SHS data with Census estimates

| | Census estimates for 29 April 2001 (%) | SHS unweighted (%) | SHS weighted* (%) |
|---|---|---|---|
| Male | | (n=12,173) | (n=12,676) |
| 16 - 24 | 7.0 | 3.5 | 4.9 |
| 25 - 59 | 29.3 | 25.3 | 26.4 |
| 60 plus | 11.0 | 13.7 | 12.9 |
| Total | 47.3 | 42.5 | 44.2 |
| Female | | (n=16,511) | (n=16,009) |
| 16 - 24 | 6.9 | 4.6 | 5.8 |
| 25 - 59 | 30.7 | 32.2 | 33.3 |
| 60 plus | 15.1 | 20.8 | 16.7 |
| Total | 52.7 | 57.6 | 55.8 |
| All adults | | (n=28,684) | (n=28,685) |
| 16 - 24 | 13.9 | 8.1 | 10.7 |
| 25 - 59 | 60.1 | 57.4 | 59.7 |
| 60 plus | 26.1 | 34.5 | 29.6 |
| Total | 100.0 | 100.0 | 100.0 |

* Weighted by number of adults and local authority size

The weighted sample for 2001/2002 still does not match exactly the profile of the adult population suggested by the Census estimates with, as expected, under-representation of younger people in general and 16-24 year olds in particular. Consequently, older people are over-represented in the survey.

However, there are some reasons for being cautious about seeking to 'correct' the remaining imbalances. Firstly, the survey estimates of the age and sex profile of the sample are, like all its estimates, subject to sampling error. The 95% confidence interval for the percentage of males aged 16 to 24, for example, is likely to be in the region of 1.3% - slightly less than the difference between the SHS sample percentage and the percentage suggested by the Census. Secondly, the age/sex profile of the random adult sample is not greatly different from the profile of all adults resident within the households at which an interview was carried out. This suggests that non-response to the second part of the interview has not contributed significantly to a skewing of the 'random adult' sample. Finally, there are good reasons to think that the population that is outside the scope of the survey (student halls, nurses' homes, prisons, army barracks, etc.) will disproportionately contain adults of the age groups that appear to be under-represented. Conversely, older people will be disproportionately represented in the population in hospitals and nursing homes. Thus, although the sample might differ from the profile of the population as a whole, it might not differ greatly from the profile of the population in private households.

However, the Census does provide an opportunity to establish a more thorough weighting scheme for the survey by allowing comparison of the characteristics of responding and non-responding households. The SHS has been included in the Census-linked study of survey non-response that is being carried out by ONS. The results of this study will be used to develop further weighting for the SHS.

The following sub-sections examine this issue further through a comparison of other key household variables with information from other sources.

Household type, property type, tenure and number of bedrooms

We noted above in Table 4-7 that the SHS appears to under-represent young adults and over-represent older adults. This is also apparent when household types in the SHS are compared with the Census (Table 4-8).

Table 4-8 Comparison of household types in the 2001/2002 SHS and the 2001 Census

| Household type | 2001 Census (%) | SHS 2001/2002 * (%) |
|---|---|---|
| | (n=2,192,246) | (n=30,639) |
| Single adult | 17.9 | 15.3 |
| Small adult | 16.9 | 16.9 |
| Single parent | 5.6 | 5.9 |
| Small family | 13.3 | 14.1 |
| Large family | 7.1 | 7.0 |
| Large adult | 11.2 | 9.5 |
| Older smaller | 13.0 | 14.9 |
| Single pensioner | 15.0 | 16.3 |

* SHS data weighted by local authority size only

As Table 4-9 shows, the sample appears robust in terms of the variables associated with accommodation/property characteristics. There is a slight over-representation of outright-owners relative to the Census and under-representation of 'other' tenures. Other differences from the Census are no more than two percentage points.

Table 4-9 Comparison of key variables between the 2001 Census, 1996 SHCS and the 2001/2002 SHS

| | 2001 Census (%) | 2001/2002 SHS (%) |
|---|---|---|
| | (n=2,192,246) | (n=30,639) |
| Property type* | | |
| House or bungalow | 64 | 63 |
| Detached | 20 | 19 |
| Semi-detached | 23 | 22 |
| Terraced | 20 | 22 |
| Flat, Maisonette or Apartment | 35 | 36 |
| Other | 1 | 0 |
| Tenure* | | |
| Own outright | 23 | 26 |
| Own with mortgage | 39 | 39 |
| Rent | 34 | 35 |
| Local authority/Scottish Homes | 22 | 23 |
| Housing Association/Co-operative | 6 | 6 |
| Private rented | 7 | 6 |
| Other | 4 | 2 |
| Number of bedrooms* | 1996 SHCS (n=19,892) | 2001/2002 SHS |
| One | 15 | 14 |
| Two | 38 | 37 |
| Three | 36 | 37 |
| Four | 7 | 10 |
| Five | 2 | 2 |
| Six or more | 2 | 1 |

* SHS data weighted by local authority size only; includes households in shared dwellings

Pays part rent and mortgage (shared ownership) included in 'Own with mortgage'

Page updated: Friday, March 31, 2006