ScotStat Board 8th Meeting Paper 7/05

SCOTSTAT BOARD (ScotStat 7/05)

Update on Disclosure Control for ScotStat Board, August 2005

Summary

  • The Office of the Chief Statistician (OCS) have carried out a review of Scottish Neighbourhood Statistics (SNS) data providers' current disclosure control procedures and requirements.
  • Tau-Argus has been chosen as the most suitable software for post tabulation disclosure control.
  • OCS are working with the Office for National Statistics (ONS), other UK government departments and European colleagues on improving Tau-Argus and on creating methodology and software to help data providers assess and minimise the risk of disclosure.
  • A disclosure control review of all National Statistics produced by the Scottish Executive is planned to start at the end of 2005.

Introduction

In the last decade there has been a massive increase in the electronic storage of data and wider access to data on the web, including data for small geographical areas. Computing expertise and access to large IT processing power have also increased. This means data publishers need to take increased precautions so that released micro-data and tabulations do not reveal any identifiable information about a unit, be it an individual, household or business.

The National Statistics Protocol on Data Access and Confidentiality states that the Confidentiality Guarantee is met when: "It would take a disproportionate amount of time, effort and expertise for an intruder to identify a statistical unit to others, or to reveal information about that unit not already in the public domain." Disclosure control is therefore about risk management, and the methods used should aim to minimise any reduction in data access, utility and quality.

Current Disclosure Control Work

OCS are working to build a set of methods, tools and processes which help identify disclosure risk and as necessary reduce it to an acceptably low level.

There are two distinct groups of disclosure control methods.

  • Pre-tabulation methods including perturbation of the data (data swapping), recoding, sampling and addition of noise.
  • Post-tabulation methods including table redesign which can involve a change in the grouping of the data, setting of thresholds, cell suppression, and rounding.
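Three of the post-tabulation methods listed above (setting a threshold, cell suppression and table redesign by regrouping) can be illustrated with a minimal Python sketch. The threshold of 3, the age bands, the counts and the function names are all assumptions made for illustration; none of them comes from this paper or from any particular software package. Whether zero cells count as unsafe is a policy choice, and here they are treated as safe.

```python
from collections import Counter

# Hypothetical minimum cell count; the paper does not state a value.
THRESHOLD = 3


def unsafe_cells(counts, threshold=THRESHOLD):
    """Threshold rule: list non-zero cells whose count falls below the threshold."""
    return sorted(cell for cell, n in counts.items() if 0 < n < threshold)


def suppress(counts, threshold=THRESHOLD):
    """Cell suppression: replace each unsafe count with None (printed as a blank)."""
    return {cell: (None if 0 < n < threshold else n) for cell, n in counts.items()}


def regroup(counts, mapping):
    """Table redesign: merge categories into broader groups and re-sum the counts."""
    merged = Counter()
    for cell, n in counts.items():
        merged[mapping.get(cell, cell)] += n
    return dict(merged)


# Made-up counts for one small area, by age band.
counts = {"16-24": 2, "25-34": 14, "35-44": 11, "45-54": 9, "55-64": 1}
wider_bands = {"16-24": "16-34", "25-34": "16-34", "45-54": "45-64", "55-64": "45-64"}

print(unsafe_cells(counts))       # ['16-24', '55-64']
print(suppress(counts))
redesigned = regroup(counts, wider_bands)
print(redesigned)                 # {'16-34': 16, '35-44': 11, '45-64': 10}
print(unsafe_cells(redesigned))   # []
```

In this made-up example, regrouping into wider age bands removes both unsafe cells without suppressing anything, at the cost of less detail in the published table.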

Existing practices and requirements for disclosure control of small area data actually or potentially published on the SNS website were assessed first, and software which could assist data publishers in maintaining confidentiality was examined.

Disclosure Control Software

There are two Argus software programs, developed on a European collaborative basis, which help the user apply disclosure control methods: Tau-Argus, which accepts micro-data or tabular data and applies post-tabulation disclosure control, and Mu-Argus, which applies pre-tabulation disclosure control methods to micro-data. Tau-Argus has been developed more intensively and, although still under development, has been chosen as the package which best fits the requirements of the Scottish Executive (SE) for post-tabulation disclosure control; it is discussed in more detail here. There are some reservations about the capabilities of Mu-Argus, and effective solutions for pre-tabulation disclosure control are still being sought.

Tau-Argus aids the assessment of cells requiring disclosure control using threshold, dominance and probability rules. The software user can then apply control methods such as recoding and controlled rounding. The latter has the advantage over simple rounding of being more difficult to unpick and of producing totals that are additive. If suppression is chosen, several rules may be used at the same time to decide which cells require initial, or primary, suppression. As marginal totals are generally published along with the cell values, it is necessary to suppress further cells, called secondary cells, so that the original cell values cannot be calculated by subtraction. The person applying the disclosure control can choose from several methods of cell suppression which minimise information loss.
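The dominance rule and the need for secondary suppression described above can be sketched in a few lines of Python. This is an illustration only, assuming the common (n, k) form of the dominance rule, under which a cell is unsafe when its n largest contributors account for more than k per cent of the cell total. The parameter values (n = 1, k = 75), the figures and the function names are hypothetical and do not represent the Tau-Argus interface.

```python
def dominance_unsafe(contributions, n=1, k=75.0):
    """(n, k) dominance rule: unsafe if the n largest contributors
    make up more than k per cent of the cell total."""
    total = sum(contributions)
    if total <= 0:
        return False
    top = sum(sorted(contributions, reverse=True)[:n])
    return 100.0 * top / total > k


def recover_by_subtraction(row, published_total):
    """If exactly one cell in a row is suppressed (None), an intruder can
    recover it from the published total. Returns the recovered value,
    or None when the row cannot be unpicked this way."""
    hidden = [value for value in row if value is None]
    if len(hidden) != 1:
        return None
    return published_total - sum(value for value in row if value is not None)


# Made-up turnover figures for the businesses contributing to two cells.
print(dominance_unsafe([900, 60, 40]))    # True: one business holds 90% of the total
print(dominance_unsafe([400, 350, 250]))  # False: the largest holds 40%

# One primary suppression with a published row total of 21.
print(recover_by_subtraction([12, None, 7], 21))     # 2: the hidden value is exposed
# After a secondary suppression the row can no longer be unpicked by subtraction.
print(recover_by_subtraction([12, None, None], 21))  # None
```

The second pair of calls shows why secondary cells are needed: with a single suppressed cell and a published total, the hidden value follows directly by subtraction. In a real table the column totals must be protected in the same way, which is why the choice of secondary cells is treated as an optimisation problem.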

Practical demonstrations using the current version of Tau-Argus and assistance in the use of the program are being given to SNS data providers and other interested government bodies throughout 2005. SE continues to test and contribute to the enhancement of the software. Desktop instructions, including worked examples using Scottish data, are being developed. It is intended that these instructions will be made freely available. Currently the Tau-Argus program can only be used on a standalone machine: it has not been assessed for compatibility with the SE computer operating system, Scots 3, because improved versions are still being released in quick succession. It is envisaged that the improvements will plateau, and that the package will then be assessed for Scots 3 compatibility in the latter half of 2006 and subsequently implemented across the Statistics group.

The Tau-Argus software is free of charge, but a linear programming add-on is required to use the more sophisticated parts of the program, including controlled rounding. The recommended linear programming package is Xpress, which currently costs £700 per user. More information about Tau-Argus can be found at the CASC website: http://neon.vb.cbs.nl/casc/default.htm
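To show why controlled rounding is treated as an optimisation problem, the sketch below compares simple rounding with a brute-force controlled rounding of a single row of counts. It is an assumption-laden illustration, not the method used by Tau-Argus: the rounding base of 5, the counts and the function names are hypothetical, and exhaustive search is only feasible for very small tables, which is where a linear programming solver comes in for realistic ones.

```python
from itertools import product

BASE = 5  # hypothetical rounding base


def simple_round(x, base=BASE):
    """Conventional rounding of a non-negative integer to the nearest multiple of base."""
    return base * ((x + base // 2) // base)


def adjacent_multiples(x, base=BASE):
    """The multiples of base on either side of x (just x if it is already a multiple)."""
    lower = base * (x // base)
    return (lower,) if lower == x else (lower, lower + base)


def controlled_round_row(cells, base=BASE):
    """Brute-force controlled rounding of one row: every cell moves to an adjacent
    multiple of base, the rounded cells sum to an adjacent multiple of the true
    total, and the overall distance from the true counts is as small as possible."""
    total_options = adjacent_multiples(sum(cells), base)
    best = None
    for candidate in product(*(adjacent_multiples(c, base) for c in cells)):
        if sum(candidate) in total_options:
            distance = sum(abs(r - c) for r, c in zip(candidate, cells))
            if best is None or distance < best[0]:
                best = (distance, candidate)
    return list(best[1]), sum(best[1])


cells = [2, 2, 2, 2]  # made-up counts; true total is 8

# Simple rounding: cells and total are rounded independently and no longer add up.
print([simple_round(c) for c in cells], simple_round(sum(cells)))  # [0, 0, 0, 0] 10

# Controlled rounding: the published total is the sum of the published cells.
rounded, total = controlled_round_row(cells)
print(rounded, total)
```

With simple rounding the published cells sum to 0 while the published total is 10, which both looks inconsistent and gives an intruder information to work with. The controlled version keeps the table additive.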

Disclosure Control Review

As part of the Scottish Executive compliance statement for the National Statistics Data Access and Confidentiality Protocol, SE have agreed to carry out a formal disclosure control review of all Scottish Executive National Statistics, starting in Winter 2005 and co-ordinated by OCS. This is also part of a strategy aimed at developing standards and guidance for disclosure control within the SE. Data providers will be asked to complete a questionnaire detailing their current disclosure control arrangements, and some more in-depth reviews will then be conducted face-to-face for specific business areas. The data published on the SNS website will additionally be examined by OCS, taking into account all the data published at each geographical level. Once the review process is complete, disclosure control guidance will be updated accordingly and disseminated to all Scottish National Statistics data providers and other interested parties. An update will be given to the Scotstat board by the middle of 2006. As GRO and ISD have separate compliance statements, they will not be included in this review process.

Future Work

More work is required to give data providers practical help in assessing the information loss associated with their disclosure control methods. Realistic disclosure control methods for micro-data continue to be examined.

Conclusion

The current disclosure control work should support data providers in publishing more information, especially at small area level, whilst preserving confidentiality. This will assist in maintaining the credibility of Scottish statistics. Comments from Scotstat members are welcomed.

This work is being developed by OCS Branch 3. For further information contact Dette Cowden on 0141 242 5986 or e-mail Dette.Cowden@scotland.gsi.gov.uk.

Page updated: Monday, January 23, 2006