Edit and Imputation

Edit and Imputation ensures the census data is complete and consistent at record level. This process fills in the blanks where individual questions have not been answered and tidies up inconsistencies between answers.

Edit

Editing is the process of locating and flagging missing, invalid and inconsistent values in the census dataset. It consists of Hard Edits and Soft Edits.

Hard Edits: Rules which identify and flag impossible or highly implausible values or combinations of values. For example:

“A person under the age of 17 cannot drive to school or work”

While there may be genuine cases matching these conditions, the vast majority of observed cases will be due to error and we will want to remove these cases from the dataset. These edits should be documented and published for data users.

Soft Edits: Rules which identify and flag plausible but rare observations, such as outliers. For example, a person under the age of 45 being retired. These are observations which we do not want to remove from the dataset, but we also do not want the imputation process to replicate them in other records. These edits should be documented and published for data users.
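As an illustration only, the two example rules above could be expressed as simple checks that attach flags to a record rather than changing it. This is a minimal sketch, not the NRS implementation: the record layout, the field names (age, travel_method, economic_activity), the category labels and the rule names are all assumptions made for the example.

```python
from typing import List, NamedTuple, Optional


class Person(NamedTuple):
    # Hypothetical, simplified person record; field names are illustrative.
    age: Optional[int]
    travel_method: Optional[str]
    economic_activity: Optional[str]


class EditFlags(NamedTuple):
    hard: List[str]  # rules failed: value(s) to be removed and imputed
    soft: List[str]  # rules triggered: value(s) kept but marked as rare


def apply_edit_rules(person: Person) -> EditFlags:
    """Flag a record against the two example rules; the record is not modified."""
    hard: List[str] = []
    soft: List[str] = []
    if person.age is not None:
        # Hard edit: a person under 17 cannot drive to school or work.
        if person.age < 17 and person.travel_method == "driving":
            hard.append("under_17_driving")
        # Soft edit: a retired person under 45 is plausible but rare.
        if person.age < 45 and person.economic_activity == "retired":
            soft.append("retired_under_45")
    return EditFlags(hard, soft)


flags = apply_edit_rules(Person(age=15, travel_method="driving",
                                economic_activity="student"))
print(flags.hard, flags.soft)
```

Keeping flagging separate from correction mirrors the split between Edit and Imputation: the rules only mark values, and a later step decides what replaces them.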

Imputation

Imputation is the process of replacing these flagged missing, invalid and inconsistent responses with substituted values. There are two levels of imputation:

Item imputation: A variable-level adjustment of real households, household persons, or communal establishment persons following the edit of the dataset.

Item imputation may involve a limited number of deterministic changes, which decide the likely substitute value directly using other values in the dataset. However, the majority of item imputation includes an element of randomness, using a method called nearest-neighbour hot-deck imputation. This process finds similar records in the census dataset to donate values to the flagged record in order to resolve inconsistencies or replace missing or invalid values.

For example, if a person has not responded to the occupation question, we use other information from their return such as the industry they work in, the number of hours they work, and their qualifications, to find a similar record in the census dataset, from which we copy the response for occupation to fill in the blank.
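The occupation example can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not the method used in production: the matching variables (industry, hours_band, qualification), the distance measure (a count of mismatched matching variables) and the random tie-break among equally close donors are all hypothetical choices made for the example.

```python
import random
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical matching variables used to judge how similar two records are.
MATCHING_VARS = ("industry", "hours_band", "qualification")


def distance(a: Record, b: Record) -> int:
    """Count the matching variables on which two records disagree."""
    return sum(a[v] != b[v] for v in MATCHING_VARS)


def impute_occupation(records: List[Record], rng: random.Random) -> List[Record]:
    """Fill missing occupations by copying from a nearest donor record."""
    donors = [r for r in records if r["occupation"] is not None]
    if not donors:
        raise ValueError("no donor records with an observed occupation")
    result = []
    for rec in records:
        if rec["occupation"] is not None:
            result.append(dict(rec))
            continue
        best = min(distance(rec, d) for d in donors)
        nearest = [d for d in donors if distance(rec, d) == best]
        # Random choice among equally close donors supplies the
        # element of randomness.
        donor = rng.choice(nearest)
        result.append({**rec, "occupation": donor["occupation"]})
    return result


records = [
    {"industry": "health", "hours_band": "31-48",
     "qualification": "degree", "occupation": "nurse"},
    {"industry": "retail", "hours_band": "16-30",
     "qualification": "none", "occupation": "sales assistant"},
    {"industry": "health", "hours_band": "31-48",
     "qualification": "degree", "occupation": None},
]
imputed = impute_occupation(records, random.Random(0))
print(imputed[2]["occupation"])
```

Here the third record matches the first on all three variables, so the first record is the only nearest donor and its occupation is copied across. The original records are left unchanged and a new list is returned.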

Unit imputation: The addition of a synthetic household, household person, or communal establishment person into the dataset to account for individuals and households not counted in the census.

When we mention “edit and imputation” in methodology for Scotland’s Census 2022, we are referring to item-level edit and imputation. Unit imputation is referred to as Coverage Adjustment.

More information on Edit and Imputation can be found in the External Methodology Assurance Panel document PMP013: Edit and Imputation Methodology.

Statistical Methodology Rehearsal 2020

From April to June 2020, National Records of Scotland (NRS) carried out a rehearsal of several statistical methodologies for Scotland’s Census 2022. The Edit and Imputation methodology was tested during this rehearsal.

The evaluation reports from this rehearsal will be available on the Statistical Methodology Rehearsal 2020 web page, which will be published in September 2020.

Statistical Quality Assurance

NRS has published a Statistical Quality Assurance Strategy which provides more information on how the Edit and Imputation process, and others, will be quality assured.

UK Harmonisation

NRS aims to harmonise statistical methodologies with other UK census offices as much as possible. We share ideas and provide feedback on methodologies through harmonisation working groups with the Office for National Statistics (ONS) and the Northern Ireland Statistics and Research Agency (NISRA).

For more information on how NRS works with the other UK census offices to harmonise statistical methodology for UK data users, and to share best practice and lessons learned, please see the UK Census Data tab of this website.

Stakeholder Events

In February 2020, NRS ran Statistical Methodology Stakeholder events aimed at the general public and data users. These events gave attendees a high-level overview of what happens to census data from when NRS receives census responses through to producing the outputs.

During these events, we sought feedback to help us further develop our plans and ensure the highest quality of outputs for our users. Slides from the event are published on the event page of the Scotland’s Census 2022 website.

Get Involved

NRS is seeking feedback on the content provided on these pages to help us provide the information most relevant to you. To share your views and information needs, please contact scotlandscensus@nrscotland.gov.uk.