Edit and imputation
Edit and imputation irons out inconsistencies and fills in gaps in census data.
We use editing and imputation where:
- people have not answered individual questions
- there are inconsistencies between answers
Editing is the process of finding and flagging missing, invalid and inconsistent answers to census questions.
We document and publish both hard and soft edits for data users.
Hard edits identify impossible or unlikely combinations of answers.
‘A person under the age of 17 cannot drive to school or work’ is an example of a hard edit rule.
Most of these answers will be the result of errors, so we remove them from the dataset.
Soft edits identify possible but rare cases - for example, a retired person under 45 years old.
We do not remove these answers from the dataset, but we do not want to duplicate them during the imputation process.
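The difference between the two kinds of edit can be sketched in a few lines of Python. This is an illustration only, not the census processing system. The field names (`age`, `travel_method`, `economic_activity`), the coded values and the flag labels are all hypothetical. The sketch also assumes that a failed hard edit blanks the travel answer rather than the age, which the text above does not specify.

```python
# Illustrative sketch only. Field names, codes and flag labels are
# hypothetical and are not the census's actual variables.

def apply_edit_rules(record):
    """Return (cleaned_record, flags) for one person record."""
    cleaned = dict(record)  # work on a copy so the input is not changed
    flags = []
    age = cleaned.get("age")

    # Hard edit: a person under the age of 17 cannot drive to school or work.
    # Assumption: the travel answer is the one removed (set to None),
    # leaving a gap to be filled later by imputation.
    if age is not None and age < 17 and cleaned.get("travel_method") == "driving":
        flags.append("hard:under_17_driving")
        cleaned["travel_method"] = None

    # Soft edit: a retired person under 45 is possible but rare.
    # The answer is kept; the flag marks it so it is not copied to others.
    if age is not None and age < 45 and cleaned.get("economic_activity") == "retired":
        flags.append("soft:retired_under_45")

    return cleaned, flags
```

In this sketch a hard edit changes the record and a soft edit only adds a flag, which mirrors the rule that hard-edit failures are removed while soft-edit cases stay in the dataset.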
Imputation comes after editing. It involves replacing the missing, invalid or inconsistent values that editing has flagged.
Item imputation is when we fill in an individual answer that is missing or inconsistent with the answers to related questions.
For example, a person may have skipped the occupation question. In this case, we would look at their answers for:
- number of hours worked
Using this information, we can find a similar response to these questions on another person's census questionnaire. We can then use their answer for the occupation question to fill in the blank.
This method is called nearest-neighbour hot-deck imputation, and it involves some randomness.
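The idea behind nearest-neighbour hot-deck imputation can be shown with a minimal Python sketch. It is not the method as implemented for the census. The function name `impute_occupation`, the field names, the `soft_flag` marker and the simple absolute-difference distance are all assumptions made for illustration. The sketch also assumes the randomness comes from picking at random among equally close donors, which is one common approach.

```python
import random

# Illustrative sketch only. Names, fields and the distance measure are
# hypothetical and are not taken from the census processing system.

def impute_occupation(recipient, donors, match_fields, rng=None):
    """Return an occupation borrowed from the closest eligible donor, or None."""
    rng = rng or random.Random()

    # A donor must have answered the question. Assumption: records flagged
    # by a soft edit are excluded so that rare cases are not duplicated.
    eligible = [
        d for d in donors
        if d.get("occupation") is not None and not d.get("soft_flag", False)
    ]
    if not eligible:
        return None

    # Distance: sum of absolute differences over the numeric matching fields.
    def distance(donor):
        return sum(abs(donor[f] - recipient[f]) for f in match_fields)

    best = min(distance(d) for d in eligible)
    nearest = [d for d in eligible if distance(d) == best]

    # Random choice among the equally close donors.
    return rng.choice(nearest)["occupation"]


donors = [
    {"hours_worked": 35, "occupation": "teacher"},
    {"hours_worked": 10, "occupation": "barista"},
    {"hours_worked": 36, "occupation": None},  # cannot donate: no answer
]
person = {"hours_worked": 36, "occupation": None}
person["occupation"] = impute_occupation(person, donors, ["hours_worked"])
```

Here the only matching field is the number of hours worked. A real donor search would compare several related answers at once.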
When we refer to 'edit and imputation' for Scotland's Census 2022, we mean item imputation.
We refer to unit imputation as 'coverage adjustment'.
Find out more about estimation and adjustment.