
Expert Meeting on Statistical Data Editing

03 - 06 October 2022

 

This meeting was organized as part of the Conference of European Statisticians’ work programme for 2022, within the context of the High-Level Group for the Modernisation of Official Statistics.

In particular, the expert meeting explored the following themes:

  • Identifying new methods that can improve the quality and efficiency of editing and imputation;
  • Investigating statistical quality risks arising from using new methods and data sources, and ways to address them;
  • Developing approaches for standardizing and implementing statistical data editing functionalities; and
  • Facilitating the sharing of experiences, ideas and tools for modernizing statistical data editing and imputation processes.
Online meeting

Documents

  Document Title          Documents
  Provisional Programme   PDF
  Information Notice 1    PDF
  Report                  PDF

 

Opening

Presentation on the UNECE High-Level Group for the Modernisation of Official Statistics

Taeke Gjaltema

 

Slides

Session 1a: Modernisation of data editing and statistical production (Part 1)

Session Organizers: Darren Gray (Statistics Canada) and Pedro Revilla (INE, Spain)

Multiple software systems for the editing and imputation process of the 7th General Census of Agriculture

Simona Rosati

Paper

Slides

 

Towards a new integrated uniform production system for business statistics at Statistics Netherlands: quality indicators to guide top-down analysis

Frank Aelen & Anita Vaasen-Otten

Paper

Slides

 

The SCIA system implementing Fellegi and Holt methodology compared to the recent R packages

Simona Rosati

Paper

Slides

 

Towards a new integrated uniform production system for business statistics at Statistics Netherlands: automatic data editing with multiple data sources

Wilco de Jong & Sander Scholtus

Paper

Slides

Keynote presentation

Robust imputation procedures in the presence of influential units in surveys

David Haziza

 

Slides

Session 2: Lightning talks

Session Organizer: Alexander Kowarik (Statistics Austria)

A modern statistical production process based on administrative registers

Ewelina Wójcik

 

Slides

 

e-invoice time series nowcasting with R

Bruno Lima

 

Slides

Session 3: New and emerging methods

Session Organizers: Simona Rosati (Istat, Italy) and Sander Scholtus (Statistics Netherlands)

Machine learning Imputation for Social Surveys – Random forest imputation of ONS’ Household Financial Survey

Mark Edward

Paper

Slides

 

Application of the “SwissCheese” method for the imputation of partial non-response in the Survey on Income and Living Conditions

Michael Leuenberger

Paper

Slides

 

Stacking machine-learning models for anomaly detection: comparing AnaCredit to other banking datasets

Andrea del Monaco

Paper

Slides

 

Discover the hidden validation rules in your data with ‘validatesuggest’

Olav ten Bosch

Paper

Slides

Session 1b: Modernisation of data editing and statistical production (Part 2)

Session Organizers: Simona Rosati (Istat, Italy) and David Salgado (INE, Spain)

Validation rule management

Mark van der Loo

Paper

Slides

 

Banff’s next step: an open-source data editing system for advanced tools and collaboration

Darren Gray

Paper

Slides

 

Growing a Modern Edit and Imputation System

Darcy Miller & Megan Lipke

Paper

Slides

 

Automatic Data Editing and Imputation Experience in 2020 Mexican Census

Edgar Vielma

Paper

Slides

Session 4: Machine Learning / Artificial Intelligence for editing and imputation

Session Organizers: Alexander Kowarik (Statistics Austria) and Daniel Kilchmann (Federal Statistics Office, Switzerland)

Improving statistical data editing with Machine Learning: some use cases in Statistics Spain

Sandra Barragán

Paper

Slides

 

Application of the MissForest algorithm for imputation in the Survey on Income and Living Conditions

Blandine Bianchi

Paper

Slides

 

Robust regression, MissForest and calibration combined with non-linear optimization to impute VAT turnover

Jacques Saliba

Paper

Slides

 

Univariate and multivariate goodness (of fit) of imputation

Maria Thurow & Florian Dumpert

Paper

Slides

 

The imputation of the “Attained Level of Education” in the base register of individuals through Neural Networks using sampling weights

Fabrizio De Fausti

Paper

Slides

Session 5: Use of administrative data for editing and imputation

Session Organizers: Ágnes Andics (Central Statistical Office, Hungary) and Pedro Revilla (INE, Spain)

Data imputation for the purposes of statistical research with the use of data from administrative registers

Paweł Murawski

Paper

Slides

 

Producing admin-based property floor area statistics for England and Wales: methods, data and quality

Stephan Tietz & Emily Mason-Apps

Paper

Slides

 

The use of administrative records for data imputation in Mexico's Economic Censuses

José Luis Mercado Hernández

Paper

Slides

Session 6: Quality

Session Organizers: Darren Gray (Statistics Canada) and Sander Scholtus (Statistics Netherlands)

Experimental Short-Term Statistics based on Data Imputation Methods

Jan Ditscheid

Paper

Slides

 

Variance estimation for the mass imputation of the “Attained level of education” in the Italian Base Register of individuals: A comparison between analytical and Monte Carlo estimates

Romina Filippini

Paper

Slides

 

Comparison between Clark and Kokic and Bell approaches in winsorization

Romain Lesauvage

Paper

Slides

 

Automatic selective editing approach using machine learning: an application to VAT data

Benjamin Vasquez

Paper

Slides