Scanner data have been used in production to compute the HICP and the CPI for France since January 2020, covering most French retailers. The current methodology relies on a product reference database, bought from an external provider, that gives detailed characteristics for each article. These characteristics allow us to map the articles in our data to the COICOP classification and to form homogeneous groups of articles. With this information, we can compute a unit value price for each group of articles and each month. The subsequent steps are very similar to the process used for field-collected data: we select a sample of observations that is used to compute price changes, and we aggregate them using a geometric Laspeyres formula at the lowest level. As with field data, replacements are made for products that become unavailable. This choice is quite specific to France, whereas multilateral methods are more widespread in other countries. Recently, however, new retailers (hard discounters) have started to set up a data feed to provide Insee with their scanner data. The distinctive feature of their data is that most of their articles are not covered by the product reference database, which makes the current methodology hard to apply at first sight. The goal of this study is to be able to use these new scanner data in the coming years. To do so, we address two questions. First, drawing on the experience of other national statistical institutes, we test the generalization of multilateral methods on the scanner data we already receive and use, in order to document on a large scale how such methods behave in the French context. This experiment is an opportunity to gain practical and theoretical skills on familiar data. Second, we present a strategy for using the new scanner data with these methods. Starting from the retailers' raw data, the process to be developed runs from classifying the products to integrating the computed indices into the main process that produces the French CPI.
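The unit value and geometric Laspeyres steps mentioned above can be illustrated with a minimal Python sketch. It is not Insee's production code: the function names, the tuple layout of the transactions, and the use of base-period expenditure shares as weights are assumptions made for illustration, and the sampling and replacement steps are left out.

```python
from collections import defaultdict
from math import exp, log


def unit_values(transactions):
    """Unit value per (group, month): total turnover / total quantity.

    `transactions` is an iterable of (group, month, turnover, quantity)
    tuples; this layout is a hypothetical one chosen for the example.
    """
    turnover = defaultdict(float)
    quantity = defaultdict(float)
    for group, month, sales, qty in transactions:
        turnover[(group, month)] += sales
        quantity[(group, month)] += qty
    # Skip cells with no quantity sold to avoid dividing by zero.
    return {key: turnover[key] / quantity[key]
            for key in turnover if quantity[key] > 0}


def geometric_laspeyres(base_prices, current_prices, base_weights):
    """Weighted geometric mean of price relatives with base-period weights.

    Only groups priced in both periods are used; weights are rescaled
    so that they sum to one over those groups (an assumption here).
    """
    common = [g for g in base_prices
              if g in current_prices and g in base_weights]
    total_weight = sum(base_weights[g] for g in common)
    log_index = sum(
        base_weights[g] / total_weight
        * log(current_prices[g] / base_prices[g])
        for g in common
    )
    return exp(log_index)


# Toy data, invented for the example.
transactions = [
    ("milk", "2020-01", 100.0, 50.0),
    ("milk", "2020-01", 50.0, 25.0),
    ("milk", "2020-02", 220.0, 100.0),
    ("bread", "2020-01", 30.0, 30.0),
    ("bread", "2020-02", 30.0, 30.0),
]
uv = unit_values(transactions)
base = {g: p for (g, m), p in uv.items() if m == "2020-01"}
current = {g: p for (g, m), p in uv.items() if m == "2020-02"}
index = geometric_laspeyres(base, current, {"milk": 0.5, "bread": 0.5})
```

With these toy figures, the milk unit value rises from 2.0 to 2.2 while bread is unchanged, so the index is the weighted geometric mean of the relatives 1.1 and 1.0.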