Measuring geographical and population coverage in CPI internet price collection: An application with groceries web scraping in Italy

Consumer price indices (CPIs) are instrumental in the development of monetary policy and in monitoring economic developments. Price collection for CPI compilation has come a long way in the past 20 years. Ideally, the index should cover expenditure by all households, urban and rural, throughout the country; in practice, CPIs in various countries have limited geographic coverage for both price collection and consumption expenditure. The introduction of new data sources, such as web scraping and scanner data, has helped reduce price collection costs and extend the reach across national territories, thereby enhancing the accuracy and quality of the CPI. The aim of this paper is to suggest a finer measurement of CPI geographical coverage based on geostatistical fuzzy indices. Such a measure would be particularly useful where prices vary substantially across space, as it has been shown that consumers travel only limited distances for their purchases and a sparse network of outlets may lead to biased measurements. To explore the potential of the suggested measure, we estimate relative price levels across regions for a given time period, and price changes over the period for each region, using the region-time-dummy method. This analysis is further validated by referring to structural breaks in our coverage metric and in spatio-temporal CPIs. Using a dataset derived from geo-localized web scraping of grocery prices in Italy, we provide a practical application that calculates coverage at the regional level under different functional forms. Our findings corroborate the robustness of the proposed coverage metric and allow information on geographic coverage to be embedded in price statistics.
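As an illustration of the region-time-dummy idea mentioned above, the following is a minimal sketch and not the paper's actual specification. It assumes an unweighted OLS regression of log prices on product dummies and region-period dummies, with the first region-period cell as the reference. The function name `region_time_dummy`, the toy products, regions, and prices are all hypothetical and chosen only so the example runs.

```python
import numpy as np

def region_time_dummy(log_prices, products, cells):
    """Unweighted OLS of log price on product and region-period dummies.

    cells holds one (region, period) tuple per observation. Returns a dict
    mapping each cell to a price level relative to the reference cell
    (the first cell in sorted order, fixed at 1.0).
    """
    prod_ids = sorted(set(products))
    cell_ids = sorted(set(cells))
    ref, free_cells = cell_ids[0], cell_ids[1:]

    # Design matrix: one column per product, one per non-reference cell.
    X = np.zeros((len(log_prices), len(prod_ids) + len(free_cells)))
    for i, (p, c) in enumerate(zip(products, cells)):
        X[i, prod_ids.index(p)] = 1.0
        if c != ref:
            X[i, len(prod_ids) + free_cells.index(c)] = 1.0

    y = np.asarray(log_prices, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Exponentiate the cell coefficients to get relative price levels.
    levels = {ref: 1.0}
    for j, c in enumerate(free_cells):
        levels[c] = float(np.exp(beta[len(prod_ids) + j]))
    return levels

# Toy data (hypothetical): two products, two regions, two periods.
base = {"pasta": 1.0, "milk": 2.0}
effect = {("North", 1): 1.00, ("North", 2): 1.10,
          ("South", 1): 0.90, ("South", 2): 0.99}
products, cells, log_prices = [], [], []
for prod, b in base.items():
    for cell, e in effect.items():
        products.append(prod)
        cells.append(cell)
        log_prices.append(np.log(b * e))

levels = region_time_dummy(log_prices, products, cells)
for cell in sorted(levels):
    print(cell, round(levels[cell], 3))
```

Reading the result across regions within one period gives relative price levels, while reading it across periods within one region gives that region's price change. A production estimate would typically add expenditure weights and handle regions with few observed outlets, which this sketch omits.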