Web Scraping of Prices of Commodities Included in the Generation of Consumer Price Index (CPI) for the National Capital Region (NCR), Philippines

Online stores are becoming popular as a new platform for business transactions, not only in the country but also globally. To take advantage of this development, the Philippine Statistics Authority (PSA) began exploring in 2019 the use of web scraping as an alternative method of collecting prices of commodities included in the computation of the Consumer Price Index (CPI) for the National Capital Region (NCR), Philippines. Currently, the PSA collects commodity prices through traditional face-to-face visits to sample outlets or stores. In this paper, prices collected through the traditional face-to-face method are called offline prices, while web scraped prices are termed online prices. Prices are web scraped for 514 commodities, which comprise about 71 percent of the total commodities in the CPI market basket of NCR. This study aims to determine whether offline prices can be replaced by online prices, or by a combination of online and offline prices (hybrid), in computing the CPI for NCR. Results show that the behavior of online and offline prices is comparable for selected commodities that are not highly volatile, such as clothing items. However, online prices of agricultural commodities, which are highly volatile, do not show the same pattern of volatility as offline prices. Moreover, for CPI computation, offline prices are more appropriate for certain commodity groups, while hybrid prices are more appropriate for others.