This week, we announced the launch of RetailWatch, Europe’s first satellite-based car counting capability.
We’re excited about this data-set and hope you are too – feel free to request a sample at retailwatch@geospatial-insight.com. The data-set will look familiar in some ways, unique in others. This blog explores seven key aspects which have driven our design of it, and I hope will inform your use of it.
We are a European company first and foremost, and financial services organisations in our region have been asking for European data. Satellite data has been used to assess consumer trends in the United States for about a decade, and there are good reasons for that. US markets, boosted perhaps by economies of scale, have enjoyed alternative data for longer than Europe, with satellite at the forefront courtesy of UBS’s game-changing work on Wal-Mart a decade ago. To our knowledge, there has never been a European data-set, and we have taken many requests to provide one, partly because other popular data-sets, such as transaction data and geolocation data, come with privacy compliance concerns (rightly, in my view). In addition, building a European satellite data-set is less straightforward, partly because the financial signal within retail is more complex, and partly because compiling the data is harder given satellite trajectories and weather occlusion.
Not quite. Europe is not the US. The relationship between US strip mall parking lot activity and retail performance is more obvious given the relative predominance of US out-of-town shopping. European retail is quite different, with greater city centre (downtown) use, the prominence of privately-owned retailers such as the German discounters and southern European cooperatives, and of course the increasing uptake of internet shopping. However, as some have told us, the complexities offer opportunities for nuanced interpretation, which has shaped both the data we are delivering in this first release and how we plan to augment it in the near term. We also appreciate that in some cases the data will be used in conjunction with other consumer/retail data-sets, for validation for example.
The initial release of the data-set is raw data, featuring Europe-specific car count information for 14 publicly listed store groups (i.e. those whose shares are traded) alongside key supporting data such as time-stamp, financial identifiers, location and area. Data is both live and historical, dating back to 2015, and is accessible as CSV files and over an API. Why 2015? Prior to 2015, European satellite coverage was less straightforward, which is why it has been hard for firms to build data-sets, and the exercise continues to be non-trivial. Backtesting, for us at least, has been more satisfactory since 2015. We’ve kept to raw time-stamped data, primarily to reduce bias. What you see in the data is what you get, not what someone has massaged to try to make it more presentable. More stores will be added in the future.
Yes, it is cloudy in Europe, particularly in Northern Europe and during the colder months. When testers have examined our early data-sets, adjusting to account for seasonality has been a key part of their preparatory data normalisation and analysis.
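As a rough illustration of the kind of seasonal adjustment testers describe, here is a minimal sketch that divides each count by a simple monthly index (month mean over overall mean). The observation values and the function names `seasonal_indices` and `deseasonalise` are hypothetical and are not part of the RetailWatch data or API. Real analyses would likely use more history and a more robust method.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def seasonal_indices(observations):
    """Map each calendar month to (mean count in that month) / (overall mean)."""
    by_month = defaultdict(list)
    for day, count in observations:
        by_month[day.month].append(count)
    overall = mean(count for _, count in observations)
    return {month: mean(values) / overall for month, values in by_month.items()}


def deseasonalise(observations):
    """Divide each count by the index of its month to remove the seasonal level."""
    indices = seasonal_indices(observations)
    return [(day, count / indices[day.month]) for day, count in observations]


# Hypothetical (date, car count) observations for a single store.
observations = [
    (date(2016, 1, 10), 80),
    (date(2016, 7, 10), 120),
    (date(2017, 1, 12), 90),
    (date(2017, 7, 9), 110),
]

adjusted = deseasonalise(observations)
for day, value in adjusted:
    print(day.isoformat(), round(value, 1))
```

Because cloud cover removes more observations in some months than others, the overall mean here is weighted towards the clearer months. That is worth bearing in mind before relying on an index like this.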
A key data attribute in the set is car park area, and we recommend you normalise the counts against area. To be sure, identifying car park areas can be challenging in shared car parks, of which we have many in Europe, so if there is methodological bias, this is where you will find it. However, each car park area is of course consistent with the car counts inside it: the cars are counted within the same boundary used to measure the area.
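Normalising counts against area can be sketched in a few lines. The column names (`car_count`, `car_park_area_m2` and so on), the sample rows and the choice of "cars per 1,000 square metres" as the unit are all assumptions made for illustration; the actual CSV layout may differ.

```python
import csv
import io

# Hypothetical sample in an assumed layout; not real RetailWatch data.
SAMPLE_CSV = """timestamp,ticker,store_id,car_count,car_park_area_m2
2019-03-02T10:41:00Z,AAA,store-1,120,6000
2019-03-02T10:41:00Z,AAA,store-2,45,1500
2019-03-02T10:43:00Z,BBB,store-3,60,0
"""


def cars_per_1000_m2(rows):
    """Return (store_id, cars per 1,000 m2), skipping rows with no usable area."""
    result = []
    for row in rows:
        area = float(row["car_park_area_m2"])
        if area <= 0:
            continue  # cannot normalise without a positive area
        density = 1000.0 * int(row["car_count"]) / area
        result.append((row["store_id"], density))
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
densities = cars_per_1000_m2(rows)
for store_id, density in densities:
    print(store_id, density)
```

A density figure like this makes a small, busy car park comparable with a large, half-empty one, which a raw count does not.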
Our data and machine learning specialists in our Development Team can talk about this for hours. Here is my simple abridged version. First, we acquire the images, or rather we identify key images on a cloud processing system. On some occasions, we assess multiple images (or ‘chips’) to build an aggregate view of a scene. We use a deep learning technique to identify the cars in that scene, extract a geolocalised vector data-set, batch it up with all the other image-derived vector data-sets, and then count. This process is fully automated, sped up nicely courtesy of GPUs, with an inbuilt QA procedure to monitor and ensure accuracy. Many financial service applications claim ‘AI’ or ‘machine learning’ technology stacks which, when you dig further, often entail little more than running a regression within an automated process. In our case, machine and deep learning algorithms lie at the heart of our calculation.
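To make the final counting step more concrete, here is a minimal sketch under simplifying assumptions: each detected car is a single point, the car park is a simple polygon, and both are in the same planar coordinate system. It uses a standard ray-casting point-in-polygon test. It does not represent our production pipeline or the deep learning detection itself, and the names `point_in_polygon` and `count_cars` and the coordinates are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex list."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_cars(detections, car_park):
    """Count detection points that fall inside the car park polygon."""
    return sum(point_in_polygon(x, y, car_park) for x, y in detections)


# Hypothetical car park outline and detected car positions.
car_park = [(0, 0), (10, 0), (10, 5), (0, 5)]
detections = [(1, 1), (9.5, 4), (11, 2), (5, 6), (3, 2.5)]

total = count_cars(detections, car_park)
print(total)  # 3 of the 5 detections fall inside the outline
```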
Yes and No. Primary feedback has come from Financial Services clients, both those with immediate alternative data interests and others with longer-term methodology interests. We see this first product incarnation sitting within classic ‘alternative data’ universes. However, the data is useful for anyone with interests in retail, from councils to marketing agencies to retailers themselves. Part of Geospatial Insight’s business is focused on urban planning, so the data can be used in this domain, although its look and feel may need some help and interpretation from that team.
This is a new data-set. Feedback on the data itself, however critical, will of course be welcome. This is the start of a journey, not an end, and we will continue to adapt and evolve this data-set, and others which we’re bringing to market, based on your feedback.
To see a data sample, email retailwatch@geospatial-insight.com.