Footfall signatures research wins best paper prize

Nikos Ntounis shows off our best paper prize at this year's AM conference

Our new £1m Innovate high street and retail project may have just started, but the research underpinning our successful bid for the £1m ‘bringing big data to small users’ project has been awarded a ‘best in track’ prize for retail at this year’s Academy of Marketing Conference, held at Newcastle Business School.

The research identified new footfall signatures and town types that the team found in their preliminary analysis of footfall data provided by Springboard, who are leading the new project. The findings were presented in a competitive paper, “Radical Marketing and the UK High Street: Towards a New Typology of Towns”, authored by Cathy Parker, Nikos Ntounis, Simon Quin and Ed Dargan.

Radical changes in the retail environment, such as the proliferation of online shopping and the advent of omni-channel retailing, are putting immense pressure on the UK High Street and town centres. The aim of this study was to examine, after many years of mono-functionality focused upon retailing, and with the shift of some of this activity to the Internet, how UK town centres and high streets are actually adjusting to this change. The research examined footfall data from 50 UK towns over a 30-month period. The findings suggest that a new typology of town centres based on footfall signatures instead of their position in the traditional retail hierarchy was feasible. The authors provided rationale for this ‘new’ typology of town centres by extending Bucklin’s product/retail classification to marketing channels. Finally, the team proposed that new multi-functional town centres could really benefit from using activity levels like footfall as key performance indicators, rather than relying on more static measures such as the amount of multiple retailer floorspace.
The research team has been invited to submit a full version of the paper to this year’s special Academy of Marketing issue of the Journal of Marketing Management, which will contain all the outstanding research from this year’s conference, and will be published in 2017.

The 39 steps – to understanding High Street performance

This month our new Innovate project started. The project will bring big data to town and city centre decision makers, enabling them to optimise footfall whilst also improving the experience of centre users. The first stage of the project (running from now until Spring 2017) is very research focused. Because we have over 9 years of hourly footfall data, courtesy of the project lead Springboard, the research team at the Institute of Place Management (Manchester Metropolitan University) and the University of Cardiff can really start to work out how and why town and city centres perform as they do. Our findings will then be incorporated into a place management information system and a series of dashboard products, built by our technology partners MyKnowledgeMap.

These new products will support decision making in towns and cities, by making important data more readily available and more easily accessible to the wide range of stakeholders who need to collaborate to build strong centres.

One of the challenges with big data is what to do with it. It may seem an obvious starting point, but first the research team have had to identify a definitive list of research questions that we want the data to answer. The Principal Investigators for both the IPM/MMU team and the Cardiff team (Cathy Parker and Christine Mumford) met in August to compile such a list of research questions (39!) that we will be answering over the next few months, and we are sharing these here. As always, any comments, observations or feedback are most welcome.

RQ1: Are the distinct town types (comparison, specialty, convenience/community) recognisable in a bigger data set?

Preliminary research strongly indicates the existence of distinct footfall signatures. But these were originally identified in our pilot data set of 50 towns, using footfall data that ended in 2014. Now that we have more towns and data spanning 2006-2016, can we find additional evidence of the town types we originally identified? If so, we will conclude the typology is robust – in other words it is generalisable to a bigger sample.

RQ2: Are there other signature types present in the data?

With more data we may find more signatures. Are there other signature types we should be including (such as holiday towns, that have an August peak in footfall)?  Our recent invitation to present our High Street UK 2020 research findings to the Withington Civic Society prompted a lively discussion about the profile of centres with a large student population.  Is there a recognisable ‘university town’ profile? We will find out!

RQ3: How are signature types defined?

What makes a signature distinctive? For example, when is a town a comparison shopping town? In our previous research, we identified comparison shopping towns as those that display significant ‘January drops’ (a reduction in footfall after Christmas). But how much does footfall have to differ between December and January to warrant the ‘comparison town’ label? We need a more scientific method to define signature types.
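
One way to make the ‘January drop’ measurable is to express it as the percentage fall from December to the following January. The sketch below is only an illustration: the function names, the 20% threshold and the footfall figures are assumptions made for the example, not values from our research.

```python
def january_drop(december_footfall, january_footfall):
    """Return the fall from December to January as a fraction of December footfall."""
    if december_footfall <= 0:
        raise ValueError("December footfall must be positive")
    return (december_footfall - january_footfall) / december_footfall


def looks_like_comparison_town(december_footfall, january_footfall, threshold=0.20):
    """Flag a town whose January drop meets or exceeds the (assumed) threshold."""
    return january_drop(december_footfall, january_footfall) >= threshold


# Invented figures for illustration only.
drop = january_drop(500_000, 350_000)
print(f"January drop: {drop:.0%}")
print(looks_like_comparison_town(500_000, 350_000))
```

Choosing the threshold is exactly the open question here; in practice it would have to be derived from the data rather than fixed in advance.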

RQ4: How many UK retail centres have (or have had) a recognisable monthly signature type?

Once we have established a reliable method to identify town types we can then find out how many UK retail centres have, or have had, a recognisable signature. In other words, what type of towns have we got in our sample? Can we find evidence of towns changing type – or are town types comparatively stable over time?

RQ5: How do the monthly signature types differ by week, day of the week and hour?

Our classification of town types will include relationships between hourly, daily and weekly variations – thanks to all the footfall data Springboard have provided to the research team. For example, do all towns show a similar pattern of footfall throughout the day, or are certain hours busier in certain types of towns? If so, this will help towns manage their activity hours – and encourage stakeholders to make sure their offer is open when the catchment needs it – one of the most important drivers of town centre vitality and viability identified in our last project.

RQ6: How well does our original HSUK2020 model predict footfall?

In our HSUK2020 project we developed a model that could predict average monthly footfall in a UK location, within +/- 10% or so.  However, we only tested the model in 15 locations, for which we had historical footfall data. We need to validate our model against the bigger data set. Being able to predict footfall will be really useful to the retail and retail property industry – helping them to assess sites, as part of their location decision making, where this information doesn’t currently exist.
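
Validating a forecast of this kind comes down to comparing predicted and observed footfall, location by location. A minimal sketch, assuming a 10% tolerance and using invented figures and hypothetical function names (this is not the HSUK2020 model itself):

```python
def percentage_error(predicted, actual):
    """Signed error of the prediction as a fraction of actual footfall."""
    if actual <= 0:
        raise ValueError("actual footfall must be positive")
    return (predicted - actual) / actual


def share_within_tolerance(pairs, tolerance=0.10):
    """Fraction of (predicted, actual) pairs whose absolute error is within tolerance."""
    if not pairs:
        raise ValueError("no locations to validate")
    hits = sum(1 for predicted, actual in pairs
               if abs(percentage_error(predicted, actual)) <= tolerance)
    return hits / len(pairs)


# Invented (predicted, actual) average monthly footfall for four locations.
locations = [(95_000, 100_000), (210_000, 200_000), (60_000, 80_000), (32_000, 30_000)]
print(share_within_tolerance(locations))
```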

RQ7: Can we refine the original HSUK2020 model to improve forecasting ability?

Following on from RQ6 – depending on how well our original model performs, can we improve on it?  Are there other variables we should be including in the model that will make it more accurate?

RQ8: Can we (should we) develop a more accurate catchment predictor?

There are plenty of providers of catchment and shopper catchment information – but often these do not take into account factors such as the touristic appeal of the town.  We will compare the forecasting ability of our revised HSUK2020 model to existing catchment and shopper catchment data to see which perform better (in other words which methods forecast actual footfall more accurately) – and under what circumstances.

RQ9: What is the relationship between the amount of footfall and town types?

Is there a relationship between the amount of footfall and town type? In other words, do all comparison shopping towns have the largest amount of footfall? Conversely, do all convenience/community towns have the smallest footfall?

RQ10: If we build a hierarchy of towns by size of footfall how does this compare to existing hierarchies?

To what extent do our town types and footfall data correspond to existing typologies or hierarchies? To what extent do the signature types we find in the data correspond to existing perceptions or current decision-making, plans and strategies?

RQ11: What is the influence of location? 

Is there any recognisable pattern to the location of town types? Does the ‘north/south’ divide we see in other retail statistics (e.g. vacancy rates) exist in relation to footfall and town types?

RQ12: Which of the 25 priorities for improving the vitality and viability of high streets can be operationalised through the use of existing secondary data?

In our High Street UK 2020 project we identified the 25 priority factors town centre partnerships should concentrate on if they want to improve footfall (e.g. opening hours, diversity of offer, walkability). To include a factor in our analysis, modelling or forecasting in this project we have to be able to turn it into a number. How do we find data on each of the 25 factors, at the level of the town? Does this data exist? Is it reliable? Free to access?

RQ13: Which of the 201 factors can be operationalised through the use of secondary data?

The High Street UK 2020 project didn’t just identify 25 priority factors that influence the performance of the High Street – the aim of the project was to find all the factors that influence high street performance. Overall, the project identified 201 different factors. Just as with the 25 priorities mentioned above, to include a factor in our analysis, modelling or forecasting we have to be able to turn it into a number. How do we find data on each of the 201 factors, at the level of the town? Does this data exist? Is it reliable? Free to access?

RQ14: Which of the 25 priorities that CAN’T be operationalised through secondary data do we want to collect primary data for?

For many of the 25 priorities, we will not be able to get hold of relevant data that is publicly accessible and/or reliable. For example, ‘networks and partnerships with council’ was identified as an important priority for place partnerships, but it is unlikely that data already exists that tells us the quality of the networks and partnerships with councils across all the towns and cities in our sample. We can use our literature review to help us make sense of these priorities. For example, for ‘networks and partnerships with council’ we will need to establish the “presence of strong networks and effective formal or informal partnerships”, whether “stakeholders communicate and trust each other”, and if “the council can facilitate action (not just lead it?)”. To get this information we will need to undertake some survey research.

RQ15: Which of the 201 factors that CAN’T be operationalised through secondary data do we want to collect primary data for (e.g. those in the ‘top 20’)?

Just like with the 25 priorities that improve high street vitality and viability, we may not be able to turn each of the 201 factors that influence high street performance into a number we can include in our testing, very easily.  It is unlikely we have the resources to find ways to turn all 201 factors into measurable variables for our analysis. Therefore we will have to agree a way of prioritising these factors – perhaps by just concentrating on the factors which had the most influence on town centre performance?

RQ16: What are the best ways to visualise significant relationships, both academically and practically, for towns and other partners to engage with and understand?

We know from presenting the findings of HSUK2020 to diverse groups of High Street stakeholders, in locations across the UK, Europe and beyond, that using visuals and infographics to communicate our research findings is very important. The research team will need to explore creative ways to bring our results to the widest audience possible, during the lifetime of the project, to facilitate knowledge exchange and also keep our funders happy!

RQ17: What other (non-footfall) measures of town centre performance can be identified?

So far our research has relied upon footfall, and footfall will remain our main town centre performance indicator in this study. But footfall is not the same as retail spend (even though the two are related) and so we will be exploring the relationship between footfall and other factors and retail sales. But there are other measures of performance that we may want to use. For example, sentiment analysis of social media entries may show the experience visitors have of a specific town. Any other measures of performance we may incorporate into the project have to be reliable and freely available if we are to consider them.

RQ18: Can we build a model of town centre performance?

If we have footfall and other measures of performance (such as retail sales and customer experience) and we have tested all the factors that influence performance, will we then be in a position to build an exhaustive model of town centre performance?

Such a model could revolutionise town centre decision making – taking a lot of the uncertainty away from decisions such as what type of development will improve performance – or whether or not increasing car-parking charges will deter customers.

RQ19: What is the baseline performance of pilot towns (footfall, retail sales and customer experience)?

One of the aims of the project is to trial a number of collaborative activities within towns – and measure the impact of these activities upon performance – so baseline measures will need to be established in each of the 7 pilot towns in the project (Ayr, Ballymena, Bristol, Congleton, Holmfirth, Morley and Wrexham).

RQ20: Is there a relationship between baseline performance and model of town centre performance?

How well does our model of town centre performance estimate the baseline performance of the 7 pilot towns (Ayr, Ballymena, Bristol, Congleton, Holmfirth, Morley and Wrexham) in the project?  This will be a good way of testing the model of town centre performance we identified as a result of RQ18.

RQ21: How do we classify and measure collaboration activities?

Our 25 priorities for improving vitality and viability of high streets are a good starting point. We know from the primary research undertaken in the 10 HSUK2020 towns there was quite a discrepancy between what the experts thought could be achieved, and what the town stakeholders thought they could achieve – and most of this discrepancy was due to the difficulty with which different stakeholders can collaborate. The example we heard, time and time again, related to how hard it was to coordinate opening hours across both multiple and independent retailers. Should we classify collaboration activities based upon how much impact they are likely to have (high, medium, low) or by the time-scale needed to achieve them (short, medium, long)? Or by a combination of these? It is likely partnerships will want to focus on the collaboration activities that will bring the most reward for the level of effort required (see RQ23 below).

RQ22: What are the relationships between collaboration activities and performance (individual trader and collective town) in pilot towns?

Each town will trial a collaboration activity – and because we will have baseline data and be able to control for other factors, such as the weather, we will have hard evidence of the impact of the collaboration, both in terms of benefit to the individual trader (retail sales) and the benefit to the town (footfall).

RQ23: What collaboration activities are associated with higher levels of performance?

As we are going to trial a number of collaboration activities (7) then at the end of this stage of testing we will be able to tell which ones had the most impact, and why.

RQ24: Can we identify distinct movement signatures in tracking data?

We know there are different footfall signatures associated with different town types – but are there different movement signatures in any tracking data we can analyse in the project (the exact nature of this data has not been finalised as yet)? In the same way we identified the town types from the data (inductive) – does any data we analyse that relates to an individual’s movement in a location ‘throw up’ any particular patterns? See caveat about tracking data below.

RQ25: What performance indicators can we deduce from the tracking data (e.g. dwell time)?

Movement or flow information deduced from tracking data may deliver more performance indicators, like dwell time, if we are able to track specific users in a location.  There are lots of privacy and other issues associated with this type of data though – things we don’t have to worry about with footfall data – so we will consider the pros and cons of using this data carefully before we make any firm commitment.

RQ26: Is there a relationship between movement signatures and footfall signatures?

We know there are different footfall signatures associated with different town types, but does the town type also affect how people move through the town? Do people ‘explore’ speciality towns but stick to the beaten track in comparison centres, for example?

RQ27: Do we need to establish a composite signature based on footfall and movement?

Can we combine the movement and footfall data to better identify town types? Do we get a richer typology (e.g. more town types) or more robust typology (explaining more towns) if we do this?

RQ28: How do we best visualise performance (footfall, sales, customer experience, dwell and any other performance indicators)?

What is the best ‘set’ of performance indicators for a town to use?  What does good performance look like?  Can we develop measures of performance that are simple (as lots of people need to understand them) but meaningful (they tell us something useful)?

RQ29: Can we identify towns that show unusual or inconsistent performance behaviour (such as sudden drops or rises in footfall)?

First, we need to know if there are any problems with the data supplied. Then, where towns experience sudden drops in footfall and the data is correct, what other factors could explain this?
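
A first pass at both checks could look like the sketch below: it reports missing counts as possible data problems and flags month-on-month changes larger than a chosen cut-off. The 30% cut-off, the function name and the figures are assumptions for illustration.

```python
def flag_unusual_months(monthly_footfall, cutoff=0.30):
    """Return (missing, jumps) for a list of monthly counts (None = no reading).

    missing: indexes with no reading, which may indicate a counter fault.
    jumps:   (index, change) where footfall moved by more than `cutoff`
             relative to the previous month with a valid reading.
    """
    missing, jumps = [], []
    previous = None
    for index, count in enumerate(monthly_footfall):
        if count is None:
            missing.append(index)
            continue
        if previous is not None and previous > 0:
            change = (count - previous) / previous
            if abs(change) > cutoff:
                jumps.append((index, round(change, 2)))
        previous = count
    return missing, jumps


# Invented series: one missing month and one sharp fall.
series = [100_000, 98_000, None, 101_000, 60_000, 62_000]
missing, jumps = flag_unusual_months(series)
print("Missing readings at:", missing)
print("Sudden changes at:", jumps)
```

A flagged month is only a prompt for investigation: seasonal signatures such as the ‘January drop’ would trip a fixed cut-off every year, so a real check would compare each month with the same month in earlier years.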

RQ30: Can we identify towns that show unusual or inconsistent performance trends (such as much weaker or stronger performance over time)?

Which are the towns that show strong or weak performance? What towns show inconsistent performance (e.g. strong decline followed by strong improvement)?

RQ31: Can we explain unusual or inconsistent performance trends and/or, where appropriate, develop hypotheses to explore these further?

Why might towns exhibit unusual or inconsistent footfall or trends in footfall? What factors might explain unusual or inconsistent trend lines in performance?  Can we test out these assumptions on other towns in the data set?

RQ32: Can we decompose the data set into similar groups (based on footfall, retail sales, retail sales per footfall, customer experience, and customer experience per footfall) and establish relationships between other performance indicators and footfall?

What might a more nuanced retail hierarchy look like?  One that takes into account footfall, sales and the customer experience? What are the relationships between the performance indicators?  Is the relationship between footfall and sales direct and linear? Is the relationship between footfall and customer experience similar?  Or is there a point at which increases in footfall lead to decreases in customer experience?
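
Two simple starting points are a conversion-style ratio (sales per visitor counted) and a correlation coefficient to check how closely footfall and sales move together. The sketch below uses invented monthly figures and hypothetical function names. A Pearson coefficient only captures linear association, so it would not by itself reveal a point at which rising footfall starts to reduce customer experience.

```python
import math


def sales_per_footfall(sales, footfall):
    """Retail sales divided by footfall for each period."""
    return [s / f for s, f in zip(sales, footfall)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Invented monthly footfall and retail sales (in pounds) for one town.
footfall = [80_000, 100_000, 120_000, 140_000]
sales = [400_000, 500_000, 600_000, 700_000]
print(sales_per_footfall(sales, footfall))
print(round(pearson(footfall, sales), 3))
```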

RQ33: What parameters will we use to establish optimisation of performance (e.g. town type, town size)?

If we are going to compare towns and encourage them to optimise their performance, then how should we compare them (e.g. town size, town type) and what performance indicators should they be optimising? Given the nature of the project arguably customer experience?

RQ34: Can we identify top 50 towns that optimise performance?

Once we have agreed how towns should be optimising their performance, which ones are doing best?

RQ35: What additional resources will be needed to commercialise the data analysis we are prototyping?

This project will pilot methods and develop prototype tools for towns to use and test. To commercialise the findings, much of the data collection, analysis and software will need to be scaled up. The project team should identify, as the project develops, where and how various processes can be improved, in preparation for developing and launching a fully commercial set of products.

Additional Research Questions

Additional research questions can only be added if they can demonstrate a potential influence on the successful commercialisation of the end project. Examples might include factors known to influence footfall that were not among the original 201 HSUK2020 factors, or ways to extend the project findings to smaller towns that do not currently measure footfall.

RQ36: How does the weather influence footfall?

Footfall is dramatically influenced by the weather – but it was left out of the HSUK2020 study, because there is very little literature on this AND the towns did not identify the factor either.  Probably because it is so obvious it was missed by everyone!

RQ37: Is post-Brexit footfall lower, allowing for the weather conditions?

It is very important we can ‘control’ for the weather (by being able to explain the amount of variance in footfall attributed to various weather conditions).  For example, the fall in footfall post-Brexit has been attributed to the poor summer weather. We think being able to give an answer to this topical research question (i.e. has there been a ‘Brexit-effect’ on footfall?), early on in the project, will raise interest and awareness of the project with the media and a wider group of stakeholders, which will help us with dissemination.
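
To illustrate what ‘controlling’ for the weather means, the sketch below fits a straight line of weekly footfall against rainfall on a baseline period, then compares a later week with what the line would predict given that week’s rain. Everything here is an assumption made for the example: a single weather variable, a linear relationship, hypothetical function names and invented numbers. A real analysis would need more weather variables, seasonal effects and proper significance testing.

```python
def fit_line(x_values, y_values):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x_values)
    if n < 2 or n != len(y_values):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def weather_adjusted_gap(intercept, slope, rainfall, footfall):
    """Observed footfall minus the footfall expected for that rainfall."""
    return footfall - (intercept + slope * rainfall)


# Invented baseline weeks: rainfall in mm and footfall counts.
rain = [0, 10, 20, 30]
footfall = [100_000, 95_000, 90_000, 85_000]
intercept, slope = fit_line(rain, footfall)

# A later wet week: is footfall lower than the rain alone would suggest?
gap = weather_adjusted_gap(intercept, slope, rainfall=20, footfall=86_000)
print(f"Footfall falls by about {-slope:.0f} per mm of rain")
print(f"Gap after allowing for rain: {gap:.0f}")
```

A negative gap means footfall was lower than the weather alone would lead us to expect, which is the kind of residual a ‘Brexit-effect’ question would need to examine.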

RQ38: Can town types be identified through ‘partial’ footfall data?

Smaller locations are more likely to invest in footfall data, and in the products being developed in this project, if the findings can be shown to be relevant to them and to significantly improve decision making and centre performance.

If we can encourage more small locations to engage with the project (through the user group structure), by establishing their identity through partial footfall data (gathered by hand), it will improve our chances of making products that are relevant to their needs at the commercialisation stage.

RQ39: Can a measuring methodology be developed so that towns know when to count (hour/date etc.) to identify their town type?

Manchester City Council have joined the policy user group of the project and want to know if and how the 13 district centres around Manchester are classified, so they can develop relevant policy in those areas. Can we develop an easy-to-use set of instructions so that volunteers can collect footfall data in the 13 locations and get an idea of their likely town type?

Note: Outside of the 7 pilot locations (Ayr, Ballymena, Bristol, Congleton, Holmfirth, Morley and Wrexham) no towns will be identified by name when we disseminate the research findings from the project.

Can bad data be good data? Reflections upon the Consumer Data Research Council Partner Forum

Guest blog by Ed Dargan, PhD Student at the Institute of Place Management, Manchester Metropolitan University

The Consumer Data Research Council (CDRC) (established by the ESRC) held the CDRC Data Partner Forum on the 6th May at the Saïd Business School, University of Oxford. The key aim of the CDRC is to help organisations maximise the potential of innovation by opening up their data to trusted researchers so that they can provide solutions that drive economic growth and improve our society. During the day, the presentations were based around three themes of missing data, data sources and research design.

For the retail demand modellers, three things were seen as important: including seasonal demand (especially for seaside locations), accounting for natural barriers, and basing travel times upon real journey times. It was useful to see how different data values were being clustered to form classifications, as this is something that needs to be done with the footfall data available to IPM in the big data project we are just about to start with Springboard.

Data representation was an important theme. Missing data, both spatial and temporal, was identified as a challenge and a number of techniques were identified to ‘fill-in’ missing data. A recurring theme was the problem of using time-constrained census data when analysing concurrent data that is updated more frequently. Also identified was the accuracy problem of end-user supplied outcome codes, in this case failed delivery reasons.
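
As one simple illustration of ‘filling in’ temporally missing data, the sketch below uses linear interpolation between the nearest known values. This is an assumed example technique, not necessarily one of those presented at the forum, and the function name and figures are invented.

```python
def interpolate_gaps(values):
    """Fill interior None values by linear interpolation between known neighbours.

    Leading or trailing gaps are left as None because there is nothing to
    interpolate between.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Invented hourly counts with two missing readings.
print(interpolate_gaps([120, None, None, 150, 160]))
```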

With any spatial and temporal data, there is the challenge of providing a digestible visual display. With so much data available, this was acknowledged as a challenge that most of the presenters using geographical mappings faced.

As a data source, supermarket loyalty cards were discussed. Interestingly, it was found that loyalty card usage was least likely to occur for small and frequent purchases, no matter what type of store was visited or the socio-demographic classification of the customer. The map of users of a store showed a more dispersed geographical spread around the UK than expected. This highlighted the problem of customers failing to update their home address details when moving home and the subsequent difficulties in interpreting loyalty card spatial data.

However, when problems in the data were identified, this fed into the recurring observation that so-called bad data, that is, data identified statistically as problematic, should not always be removed or cleansed using missing data techniques. Instead, this so-called bad data could be the most interesting data of all for a researcher and/or commercial organisation. For example, studying people who don’t update their loyalty card details could lead to some very useful insights into such customers. Perhaps they are a very profitable segment?

Useful resources identified during the presentations included http://maps.cdrc.ac.uk, which includes views of geodemographic, retail and general metrics for the larger towns and cities. Various views are provided. One that seemed a useful barometer of high street health was the retail view, which for some towns (presumably only a few have the data available) shows changes to retailer types and vacancy rates over a set period of time.

Overall, it was a very good day. The presentations were very interesting and there was also the opportunity to meet and mix with other academics and business representatives.

About Me (Ed Dargan): For the last 10 years, I have worked in multi-channel organisations. In 2012, I started an MSc in Internet Retailing at Manchester Metropolitan University (MMU) and it was here I met Professor Cathy Parker who led the marketing strategy module. I support local food producers and retailers and Cathy provided the opportunity to take a look at some footfall data for a number of locations throughout the UK. At the time, the data sample was too small to investigate statistically but the initial view of the data revealed some interesting monthly patterns. I had the choice for my MSc dissertation of either pursuing the footfall data or looking into internet options for local food retailers and producers, and I took the latter option. However, my interest in the footfall data remained, so when the opportunity came to investigate the footfall data as part of a PhD, after a slight hesitation and sanity check, I grabbed the opportunity. My PhD is part-time and I’ve just finished the first year researcher course at Manchester Metropolitan University and am raring to get going exploring the data.

Below is a list of the sessions and presentations:

Session 1: Missing Data and Missing People

Thomas Waddington – Modelling the temporal variation in supermarket revenue estimates

Eusebio Odiari – Infilling missing values in consumer Big Data

Michail Pavlis – The geography of non-delivery

Emily Sheard – Enumerating the ambient population in the context of crime

Guy Lansley, Chrysanthi Kollia – The spatio-temporal geodemographics of youth

Session 2: Novel Data Sources and their Geographic Integration

Nik Lomax and Martin Clarke – Home owner mobility: assessing distance and geodemographic consistency using consumer data

Hai Nguyen, Oliver O’Brien – Naming conventions and ethnicity

Guy Lansley, Wen Li – Areas and activities: integrating consumer registers

Alyson Lloyd, James Cheshire, Roberto Murcio – How representative are high street retailer data?

Anastasia Ushakova – Temporal patterns of energy consumption and vulnerable consumers

Tim Rains – Data linkage of store loyalty cards

Session 3: Big Data and Research Design

Alex Singleton, Bala Soundararaj – Dynamic high streets – SmartStreetSensor

Mark Birkin – Spatial microsimulation, big data and policy analysis: an example from the UK travel market consumer data

Phani Chintakayala – Do green attitudes and demographics drive sustainable product consumption?