To make the UNFI & KeHE business health dashboards easier to use Crisp makes the following enhancements to the source data:
- Cleaning up store and chain naming to improve readability.
- Geocoding stores to provide a more accurate address.
- Consolidating duplicated data from the source to reduce redundancy.
Data transformation details
We also performed the following actions to enhance the data:
- Manually inspected and cleaned anomalies/ambiguities:
- isolated blank or “unknown” UNFI chain codes
- identified independent stores
- stripped special characters
- normalized common name variations (e.g. “MKT” becomes “Market”)
- assigned each store a unique key so that duplicates could be removed
- Parsed chain and store names from store description, extracted store numbers from store description and placed them in separate field
- Data referencing “Sprouts #1 FZN” and “Sprouts Farmers Market 1” both get identified as store name “Sprouts Farmers Market” and store number “1”
- Geocoded all stores to give them a more accurate address
- Match store data against publicly available data on the web (grocer websites) to fill in blank address or store number information
- Reformatting store names whenever possible to unify naming conventions across all connected data sources
Since we've enhanced the geolocation data, you may see differences from the source data for metrics that rely on geolocation data (namely, POD counts for certain chains/banners). Also, POD count is an input into our velocity calculation so you may also see differences in your velocity numbers compared to the source data. In most cases, if your velocity does differ, it will go up rather than down.
The following example demonstrates how POD counts may differ and how it would impact velocity.
Note: This example is for demonstration purposes only and may reflect changes that are greater than the norm.
KeHE connector looking at the last 52 weeks of sales:
Sales quantity = 619981
Sales amount = 1140192
Current store count = 2677
Current PODS (unique store product combination) = 8344
Current Unit per store per week (velocity) = 619981 / (2677 * 52) = 4.45
New store count = 1541
New PODS = 6419
New Unit per store per week = 619981/ (1541 * 52) = 7.73