Global auto firm identified ~185K customers by linking decentralized data using R scripts
08 Apr 2020
We helped a global luxury car brand integrate customer data from multiple sources, using R scripts to clean the data and link records that belong to the same customer.
Context
- Client Description: A large global automotive player
- Opportunity: The client wanted to consolidate its Indian customer data, which was spread across multiple decentralized platforms, into a single centralized platform
Our Approach
- The work was split into two processes: (1) Cleansing: existing data stored on different platforms was cleaned and standardized to the desired output format, (2) Merging datasets: because the data had no primary key, customer identity was established from combinations of different data fields
- The cleaning and merging logic was implemented and run in R
- Existing data was examined closely to identify recurring patterns, and these patterns were analyzed to define the cleaning rules
- Existing datasets were split into smaller sub-datasets so that the code ran efficiently. This also allowed customized logic to be defined for each sub-dataset
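
The two processes above can be illustrated with a minimal sketch. It is written in Python for illustration only; the project itself used R, and this is not the team's actual code. The field names (name, phone, email), the two key combinations (name + phone, name + email), the last-10-digits phone convention, and the sample records are all assumptions, since the case study does not list the fields or rules that were used.

```python
import re
from collections import defaultdict


def clean_record(rec):
    """Standardize one raw record (hypothetical fields: name, phone, email)."""
    # Collapse repeated whitespace and upper-case the name.
    name = " ".join(rec.get("name", "").split()).upper()
    # Keep digits only and compare on the last 10 (assumed convention so that
    # numbers with and without a country code or leading zero line up).
    phone = re.sub(r"\D", "", rec.get("phone", ""))[-10:]
    email = rec.get("email", "").strip().lower()
    return {"source": rec.get("source", ""), "name": name,
            "phone": phone, "email": email}


def match_keys(rec):
    """Build composite keys; a key is skipped when any of its parts is empty."""
    keys = []
    if rec["name"] and rec["phone"]:
        keys.append(("name+phone", rec["name"], rec["phone"]))
    if rec["name"] and rec["email"]:
        keys.append(("name+email", rec["name"], rec["email"]))
    return keys


def link_customers(records):
    """Group records that share at least one composite key (union-find)."""
    cleaned = [clean_record(r) for r in records]
    parent = list(range(len(cleaned)))

    def find(i):
        # Follow parent pointers to the group root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # composite key -> index of the first record carrying it
    for i, rec in enumerate(cleaned):
        for key in match_keys(rec):
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    groups = defaultdict(list)
    for i, rec in enumerate(cleaned):
        groups[find(i)].append(rec)
    return list(groups.values())


# Made-up sample records from three hypothetical source platforms.
raw = [
    {"source": "sales", "name": " asha  rao ", "phone": "+91 98765-43210", "email": ""},
    {"source": "service", "name": "Asha Rao", "phone": "09876543210",
     "email": "Asha.Rao@example.com"},
    {"source": "crm", "name": "ASHA RAO", "phone": "", "email": "asha.rao@example.com"},
    {"source": "crm", "name": "Vikram Shah", "phone": "9123456780", "email": ""},
]
customers = link_customers(raw)
```

In this sample, the three "Asha Rao" records end up in one group even though the first and third share no field directly: the second record links them through its phone number and email. The "Vikram Shah" record forms its own group. Running the same function on each sub-dataset with its own version of `clean_record` would be one way to apply customized logic per sub-dataset.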


