As JFK said about going to the moon, "We choose to go to the moon in this
decade and do the other things. Not because they are easy, but because
they are hard.", the same is true of customer and product integration and mastering. MDM as a technology and a process is not easy it is hard but it provides so much value in the end that it is worth the journey to achieve.
The challenge of MDM is focused squarely on creating a technical solution which enables the business to automate the process of matching customer and products into a single master list. This can take significant effort to get to the point where the rules you defined for matching are meaningful and effective.
The project I was involved in required us to create a solution which was cost effective and did not include the use of a matching product like DataFlux or Trillium but was based in the database and ETL tool. Our database of choice was Oracle which provided some SQL extensions to support matching. We implemented the matching within an ETL tool (Talend) which further extended our capabilities which we had in the database. A number of functions were consider and the following Oracle functionality was used in our cleansing and matching approach:
- Regular Expressions where used to find patterns and remove and alter to enable a standardization of names and addresses
- Equi-joins and other join types to match
- Soundex or Metaphone function in combination with other matches to enable fuzzy matches
- Jaro-Winkler, Levenshtein and Distance functions for fuzzy matching
- ETL Tool Functionality which further extends the base database functionality
I will be presenting this solution at COLLABORATE13 in Denver in April, and this entry should help you as you consider an alternative approach to matching which will be critical to your MDM solution.
No comments:
Post a Comment