This Guidance illustrates how to use the AWS Lake Formation FindMatches machine learning (ML) transform in AWS Glue to harmonize, or de-duplicate, customer data from different sources. Data is generated by a large number of disparate sources and is growing rapidly. Companies face the task of ingesting this data, cleansing it, and using it to generate customer insights, even when the same customer appears in several systems under slightly different names, emails, or addresses. This Guidance provides an ML-based probabilistic approach that links such records so that you can build a complete customer profile and provide a better customer experience.
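To make the idea of probabilistic matching concrete, the following is a minimal, self-contained sketch in plain Python. It is not the FindMatches algorithm and does not call any AWS API; it only illustrates scoring record pairs by weighted field similarity and treating pairs above a threshold as likely duplicates. The field names, weights, threshold, and sample records are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity between two strings, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(r1: dict, r2: dict, weights: dict) -> float:
    """Weighted average of per-field similarities (weights are illustrative assumptions)."""
    total = sum(weights.values())
    return sum(w * similarity(r1[f], r2[f]) for f, w in weights.items()) / total


def find_duplicates(records: list, weights: dict, threshold: float) -> list:
    """Compare every pair of records and keep those scoring at or above the threshold."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = match_score(records[i], records[j], weights)
            if score >= threshold:
                pairs.append((records[i]["id"], records[j]["id"], round(score, 2)))
    return pairs


# Hypothetical customer records from three different source systems.
records = [
    {"id": "crm-1", "name": "Jane Doe", "email": "jane.doe@example.com"},
    {"id": "web-7", "name": "Jane  Doe", "email": "jane.doe@example.com"},
    {"id": "pos-3", "name": "John Smith", "email": "jsmith@example.org"},
]
weights = {"name": 1.0, "email": 2.0}  # assumed: email is more discriminating than name

for left, right, score in find_duplicates(records, weights, threshold=0.85):
    print(f"{left} and {right} look like the same customer (score {score})")
```

This sketch uses hand-picked weights and compares every pair of records, which does not scale to large datasets. An ML-based transform such as FindMatches instead learns what constitutes a match from labeled examples.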