Samuel Roth, MBA ’19, wrote this on behalf of his team in Olin’s Center for Experiential Learning practicum program.
The MilliporeSigma team received more than 50 disparate data sets with tens of thousands of rows of data, ranging from customer interaction logs to water quality measurements to technician feedback logs. The team was asked to take the data and answer a seemingly simple question: For customers of lab water purification systems, when are service events likely to occur, and what are the primary indicators of an imminent service event?
Coming from a business school mindset, the team of four master of customer analytics students and two MBA students initially wanted to organize the data to create a model that would maximize economic benefit for MilliporeSigma. However, the client noted that the team needed to approach the problem without bias toward organizational objectives.
Team members rolled up their sleeves and began analyzing the data, only to find discrepancies in the records that defied explanation. How could the data indicate a technician made a repair on a machine that had never been installed? This discovery led the team to realize that every piece of data included in the model had to be rigorously scrutinized for how accurately it reflected the real world.
Painstakingly, the team cleaned, examined, and again cleaned the data to avoid the phenomenon of “GIGO” (garbage in, garbage out). The client pivoted its expectations upon recognizing how much work was required just to prepare the data. The new measure of success: simply creating a file that provided clean enough input for machine learning models to analyze.
Exceeding expectations, the team produced a file that is machine-learning ready with four weeks remaining to derive insights from statistical learning models.
The team has endured major pivots at nearly every turn in the project and has come to recognize that this is how business is done. MilliporeSigma and the CEL have provided the team an amazing opportunity not only to apply ivory tower modeling techniques taught in academia, but also to experience firsthand how challenging it is for organizations to patch their data together and draw insight about the real world from it.
Pictured above: Nithin Tiruveedhi, controller, BRM and diagnostics, MilliporeSigma; Robert Woody, MSCA ’18; Claire Xu Yiwen, MSCA ’18; Samuel M. Roth, MBA ’19; Seungho Oh, MBA ’19; Leah Zhang Chuyi, MSCA ’18; and Kunnan Liu, MSCA ’18.