
Lessons From a Canceled Data Science Project

There is a lot of excitement around data science and its application to business challenges. However, given the experimental and iterative nature of data science, its ROI is typically viewed with a level of skepticism. Results matter, particularly if data science efforts are to successfully drive business transformation. So how do you increase the odds of success?

At a previous organization, I worked on developing an automated method for sorting fruit based on firmness in order to increase the economic value of the harvest. The traditional method involved manually puncturing a few apples in a load and measuring their firmness; the load would then be priced based on the results. Ultimately, our project was canceled and I had to learn some valuable lessons the hard way. In this article, I share some principles that may help prevent your next data science project from being canceled.

The Project

As with any data science project, the right data set had to be collected. Since apples don’t leave behind any digital trails, we developed a device to take measurements. The device passed sound waves through an apple, much like a sonogram. After many iterations and statistical models (even some neural networks), we identified the key determinant of apple firmness. We even performed some good old-fashioned finite element 3D mechanical modeling to validate our results. We were able to put an apple through the device and determine its condition.

While this was all great science and engineering work, we discounted the question of how the device would operate in real conditions outside of the lab, which we quickly discovered in the testing stage. We faced two challenges once we tested the device in real-life situations and started sorting. First, each apple had to be held for a moment in the device to make a measurement. Imagine the physical limitations of millions of apples passing through a line and each needing to be measured. The second challenge was tougher to overcome. To make a good measurement, the device had to squeeze the apple hard enough to bruise it. This defeated the whole purpose of our project! In retrospect, we should have known this, since sonogram devices at the doctor’s office also have to be pressed hard to make a measurement.

Nevertheless, we pressed on and focused on making our analytical models more accurate. After all, the prototype and models were working and measuring things correctly. Despite what we had discovered, we did not communicate with the process and manufacturing engineers, and assumed that they would figure out a way to make it all work. Had we included more diverse expertise in the team, I am confident that we would have come up with a more practical solution.


Many projects start with good intentions and great goals, but somewhere in the middle we tend to get too focused on the activities and technical challenges instead of results. The pain usually comes when these projects and ideas have to be implemented in real life. Typically, the challenges revolve around technical implementation, business process changes, and the complexity of managing both. Keep the following principles in mind to improve the odds of success on your data science projects.

  1. Keep the main objective in mind and don’t ignore early warning signs.

  2. Consider the end-to-end process and validate against the whole as more is learned.

  3. Collaborate and listen to functional teams, as the diversity and expertise of others help reduce overall risk.

