The Data Science Struggle: Why Organizations Routinely Fail To Realize The Full Potential From Their Data Science Efforts

By Hannah M. Mayer, Luca Vendraminelli, Timothy DeStefano

One of the world’s toughest challenges at the height of the Covid pandemic was the timely allocation of vaccines to the people and areas that needed them most. Italy, like so many other countries, worked hard to address this daunting optimization problem. One major Italian city decided to introduce a web portal to trace all vaccine movements across the supply chain in an effort to rapidly deliver jabs to its one million residents. Leveraging this data, the system autonomously allocated vaccines to distribution centers and automatically assigned appointments to citizens – an initiative pegged as a cornerstone of the local vaccination campaign. However, the AI solution was never used at scale and thus failed to deliver real value to stakeholders. What happened?

This failure of data science to solve a critical problem is hardly an exception. Research by the Laboratory for Innovation Science at Harvard has demonstrated that only a portion of data science projects across a wide range of sectors, such as automotive, biotech, retail and the public sector, solve the challenges they target and lead to at-scale performance gains. Why do so few succeed? The research shows that the culprit is usually one or more of four common pitfalls, all of which the Italian city fell prey to.

1. Absence of a specific, relevant problem to be solved

One shortcoming at the outset of many data science efforts is the lack of proper framing of the problem data scientists should solve, resulting in talented people applying their technical skills to a wide array of questions, many of which may not be relevant to users. Data scientists oftentimes start the innovation process by running regressions rather than listening to users. What is needed instead is a clear and concise problem-framing activity that ensures the envisioned solutions have maximum relevance to the end user.

The Italian city is a case in point. The main challenge was not the technical difficulty but the lack of definition of the problem to be solved. Without concrete guidance on what would be relevant to users, data scientists ended up building a convoluted dashboard of metrics. The core issue of how to efficiently allocate vaccines to vaccination locales was not answered clearly because it was never defined. Instead, the portal addressed questions related to expiry date management and fleet management – surely relevant, but secondary to the leading issue.

When users – including local government staff, doctors, nurses and volunteers – failed to see the portal address their biggest challenge, they chose to rely on the old manual Excel spreadsheet to make their calculations, even if this was more laborious.

2. Low ease-of-use of the technology

Even when a relevant, targeted problem has been defined, and the data science team has managed to answer it, making the insights accessible and relatable to laymen users presents the next challenge. The role of user centricity, design thinking, and seemingly simple things such as a superior UX/UI are often overlooked when adopting AI solutions.

The Italian vaccine allocation AI had problems with its user interface: not only was it unclear to users what the solution was designed for, but the interface itself was just a wild array of numbers, the underlying calculations of which users did not comprehend. Despite being fully aware of the drawbacks of the Excel-based predecessor, users still preferred a solution they knew how to operate over the more effective, albeit opaque, AI-powered tool – a painful and expensive lesson for the Italian city on the primacy of solution usability over model accuracy.

US-based luxury fashion holding Tapestry, the parent of the brands Kate Spade, Coach and Stuart Weitzman, knows that instilling a more design-driven mindset in data science is key. Product allocation to distribution centers was historically run using a naïve algorithm, which was recently replaced by an ML model with higher prediction accuracy. Even though the new model performed better, the allocators frequently opted for the old one. A key reason for this was poor usability. “Like in many companies, the data science and allocation teams were using very different jargon,” Fabio Luzzi, VP of Data Science at Tapestry, explains. “We did not understand that when allocators talked about style and colors, data scientists talked in math, resulting in our initial solution being ill-suited for its target audience. We discovered the value of adopting a human-centric mindset when we started to empathize deeply with allocators and their cognition processes,” Luzzi points out. What he and his team discovered was that allocators needed very particular numbers to pop up on their screens in a very easy-to-relate layout. If the numbers or layout changed, their routines were disrupted, and they got lost and were unable to do their jobs efficiently. “We learned that a solution is only useful if its design ensures a fit with user habits,” Luzzi summarizes. “Once we incorporated that learning, we saw a significant adoption uptake and ultimately performance increases,” he concludes.

3. Poor integration of data science in the product teams and across the organization

Another key pitfall arises when data science teams end up working in a vacuum, focusing solely on the technical task at hand and possibly getting carried away with the intellectual challenge. Organizational embeddedness of data science is key: data science should permeate the company and cross departmental boundaries. At the same time, giving data scientists the full context of their work and instilling the vision for how it contributes to the end product is indispensable. Too often the emphasis is put on developing code rather than developing a product.

One company that also knows this well is tire manufacturer Pirelli, which generates over $5 billion in revenue yearly. Its digital transformation is designed to shift the company from its current B2B model toward a B2B2C business model. One part of that is smart tires equipped with sensors, whose data Pirelli can use to provide information and services to car makers and fleets, thus getting closer to end customers. A key organizational enabler for this is the pivot toward a data-driven culture, with every single division empowered to employ its own approach toward data science. One data steward per division leads the charge and functions as the single point of contact to a centralized team. This allows Pirelli to create tools with the end users in each of the divisions in mind and avoid building models just for the sake of building models. It also enables the company to break boundaries between departments. A focus on the unifying vision of an improved end product helps tear down organizational and data silos, accelerating the speed to impact.

4. Lack of effort to make people familiar and comfortable with the solution

Finally, when the solution is ready, too little time is often spent on socializing it around the organization, familiarizing teams with it, and demonstrating its value to users. Oftentimes solutions are not properly adopted because users do not trust them, choosing to stick to their established ways of doing things. Failure of data science efforts at this stage is particularly painful because substantial resources have already been invested to get there. This played into the Italian city’s ill-received vaccine allocation optimization as well. Users were neither trained on how to navigate the portal that was created for them nor motivated or incentivized to adopt it, and they ultimately chose to use their old allocation sheets instead.

Also consider Pirelli’s manufacturing division, which now uses an easy-to-operate algorithmic model to estimate the industrial yield of new products, superseding the prior Excel-based model. To demonstrate the benefit of the AI to users, they compared the accuracy of the profitability and yield prediction from the new model to that of the old, and thus illustrated to the team how the new approach would provide better insights, while also saving them time thanks to a much-improved user interface and experience.

“Whoever the end-user – from top managers to operational colleagues – if they aren’t able to use it, the solution has already failed. Only when people have trust that the solution will provide the value that was discussed and that it will be reliably delivered, there’s a chance that it ends up getting used,” says Daniele Petecchi, Head of Data Science and Data Management at Pirelli. “The neural network is not the end game. It is ultimately about allowing people to do their jobs more easily to create a better product. People must be able to trust a solution before they will end up using it,” he stresses.

Designing and deploying AI solutions that solve business problems with a clear value-add to users is an often-encountered challenge for digitally transforming organizations. Leaders that aim to build a data-centric organization or even an AI factory need to realize that the transformation is not achieved only at the level of technical capabilities. Of course, data quality, quality of code, and data privacy are a challenge, and so are the complexities of training an ML algorithm. However, transformations become successful by understanding what problem needs to be solved, focusing solely on those that represent value to users if addressed; ensuring ease-of-use of the technology deployed; building the path for data science to permeate throughout the organization, hence giving data scientists insights into how their solutions end up getting used by teams and how they add value to the end product; and ultimately creating trust in the solution by the people that will use it.

_______________

This piece is based on a cooperation between Forbes.com Contributor Hannah M. Mayer; Luca Vendraminelli, Postdoctoral Research Fellow at the Politecnico of Milan, Italy; and Timothy DeStefano, Associate Professor at Georgetown University. All three authors have current or former affiliations with the Laboratory for Innovation Science at Harvard (LISH), and the Digital, Data, and Design Institute at Harvard (D^3).