This installment of the Data Science Maturity Model (DSMM) blog series contains a summary table of the dimensions and levels. Enterprises embracing data science as a core competency may want to evaluate what level they have achieved relative to each dimension - in some cases, an enterprise may straddle more than one level. As a next step, the enterprise may use this maturity model to identify a level in each dimension to which they aspire, or fashion a new Level 6.
| Dimension | Question | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 |
|---|---|---|---|---|---|---|
| Strategy | What is the enterprise business strategy for data science? | Enterprise has no governing strategy for applying data science | Enterprise is exploring the value of data science as a core competency | Enterprise recognizes data science as a core competency for competitive advantage | Enterprise embraces a data-driven approach to decision making | Data are viewed as an essential corporate asset - data capital |
| Roles | What roles are defined and developed in the enterprise to support data science activities? | Traditional data analysts explore and summarize data using deductive techniques | Introduction of 'data scientist' role and corresponding skill sets to begin leveraging advanced, inductive techniques | Chief Data Officer (CDO) role is introduced to help manage data as a corporate asset | Data scientist career path is codified and standardized across the enterprise | Chief Data Science Officer (CDSO) role introduced |
| Collaboration | How do data scientists collaborate with others in the enterprise, e.g., business analysts, application and dashboard developers, to evolve and hand off data science work products? | Data analysts often work in silos, performing work in isolation and storing data and results in local environments | Greater collaboration exists between IT and line-of-business organizations | Recognized need for greater collaboration among the various players in data science projects | Broad use of tools introduced to enable sharing, modifying, tracking, and handing off data science work products | Standardized tools introduced across the enterprise to enable seamless collaboration |
| Methodology | What is the enterprise approach or methodology to data science? | Data analytics are focused on business intelligence and data visualization using an ad hoc methodology | Data analytics are expanded to include machine learning and predictive analytics for solving business problems, but still using an ad hoc methodology | Individual organizations begin to define and regularly apply a data science methodology | Basic data science methodology best practices established for data science projects | Data science methodology best practices formalized across the enterprise |
| Data Awareness | How easily can data scientists learn about enterprise data resources? | Users of data have no systematic way of learning what data assets are available in the enterprise | Data analysts and data scientists seek additional data sources through "key people" contacts | Existing enterprise data resources are cataloged and assessed for quality and utility for solving business problems | Enterprise introduces metadata management tool(s) | Enterprise standardizes on a metadata management tool and institutionalizes its use for all data assets |
| Data Access | How do data analysts and data scientists request and access data? How is data access controlled, managed, and monitored? | Data analysts typically access data via flat files obtained explicitly from IT or other sources | Data access available via direct programmatic database access | Data scientists have authenticated, programmatic access to large volume data, but database administrators struggle to manage the data access life cycle | Data access is more tightly controlled and managed with identity management tools | Data access lineage tracking enables unambiguous data derivation and source identification |
| Scalability | Do the tools scale and perform for data exploration, preparation, modeling, scoring, and deployment? As data, data science projects, and the data science team grow, is the enterprise able to support these adequately? | Data volumes are typically "small" and limited by desktop-scale hardware and tools, with analytics performed by individuals using simple workflows | Data science projects take on greater complexity and leverage larger data volumes | Individual groups adopt varied scalable data science tools and provide greater hardware resources for data scientist use | Enterprise standardizes on an integrated suite of scalable data science tools and dedicates sufficient hardware capacity to data science projects | Data scientists have on-demand access to elastic compute resources both on premises and in the cloud with highly scalable algorithms and infrastructure |
| Asset Management | How are data science assets managed and controlled? | Analytical work products are owned, organized, and maintained by individual data science players | Initial efforts are underway to provide security, backup, and recovery of data science work products | Data science work product governance is systematically being addressed | Data science work product governance is firmly established at the enterprise level with increasing support for model management | Systematic management of all data science work products is used with full support for model management |
| Tools | What tools are used within the enterprise for data science objectives? Can data scientists take advantage of open source tools in combination with high performance and scalable production quality infrastructure? | An ad hoc array of non-scalable tools is predominantly used for isolated data analysis on desktop machines | Enterprise manages data through database management systems and relies on extensive open source libraries along with specialized commercial tools | Enterprise seeks scalable tools to support data science projects involving large volume data | Enterprise standardizes on a suite of tools to meet data science project objectives | Enterprise regularly assesses state-of-the-art algorithms, methodologies, and tools for improving solution accuracy, insights, and performance, along with data scientist productivity |
| Deployment | How easily can data science work products be placed into production to meet timely business objectives? | Data science results have limited reach and hence provide limited business value | Production model deployment is seen as valuable, but often involves reinventing infrastructure for each project | Enterprise begins leveraging tools that provide simplified, automated model deployment, inclusive of open source software and environments | Increased heterogeneity of enterprise systems requires cross-platform model deployment, with a growing need to incorporate models into streaming data applications | Enterprise has realized benefits of immediate data science work product (re)deployment across heterogeneous environments |
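The per-dimension self-assessment described above can be sketched in code. The following is a minimal, hypothetical illustration, not part of the published model: the dimension names come from the table, but the `maturity_gaps` function, its scoring scheme, and the choice to count a straddled dimension at its lowest level are all assumptions made here for clarity.

```python
# Hypothetical sketch of a DSMM self-assessment.
# Dimension names are taken from the maturity model's summary table.

DIMENSIONS = [
    "Strategy", "Roles", "Collaboration", "Methodology", "Data Awareness",
    "Data Access", "Scalability", "Asset Management", "Tools", "Deployment",
]

def maturity_gaps(achieved, aspired):
    """Return the gap (aspired level minus achieved level) per dimension.

    `achieved` maps each dimension to a level 1-5, or to a set of levels
    when the enterprise straddles more than one; the lowest straddled
    level is used, on the view that a dimension is only as mature as its
    weakest aspect. `aspired` maps each dimension to a target level,
    defaulting to 5 where unspecified.
    """
    gaps = {}
    for dim in DIMENSIONS:
        level = achieved[dim]
        if isinstance(level, set):  # enterprise straddles multiple levels
            level = min(level)
        gaps[dim] = aspired.get(dim, 5) - level
    return gaps
```

For example, an enterprise that straddles Levels 2 and 3 in Roles (`{"Roles": {2, 3}}`) and aspires to Level 4 there would show a gap of 2, highlighting where investment is needed next.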
The Data Science Maturity Model is also available as a spreadsheet and as a whitepaper. I hope you found this series useful, and I welcome hearing from you about your experience using this Data Science Maturity Model.