  • April 16, 2018

Growing a Data Function in a Data-Immature Environment

According to Gartner, 90% of large organizations will have a Chief Data Officer by 2019. This manifesto of data-drivenness in today’s business world hardly comes as a surprise in the age of Big Data and AI. And alongside large companies, organizations of more modest size are also getting the memo and jumping on the bandwagon.

This means that many data leaders will soon be called to support the ambitions of these companies and step in as new Chief Data or Analytics Officers, Chief Data Scientists, or Data Science managers. Regardless of their title, these individuals will be assigned the same challenging task: infusing a culture of data across the board.

As the instigator of a new culture within their company, the data leader will go through an incredibly enriching experience, but also meet many hardships and frustrations. However, of all the challenges they will face, the most tedious one will be to determine the optimal way to build credibility and communicate effectively with the stakeholders and their peers, regardless of their respective backgrounds and areas of expertise.

Communicating with Stakeholders

Since the desire to initiate a data function often originates with the decision-makers themselves, these new data leaders generally don’t have to work too hard to sell senior leadership on the impact of data to the business. However, their luck usually ends right here, as most companies are embracing the trend mostly out of a fear of being left behind and have little to no knowledge of what becoming data-driven actually requires of them.

An important part of a data leader's (and evangelist's) role revolves around educating stakeholders. This generally starts by managing the expectations of a novice audience and conveying that data science is neither “dark magic” that can resolve any problem, nor a temporary fad in the tech industry. This tends to be a particularly tedious task due to the amount of hype around artificial intelligence and machine learning, and the lack of appropriate terminology differentiating the various areas in data science.

Companies frequently invest in a data team because they have accumulated significant amounts of data that they hope to capitalize on somehow. This is encouraging, as it indicates that they embrace the monetizability of their data, a belief at the very core of data-centricity. However, uninformed decision-makers usually start a data function without a specific agenda in mind and also fail to realize that some, if not most, of the data might not be actionable. This is where the talents of the data leader get put to the test: he or she has to be able to present a compelling argument to both technical and non-technical audiences on what investments need to be made in terms of instrumentation, third-party data, or technology.

Finally, leaders of a young data function will have to clarify to business stakeholders that return on investment for a data science team is substantially different from that of other engineering functions. Data departments are particularly expensive to build up and run because of the cost of both labor and technology. Substantial amounts of money are typically invested before customers can interact with their first data product, and this can, unsurprisingly, be a tough pill for both investors and key leadership to swallow. That’s why successful data science leaders strive to communicate their team’s progress continuously, even while the team is still working on building its infrastructure.

Working with Engineering Teams

Organizations that aspire to data-drivenness often start out as engineering-focused organizations, and their engineers have long lived and breathed methodologies and processes optimized for their trade. What makes data departments so unique and challenging to manage, though, is the dual nature of data science, which is part engineering, part research. Because young data teams are almost always embedded into the engineering organization initially, data leaders are frequently misunderstood or even treated condescendingly by their engineering peers.

Interestingly though, because they are traditionally smaller and function under a higher level of uncertainty than engineering teams, data science teams might just be perfect candidates to adopt the Agile methodology, assuming of course that the necessary modifications are made to accommodate them. A successful Agile framework is highly desirable because it will simultaneously improve the efficiency of data teams, significantly limit duplication of work, and make the entire data science process more relatable to the engineering community. That being said, while the Agile methodology has been developed and perfected for engineering teams over the years, it is still unclear what Agile means for data scientists.

Ultimately, it is critical that engineering teams that collaborate closely with data scientists acknowledge and respect the difference in process and pace, and it is yet again the data science manager’s role to drive this cultural change.

Engaging the Product Managers

Product managers are particularly important to the success of data teams that develop customer-facing data products, such as search engines, recommender systems, or news feeds. However, there are quite a few differences between mainstream product management and data product management.

For instance, the data product manager needs to show more proactivity, and a genuine willingness to leave more room for innovation (simply put, they must be willing to take some risks). And naturally, it is again the role of the data science manager to instill this new mindset. In order to develop reliable models, data scientists need a fair amount of historical data, which requires that data be collected months—if not years—in advance.

The data product manager has to constantly anticipate the product needs of the company and its customers. He or she also has to adopt a different position by identifying problems to solve rather than just suggesting solutions. Indeed, the data science process is based on a trial-and-error approach, and identifying the best approach to a specific problem is better left to an experienced data scientist.

Rather than gathering product requirements, the data product manager has to partner with the data team in order to explore all the possible solutions to the customer’s problem while remaining open-minded and avoiding any preliminary assumptions.

Attracting Top Talent

Last but not least, data leaders will have to take on the difficult task of recruiting. Attracting top talent is hard for mature and immature data organizations alike, because access to data talent is low overall. However, hiring data scientists in young data functions comes with two additional challenges:

  1. The top contributors to emerging data science teams are typically self-starters (a characteristic often met in more experienced people).

  2. With no prior projects using the company’s data, it is often hard to tell a good story in order to attract talent.

It is critical to note that the average data scientist is usually not the right person to fill the role of founding member of the team. Perfect candidates for such roles are outstanding data scientists who also have strong software engineering skills, a lot of creativity, and a strong bias for action that allows them to get things done with minimal resources. Such professionals are evidently rare and expensive, and they can easily get jobs at the most prestigious organizations, which leaves little wiggle room for data-immature companies.

When an established company suddenly starts hiring for a brand-new data team, candidates tend to question whether the company is taking its data strategy seriously. They (understandably) wonder what took the company so long to invest in data. As a consequence, it is much harder to sell them on the job. Data scientists are naturally attracted by interesting data and challenging problems to solve, and it can be difficult to tell a good story when the initiative is brand new and there aren’t many stories to be told.

A Setup for Success

Building a data function from scratch is a tedious task that is not for the faint of heart. In fact, many senior data scientists choose to remain in individual contributor roles for that very reason.

Since successfully establishing a data function can only be done within the right environment and with the support of the right people, the best data leaders are those who, through patience and perseverance, find ways to educate and get the entire organization involved in the process.
