Monday, September 29, 2014

The End of the Data Scientist Bubble...

By: Jean-Pierre Dijcks | Master Product Manager

Looking around northern California and inside many technology kitchens makes me believe that we are about to see the Data Scientist bubble burst. Then I read the Fortune Magazine article on Peter Thiel, with its excerpt from Zero to One (his new book), and it dawned on me that his theory is one of the interesting ways to look at the Data Scientist bubble.

Thiel's Classification of Innovation

Without trying to simplify and/or bastardize Mr. Thiel's theory, the example in the Fortune Mag article will make this visible to most people (I hope). In the article the analogy is: going from one typewriter to 100 typewriters is 1 to N, while inventing a word processor moves us from 0 to 1. In other words, true innovation dramatically changes things by giving previously unknown power to the masses. It is that innovation that moves us from 0 to 1. Expansion of existing ideas - not true innovation - moves us from 1 to N. Of course, don't take my word on this but read the article or the book...

The Demise of the Human Data Scientist

The above paradigm explains the Data Scientist bubble quite nicely. Once upon a time companies hired a few PhD students who by chance had a degree in statistics, had learned how to program and had figured out how to deal with (large) data sets. These newly minted data scientists proved that there is potential value in mashing data together and running analytics on these newly created data sets, and thus caused a storm of publicity. Companies large and small are now frantically trying to hire these elusive data scientists, or, a little more down to earth, are creating data scientists (luckily not in the lab) by forming teams in which each member brings a part of the skillset to the table.

This approach starts to smell pretty much like a whole busload of typewriters being thrown at a well-known data analysis and data wrangling problem. Neither the problem nor the solution is new or innovative. Data Scientists are therefore not moving us from 0 to 1...

One could argue that while the data scientist quest is not innovative, at least it solves the problem of doing analytics. Fair and by some measure correct, but there is one bigger issue with the paradigm of "data scientists will solve our analytics problem" and that is scale. Giving the keys to all that big data to only a few data scientists is not going to work because these smart and amazing people are now becoming, often unbeknownst to them, an organizational bottleneck to gaining knowledge from big data.

The only real solution, our 0 to 1, is to expose a large number of consumers to all that big data, while enabling these consumers to apply a lot of the cool data science to all that data. In other words, we need to provide tools which include data science smarts. Those tools will enable us to apply the 80% common data science rules to the 80% of common business problems. This approach drives real business value at scale. With large chunks of issues resolved, we can then focus our few star data scientists on the 20% of problems or innovations that drive competitive advantage and change markets.

My Conclusion

The bubble is bursting because what I am seeing is more and more tools coming to market (soon) that will drive data science into the day-to-day job of all business people. Innovation is not building a better tool for data scientists or hiring more of them. Instead, the real 0 to 1 innovation is tools that make all of us data scientists and let us solve our own data science problems. The future of Data Science is smarter tools, not smarter humans.

Join the discussion

Comments ( 14 )
  • guest Tuesday, September 30, 2014

    People unable to provide added value on data are not data scientists. Because one of the key components of data science is providing added value on data of all kinds. They might call themselves data scientists, but they are fakes. Real data scientists do exist, I'm one of them, and I provide added value to my company (I'm the CEO) and its clients, leveraging automated data science for the benefit of everyone. People like me are a dime a dozen, paid very well, but rarely found in large companies. This being an Oracle forum, it might explain why you think there's a bubble - a bubble of fake data scientists indeed.


  • guest Wednesday, October 1, 2014

    The "bubble" is going to burst because "1 to N" "paradigm" in some "technology kitchens" with some "big data" "organizational bottleneck"s in some Fortune article.

    You sound like you went to business school.

    I'm not a "data scientist", but I do some of the same things. All my employer's clients, and most corporations in general, are in desperate need of help with their data (not my job title but it's always been a big part of my job). This is only increasing with time.

    No matter how many paradigms you have, there's no magic software that can figure out business needs, then bring together all the data, clean it up, and interpret it. Software tools are improving but none I've seen are anywhere near that.

    A "data scientist bubble" is a pretty far fetched theory from my viewpoint. If you want to argue that there is a bubble, you need a clear argument and you don't have one.

    I've got to agree with the other guest comment. You and he both sound like you're trying to sell something, but he sounds like he knows what he's talking about and you sound like you've read a "paradigm" in a business magazine article trying to sell you a book. How you got to be a bigshot at Oracle, I don't know.


  • guest Thursday, October 2, 2014

    There is a reason why "companies large and small are now frantically trying to hire these elusive data scientists". It is because data scientists are the only people within organizations that have the combined business and technical skills required to deliver on the promise of Big Data. As a predictive analytics consultant, I've personally witnessed bad decision-making made by non-technical analysts as a direct result of using these so-called "smart tools". By analogy, it's akin to giving a carpenter some architectural guidelines and telling him that he, too, can now be an architect. Would you have him build YOUR home? To your point, I agree that software companies have been trying to convince their clients that it's now time to bring the masses closer to data and the analytics -- with the argument that the cost savings associated with getting rid of those pesky, expensive, hard-to-find, slow-in-producing PhD types will be more than offset by the value of the resulting mass-produced "dumbed-down" analytics. Only time will tell how things will unfold. In terms of a bubble? I'm not worrying because, at worst, instead of getting involved in analytic projects on the front-end, I'll make my money on the back-end fixing the problems created by people like YOU!!


  • guest Thursday, October 2, 2014

    I work for a small company and our data is cleaner and more organized than at any point in its past. We do not have data scientists, we have a staff of two novice analysts. We work with both opensource and commercial database products and Hadoop is off on the horizon.

    Maybe this is naive, but in the past three years we went from distributed legacy systems to coherent, modern systems and it took some tough decisions, but the next big systems change will be much much easier. If this occurs once more in the next two or three years, our staff will never go down the path of a full-time data scientist.

    Not because our data is perfect, but the number of data prep, API as a Service, MLAAS, you name it... will be so ubiquitous that we will be far more likely to send our data through a 5 step process to acquire our retention prediction scores, or whatever.

    So, maybe my company would have never hired a data scientist. I can agree that would have been like using a fire hose to douse a candle, but even if we were tempted, the temptation has faded.

    As a rebuttal to an above comment - "There is no magic software that can figure out business needs". Well, does that mean that there won't be? Even with my small intellect, I can see a time when your systems are running several different sets of business rules at the same time and allowing C-Suite members to interpret what direction the business should be moving. Thereby negating the need for decision making at the beginning of the process. Why not allow all data to run thru 20 odd scenarios and make adjustments along the way?


  • guest Thursday, October 2, 2014

    This is a hoot coming from "the product management team" in a company that can't figure out its own products, let alone how to use them.

    If you can't come up with original thought, no worries, just steal or trash someone else's. ...the second option is easier.

    Way to go. Sell more product.


  • guest Friday, October 3, 2014

    Who will build smart tools?

    Smart data scientists, and as long as we need smart tools we will need smart data scientists ...


  • guest Friday, October 3, 2014

    "The bubble is bursting because what I am seeing is more and more tools coming to market (soon) that will drive data science into the day-to-day job of all business people. "

    Bring the tools and I'll bring the potato chips and we can both just watch it all burn.

    PS the math question below is HARD, I need a tool!!


  • guest Friday, October 3, 2014

    There is a 50% probability this was ghost written by the junior intern for the marketing department of Oracle. Or not. LOL.


  • Murray Newton Saturday, October 4, 2014

    I don't think the author is saying that there is no need for a Data Scientist, but rather that poor reporting and analytics in the past have created heightened demand for the notion of needing a Data Scientist.

    Now that a lot of talented people in a lot of companies have improved those tools, the demand for Data Scientist will be reduced. Certainly there is still room for those that can really understand how to 'ask a question' of the data and find the answer.

    My personal interpretation of this blog post is that the author is probably correct.

    Some 'guest' posts would make me think that they see the writing on the wall - however are in the business of selling their services as a "Data Scientist for hire" - and need to discredit the notion that analytic and reporting engines are improving. Shame on you. If you are a self proclaimed Data Scientist and are bringing value, you should be fine and don't have to mud sling. Are you really going to say that technology hasn't advanced here?


  • guest Wednesday, October 8, 2014

    I'm afraid I will disagree with most of the comments, and actually agree with the author in the following: "Data Science" has not enriched a bit our arsenal of mathematical and statistical theory and models whatsoever, hence it is not a paradigm shift. Effectively we use techniques that are well known, extensively studied and deeply understood by real scientists the last 50-100 years (logistic regression, classification trees and the such). The only advances that are associated with "Data Science" are technological and not scientific.


  • guest Wednesday, October 8, 2014

    "Data science" may not be real science, and "data scientist" might be a bullcrap job title, but the job descriptions posted under "data scientist" are skilled work and high value. And the demand for this work is only going to increase over time, as the amount of data and potential uses of it increase.

    To make the point: tools like Wordpress made it so only mildly skilled workers can make a sophisticated webpage. Did the "web developer bubble" collapse? I know a lot more web developers making good money now than 10 years ago.


  • guest Tuesday, August 15, 2017
    A good data scientist is not a robot that feeds data in his pipeline and then pulls out predictions to present. He has to be a domain expert, a good analyst, and a good story teller, as well as a good machine learning engineer.

    I agree that there are many machine learning and visualization toolboxes that have relieved us from coding from scratch. But a good toolbox doesn't mean you don't need to understand it to use it, and for most advanced machine learning techniques, you actually need a lot of experience to tune and train them right for the job.