A Dearth of Data Scientists

By John Burke
On Jan 15, 2013

If we had to pick a collective noun for data scientists -- you know, the word that stands for a group of them, like a parliament of owls or a murder of crows or a pride of lions -- I submit that the right word for the moment is a dearth.  Our recently launched 2013 enterprise technology benchmark is already making this clear just one week into interviews: companies are coming to recognize, and to feel acutely, the lack of staff who are comfortable with and capable of analyzing data, distilling meaning from it, and presenting that meaning concisely.  One recent interviewee noted that the company had finally been able to hire someone into a data analytics position, but bemoaned the fact that this one person would be shared across the whole company, so only a few projects and categories of data would benefit from her attention.

Yesterday I spotted a press release describing an upcoming MOOC (massive open online course) dedicated to data visualization -- teaching people how to analyze and condense data into charts or figures that convey information densely but comprehensibly.  It is being offered by Indiana University and is among the first MOOCs from IU.  In the press release, the university says the course is among the first to give students the chance to work with "real clients," including businesses, government agencies, and researchers in many fields.

It seems to me that this MOOC is part of the opening trickle in what will become a flood.  We will see, in short order, hundreds of similar offerings and a rapid expansion of relevant for-credit academic and professional certification programs, involving everyone from community colleges to research universities such as IU to big vendors in the big data space, such as IBM and EMC.  We saw something similar back in the 1990s, when the enterprise suddenly needed network admins and programs -- heck, an entire industry -- arose to provide them.

One significant difference between the rise of the network admin and the needed/desired rise of the data scientist is preparation: you can take a lot of folks whose background is not in IT and bring them up to speed as network admins in a relatively short time.  It is harder to take someone without a fair background in and comfort with math and turn them into a data scientist.  This suggests that we may be stealing from the pool of math-happy folks we have already tapped pretty well in healthcare, financial services, and elsewhere, which means competition for scarce resources and, in turn, escalating salaries.  It also suggests that “as a service” offerings will arise and flourish.

Maybe in a couple of years I’ll be talking about a charting of data scientists or the like.  For now, though, it’s a dearth.



I think I need to take that class John~