Career spotlight: Data scientist


As a formal field of study, data science barely existed twenty years ago. Although working with large sets of information – whether statistics about an upcoming election, clinical trial results for a prospective medication or customer contact lists from a Microsoft Excel sheet – is nothing new, the idea of grouping all of its relevant practices under the umbrella of data science is relatively recent:

  • The term entered the mainstream in the mid-1990s, when the International Federation of Classification Societies used it in the title of its 1996 conference in Kobe, Japan. Around that time and in the years that followed, several journals devoted to data science, including the Journal of Data Science and Data Mining and Knowledge Discovery, commenced publication.
  • In its early days, data science as a discipline was closely associated with statistics, such that one professor in Michigan even suggested renaming statistics “data science” in 1997. This association has persisted, with Google economist Hal Varian telling McKinsey in 2009, on the subject of data science, that “statistician” would be one of the hottest job titles of the coming decade.
  • Amazon Web Services, the world’s largest Infrastructure-as-a-Service platform, launched in 2006, kickstarting a revolution in how organizations access IT resources. The ability to easily consume computing power, storage capacity and network services on demand, over the internet, enabled data scientists to accumulate, sort and analyze information at a much greater scale than ever before.

Looking at this short history, we can broadly define data science as the intersection of statistics and computer science. More specifically, data scientists combine the analytical skills of statisticians with the technical capabilities of computer programmers to extract insights from the enormous datasets of the cloud computing age. Their diverse backgrounds have also put them in high demand among organizations of all kinds.

A day in the life of the data scientist: 4 key skills for the role

Data scientist responsibilities vary greatly from company to company. However, at a basic level, everyone holding the position will be expected to be proficient in data analysis as well as various technical tools, including everyday applications like Excel, programming languages such as Python and R, and big data frameworks in the mold of Apache Hadoop.

We can sort these competencies into four major buckets, which together serve as the magic ingredients of data science success:

1. Mathematics

On the surface, a strong command of statistics, linear algebra and multivariable calculus may seem redundant for the modern data scientist, since there are so many platforms for automating mathematical operations. But these skills are central to many data science jobs, in which scientists are tasked with creating custom software implementations and statistical models for their organizations – especially ones that are not “data companies” per se and are building their analytics infrastructure for the first time.

2. Machine learning

This concept refers to the ability of computer systems to improve at a task by learning from data, without being expressly programmed to do so. For example, an algorithm might correlate different pieces of data to conclude that a photo includes a horse, and then use what it has learned to identify horses in other pictures in the future. Cloud-based machine learning is expected to become a $3.75 billion market by 2021, according to MarketsandMarkets. Data scientists play pivotal roles in setting up, supervising and modifying the machine learning programs that are the backbones of many data analytics initiatives.
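
To make the horse example concrete, here is a minimal sketch of learning from labeled examples, written in Python. It is a toy nearest-centroid classifier, not how production image recognition works: the two numeric features, their values and the function names `train` and `predict` are all invented for illustration.

```python
# Toy illustration: learn from labeled examples, then classify a new one.
# The feature values are invented for this sketch; a real image classifier
# would work from pixel data and a far richer model.
from math import dist


def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    sums, counts = {}, {}
    for features, label in examples:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in totals] for label, totals in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid lies closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical features: (leg-to-body length ratio, ear-to-head length ratio)
training_data = [
    ((0.90, 0.20), "horse"),
    ((0.95, 0.25), "horse"),
    ((0.85, 0.22), "horse"),
    ((0.40, 0.60), "rabbit"),
    ((0.35, 0.70), "rabbit"),
    ((0.45, 0.65), "rabbit"),
]

centroids = train(training_data)
prediction = predict(centroids, (0.88, 0.24))
print(prediction)  # prints "horse"
```

Nothing in the code states what a horse looks like. The rule for telling the two animals apart comes entirely from the labeled examples, which is the sense in which the program is not expressly programmed.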

3. Programming

Machine learning, like most of the other pillars of the field, is further enhanced by programming skills. Knowing how to manipulate data via Scala, SQL, Java and the other languages and tools mentioned so far allows data scientists to scale their analyses. The 2016 O’Reilly Data Science Salary Survey revealed that 90 percent of respondents spent at least some time coding, while 80 percent used at least one of the Python, R and Java trio. Python, JavaScript and Excel were the most heavily utilized tools among these data scientists.
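
As a rough illustration of what manipulating data in code looks like, the sketch below runs a SQL aggregation from Python using the standard library's `sqlite3` module. The `orders` table, its columns and its rows are hypothetical, made up purely for this example.

```python
import sqlite3

# In-memory database with an invented "orders" table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 14.5), ("carol", 50.0), ("bob", 4.5)],
)

# Count the orders and total the spend for each customer in a single query.
rows = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()

for customer, order_count, total in rows:
    print(f"{customer}: {order_count} orders, total {total:.2f}")
```

The query text does not depend on how many rows the table holds, which is one reason a coded analysis tends to scale better than working through a spreadsheet by hand.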

4. Communications

Data scientists usually do not analyze data for its own sake. Instead, they might do so to produce information that helps a business see how a product is used and how it stacks up against competitors, or to clean up all the messy databases in a company’s legacy IT system. Such tasks – which, as forms of data preparation, constitute 79 percent of the work data scientists do, according to a CrowdFlower survey – require tight communication with project managers, C-level executives, other IT personnel and line-of-business end users.

The future of the data science profession

As skilled professionals in a rapidly evolving industry, data scientists are widely sought after by employers. The U.S. Bureau of Labor Statistics has projected that the broader category of computer and information research scientists would see 11 percent growth, which is faster than average, from 2014 to 2024, with the addition of more than 25,000 positions in that timespan. Workers in such positions earned a 2016 median salary of $111,840, which was well ahead of the U.S. national median at that time.

According to RJMetrics, more than half (52 percent) of data scientists in 2015 had earned that job title within the previous four years, indicating how the profession is still being defined. For context, there were virtually no data scientists as recently as 1995, per its research. McKinsey expects demand for data science talent to easily outpace supply in 2018, predicting a shortage of up to 190,000 employees with analytical expertise and 1.5 million managers and analysts with the confidence to make decisions based on big data reports.

Graduate education can help current and prospective data scientists deepen their knowledge, open doors to new jobs and ultimately increase their earning potential. A master’s degree in data science from the University of California, Riverside will prepare you for a rewarding career through a mix of advanced technical and management skills. Get even more information on the data science program specialization overview page.

Recommended readings:

https://engineeringonline.ucr.edu/blog/most-exciting-career-of-the-21st-century-the-data-scientist-shortage/

https://engineeringonline.ucr.edu/blog/what-is-the-difference-between-a-data-scientist-and-data-engineer/

Sources:

https://blog.rjmetrics.com/2015/10/05/how-many-data-scientists-are-there/
https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#217936576f63
http://www.oreilly.com/data/free/files/2016-data-science-salary-survey.pdf
https://www.bls.gov/ooh/computer-and-information-technology/computer-and-information-research-scientists.htm
http://blog.udacity.com/2014/11/data-science-job-skills.html
https://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/#505a76f255cf
http://www.springer.com/computer/database+management+%26+information+retrieval/journal/10618
http://www.marketsandmarkets.com/Market-Reports/machine-learning-as-a-service-market-183667795.html
https://www.toptal.com/machine-learning/machine-learning-theory-an-introductory-primer
https://www.informationweek.com/devops/programming-languages/10-programming-languages-and-tools-data-scientists-use-now/d/d-id/1326034?image_number=4
https://www.thinkful.com/blog/4-skills-you-need-to-become-a-data-scientist/