Blacklist Rumors Debunked, Data Scientist Salaries [DiceTV]


Do staffing firms ever blacklist IT Consultants? …Are there $140,000 salaries out there to be had by data scientists? …And are you aware that we’ve added a new Dice Talent Community? I’m Cat Miller and this is the DiceTV Update for Tuesday, June 12, 2012.

15 Responses to “Blacklist Rumors Debunked, Data Scientist Salaries [DiceTV]”

      • Thanks, but does your website collect all relevant information to its purpose conveniently in one place, or does it direct users to google everything?
        I think this would be something beneficial to post here.

        Also, those terms are somewhat soft, representing ideas. The results for such a search do not give a clear indication of the trusted industry standard. The first 5 Google results point to 2 different certifications. While Dice is trying to attract readers by using its expertise to recommend an up and coming trend, I am trying to draw further on Dice’s expertise to recommend a particular certification best suited to that trend. That can be better obtained from someone in the field and not as easily from Google. If you had your way, Dice would cease to recommend even up and coming trending ideas so we could instead just google for them.

        But thank you for being snide. If it weren’t for you, the internet wouldn’t be what it is.

    • Hi Antarr,

      If your undergraduate degree is in a technical field such as Math, Comp Sci, I.T., Statistics etc. and you want to pursue an MBA in finance or something similar, then I would say yes.

      • Antarr

        I am currently on track to start in the graduate MIS program at my university. This program is offered by the Business School. My undergraduate degree is a Bachelor of Science in Computer Science. I currently work as a developer.

    • Hi Frances,

      Take out “too old” and replace it with “experienced”. Success has more to do with your attitude, thinking and mindset than with your age. Ray Kroc, founder of McDonald’s, who revolutionized American business, didn’t get his break until his mid 50s. Col. Sanders of KFC met success in his 60s. Start thinking in terms of how you can create value. In the “Big Data” field, people who “KNOW” what the data means are extremely valuable, and that knowledge only comes through years of experience; no college, university or degree can teach you that. If you are comfortable with computers and have basic skills (I am just making assumptions here), then take a course in Excel and become very good with it. It is used heavily in the business world and can open that proverbial door for you. Getting started and taking that first step forward is the key…

      • That’s a nice response – but the reality is that many of the HR gatekeepers these days are blatantly asking what year you graduated from high school – after they ask what year you received your college degree. Clearly, they are using this as part of their weeding out process before passing resumes on to the hiring manager. This is serious, in your face, age discrimination. These gatekeepers, in my opinion, are clearly missing out on some of the best candidates – not to mention they are crossing the line.

  1. Hi Jason and Antarr,

    I work in the so-called “Big Data” field, so I will try to shed some light on the subject concisely, which is not an easy task. Here’s how it works. Massive amounts of data from various source systems, such as OLTP (Online Transaction Processing) systems, databases, spreadsheets, the Web and other OLAP (Online Analytical Processing) systems, first have to be extracted into specialized data warehousing systems, where the data goes through a transformation. The data extracted from all these different sources could be structured or unstructured, or in different formats and data types, so it has to be harmonized so that there are no discrepancies. Then the data is categorized, and for that purpose data models are created to store and house it. These data models are represented as data structures, generally called data stores, cubes, sets etc. The data stored in these structures is accessed by software tools that allow a developer to build it into applications, KPIs, dashboards etc. Those are in turn used by the business analyst, who uses software tools to slice and dice, data mine and apply other techniques to find trends and do historical analysis, predictive analysis, statistics and so on.

    This is an extremely brief overview, but it should give you an idea that from data generation through extraction to consumption there are many steps and roles involved. Very broadly speaking, these roles are categorized as backend and frontend roles. Backend roles are more technical in nature: knowledge of programming, databases, networking etc. is essential. As the data moves downstream, the skills lean more towards the business side of things and less towards programming. But not necessarily: it all depends on what your business does with the data. Once you get the data you might be involved in data mining, where you will have to be comfortable with maths, statistics, SQL, SAS etc.

    A lot depends on your own interests and background. In my opinion, a combination of programming and accounting would serve you well. In other words, a combination of business and technical skills is ideal. If you are just starting out, I would get the technical skills first through a degree, build that as my platform, then pick up courses in accounting and finance, and then learn some specialized software. Unfortunately, traditional colleges and universities are lagging behind in this education, so you might have to look elsewhere. Surprisingly, community colleges are offering more courses that are practical and applicable to work. It’s probably because they have less red tape and many courses are taught by people who actually work in the field and know what is current and relevant.

    Hope this is helpful. I wish you the best of luck and hope you achieve success in your chosen


    • Adding to ATI. What is being described by ATI is data warehousing and business intelligence. Maybe even sales operations and finance. Each of these has its own degrees of specialization and good purpose, but doesn’t represent data science well. The video also simplifies the model a lot.

      ATI is correct about a centralized database with unified definitions and cleansing of data. These databases, called data warehouses, are typically used for reporting and, in the case of scalable data warehouses called data warehouse appliances, for advanced analytics. The difference is in how the data is stored (column-wise or row-wise; column-wise storage improves data search and compression). The data in the warehouse typically comes from applications like websites, customer relationship management software, marketing or channel management (embedded website code plus posts to Web forms) etc. Sometimes intermediate storage sources are encountered. People doing this job are termed database developers or data warehouse developers. Data warehousing tends to be logical, driven by tests, and requires a fair amount of business savvy. My IT trainer described it as 60% MBA, 40% IT. The big focus for this role is setting up software, writing database-specific code, optimizing the results returned from the database and designing processes that require a lot of forethought, as you have to think about keeping the data consistent, meeting business rules and embedding your work into a larger network of such code. You have to breathe SQL and interface with (talk to) the business intelligence team. Modern warehousing can involve more exotic technology like Hadoop or Redshift, but this depends on the business’s needs (how much data it generates and whether it serves a data-intensive function). This job pays 60-120k, with newbies close to 60k and architects often over the 125k mark (note that roughly 120k is the average ceiling for experienced IT people, meaning the mean of the maximum across most professions, which often represents mastery; this is based on Glassdoor averages).
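      To make the “unified definitions and cleansing” idea concrete, here is a minimal sketch in plain Python. The two source systems, their field names and the sample rows are all made up for illustration; a real warehouse load would use a database and proper ETL tooling rather than lists of dicts.

```python
from datetime import datetime

# Hypothetical rows from two source systems with inconsistent formats.
crm_rows = [
    {"customer": " Acme Corp ", "amount": "1,200.50", "date": "06/12/2012"},
    {"customer": "globex", "amount": "300", "date": "06/13/2012"},
]
web_rows = [
    {"cust_name": "ACME CORP", "total": 99.5, "ts": "2012-06-12"},
]


def harmonize(name, amount, date_text, date_format):
    """Map one source row onto a single, unified row definition."""
    return {
        # Collapse stray whitespace and normalize capitalization.
        "customer": " ".join(name.split()).title(),
        # Accept numbers or strings with thousands separators.
        "amount": float(str(amount).replace(",", "")),
        # Store every date as ISO 8601 text.
        "date": datetime.strptime(date_text, date_format).date().isoformat(),
    }


def extract_and_transform():
    """Extract from both sources and return harmonized rows."""
    rows = [
        harmonize(r["customer"], r["amount"], r["date"], "%m/%d/%Y")
        for r in crm_rows
    ]
    rows += [
        harmonize(r["cust_name"], r["total"], r["ts"], "%Y-%m-%d")
        for r in web_rows
    ]
    return rows


warehouse = extract_and_transform()
for row in warehouse:
    print(row)
```

      After harmonizing, “ Acme Corp ” from one system and “ACME CORP” from the other end up as the same customer, which is the kind of discrepancy the warehouse is supposed to resolve before anyone reports on the data.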

      Business Intelligence focuses on past data and on defining metrics to measure the business. Typically, the business intelligence unit uses something called a business intelligence platform. Some examples are Tableau (focused on dashboards), GoodData and Cognos (focused on reports, schedules and standardized reports). BI typically has BI admins/system architects, developers focused on reports and metadata, and business analysts who focus on understanding the business need. BI is considered an application tier, so it manipulates data to show to an audience. BI tends to be closely related to data warehousing in that it represents the presentation of all the different aspects of a firm (finance, operations, HR). This is of course an ideal. One major function of BI and DW is to increase the dimensionality and accessibility of data, to create performance metrics, to resolve conflicts in the terminology of data and to prevent ambiguity caused by dirty data (fuzzy matches). See data governance. BI roles tend to be more specialized because there are many vendors. Most firms use multiple tools, and often analytic (erm…maybe) and reporting tools are included in other systems (like CRM, ERP etc.). The focus of roles here tends to be on presenting data, with the admin/system people dealing more with configuration or occasional development. Depending on the tool, the more technical people might mix a BI role with data warehousing. The Business Analyst tends to be less technical and more focused on documentation and on being the voice of the business. Salaries tend to be a little lower than in data warehousing, but the roles vary more, with beginning business intelligence analysts making around 50-80k and elite firms typically adding a 10-20k premium (tech-oriented people make data-warehousing-like salaries minus 10k, since BI is a little less technical). People with strong business skills and good communication skills are desired here, with more senior roles tending to get more technical with increased experience. Developers are prized for their debugging skills and their ability to understand the SQL coming from the database.

      Data Analysts or X analysts (X is a specific department; “analyst” is more analytics and report focused) are data-oriented specialists focusing on a specific department’s needs. They are often called power users. They function similarly to the BI analyst role. Common tools are MS Excel, Access, SQL and department-specific applications like Anaplan, sales compensation software and marketing campaign software. The role is typically more driven by the goals of the department and often deals a lot with the BI team to source data. Salaries are similar to those of BI analysts and developers, though with slightly different functions (I could get into the details here, but this is a long response already and they aren’t my specialty).

      Cubes are a special case of pre-aggregated database queries with the ability to slice in an Excel add-in or BI tool. Excel pivot tables or Power Pivot are typical.
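      A rough sketch of what “pre-aggregated with the ability to slice” means, in plain Python. The sales rows and the `build_cube` / `slice_cube` helpers are hypothetical, and real cube engines are far more sophisticated; the point is only that totals for every combination of dimensions are computed once, so a slice becomes a lookup instead of a scan.

```python
from collections import defaultdict

# Hypothetical fact rows: (month, region, product, amount).
sales = [
    ("2012-05", "East", "Widgets", 100.0),
    ("2012-05", "West", "Widgets", 50.0),
    ("2012-06", "East", "Gadgets", 75.0),
    ("2012-06", "East", "Widgets", 25.0),
]


def build_cube(rows):
    """Pre-aggregate totals for every combination of dimension values.

    None in a key position stands for "all values" of that dimension.
    """
    cube = defaultdict(float)
    for month, region, product, amount in rows:
        for m in (month, None):
            for r in (region, None):
                for p in (product, None):
                    cube[(m, r, p)] += amount
    return cube


def slice_cube(cube, month=None, region=None, product=None):
    """Answer a slice with a single lookup; unknown combinations give 0."""
    return cube.get((month, region, product), 0.0)


cube = build_cube(sales)
print(slice_cube(cube))                                # grand total
print(slice_cube(cube, region="East"))                 # one region, all months
print(slice_cube(cube, month="2012-06", region="East"))
```

      The trade-off is storage: every extra dimension doubles the number of keys written per row, which is why cubes are usually built over a handful of carefully chosen dimensions.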

      Big Data and Data Engineering. Big data just means data too big to be processed by a single computer, so the work has to be done on many computers simultaneously to get the results (called running in parallel). These systems come with specialized storage and often require the use of programming languages and a different way of thinking (what if a computer fails, isn’t responsive etc.). This is changing, as they’ve added extensions like Hive for Hadoop that translate the SQL used in databases into this kind of processing. Scala is currently getting a lot of attention here, as it’s a JVM-based functional language with strong typing and good libraries for parallel processing and machine learning. Most people in this field are computer programmers, data engineers (engineers are typically more software oriented) or back-end engineers (there is a lot of variety here). Some engineers specialize in machine learning etc. (so part engineer, part data science specialist). Other topics in this field include NoSQL systems, a catch-all name for a variety of different technologies like MongoDB, Cassandra, HBase and Neo4j. These systems use different data structures than traditional databases. MongoDB, for example, uses the document, with tags (I’m not a Mongo expert, but it is similar to XML and JSON with APIs). This is semi-structured data. Neo4j uses a graph architecture, with two objects being connected by some type of relationship (I custom-built something similar; it is very cool, as you can query relationships between objects). These fields tend to pay better than data warehousing, but typically require higher technological savvy, as you have to code, get how the systems work and still consult the business. It is usually a software engineering job (I find software engineering is the most technical field).
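      To illustrate the two data structures mentioned above, here is a small sketch using only plain Python dicts, lists and tuples. It does not use the MongoDB or Neo4j APIs; the record contents and the `related` / `coworkers` helpers are invented stand-ins for how a document and a graph organize data differently.

```python
# Document model: one self-describing, nested record (JSON-like).
customer_doc = {
    "name": "Acme Corp",
    "tags": ["enterprise", "west"],
    "orders": [
        {"sku": "W-1", "qty": 3},
        {"sku": "G-7", "qty": 1},
    ],
}

# Everything about the customer travels together, so no join is needed.
total_qty = sum(order["qty"] for order in customer_doc["orders"])

# Graph model: objects connected by typed relationships.
edges = [
    ("Alice", "WORKS_AT", "Acme Corp"),
    ("Bob", "WORKS_AT", "Acme Corp"),
    ("Alice", "KNOWS", "Bob"),
    ("Bob", "KNOWS", "Carol"),
]


def related(node, relationship):
    """Return the targets reachable from node over one relationship type."""
    return sorted(
        target
        for source, rel, target in edges
        if source == node and rel == relationship
    )


def coworkers(person):
    """Query a relationship between objects: who shares an employer?"""
    employers = related(person, "WORKS_AT")
    return sorted(
        {
            source
            for source, rel, target in edges
            if rel == "WORKS_AT" and target in employers and source != person
        }
    )


print(total_qty)
print(related("Alice", "KNOWS"))
print(coworkers("Alice"))
```

      In the document, the nesting carries the structure. In the graph, the relationships are first-class data, so a question like “who works with Alice?” is a query over edges rather than a join between tables.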

      Finally, Data Scientist! This role is part business analyst, part statistician and part engineer (though typically less than an actual engineer). Data scientists grab data, often from big data systems, data warehouses, APIs (typically URLs that return data, services etc.) and in rare cases websites (HTML etc.), and combine it in special tools that can manage millions or even billions of rows (depending on the big data tools available). These are typically specially designed computer languages with code, called libraries, developed by mathematicians, statisticians and other math-savvy individuals to look at data for trends and patterns. A typical data scientist will gather data to answer business questions (or consumer questions); clean the data to make sure it is easy to analyze or run through math models (darn things are temperamental); create visualizations and graphs (since viewing a million points at once is crazy and seeing multiple points by different attributes is hard); munge the data to view it from different angles (an example is cyclic data by month or week for a website, molding the data to check differences etc.); run statistics, distributions, properties and probabilities on the data; and model it (modeling is exhaustive by itself: models have trade-offs, interpretability vs. a more dynamic fit, time series, combining multiple models etc.). Then, once patterns and anomalies are found, action has to be drawn from the data and its uses derived, which can involve reintegrating it into software, making the data available to customer relationship management software (lead qualification and scoring) or changing business policies. Typically data science is best led by operations/finance, where impact can be quantified, but it is often led by engineering, where it integrates into products.

      The role is tough for several reasons. You answer to the business (often the revenue side). You deal a lot with data systems, as you are one of the end users (unlike the usual end user, you get data at a granular level). You often have to integrate with engineering and products (you’d better know some programming). You have to understand programming to access the libraries and equipment that perform data manipulation at scale (software automation, which is hard because you are automating high-level abstract math, with no guarantees). And you have to get math, stats, viewing numbers abstractly and mathematical modeling (simulation etc.), and know how to explain it all to the business in a substantive way. Based on this you can understand the rigor of the field and why it takes a ton of practice and training. The MBA with Computer Science only makes sense if you are data savvy and can really get deep into the numbers (understand when to apply certain models, stats methods and programming techniques). The above explains the 110-140k salary and why PhDs in physics, math and other fields (typically dealing with modeling or research) are common.
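      A toy version of the gather, clean, munge and model steps described above, using only the Python standard library. The page-view numbers are invented, and the hand-rolled least-squares line is a stand-in for the statistical libraries a working data scientist would actually use.

```python
from statistics import mean

# Hypothetical daily page views; None marks a missing reading.
raw = [
    ("Mon", 120), ("Tue", 135), ("Wed", None), ("Thu", 150),
    ("Fri", 165), ("Mon", 130), ("Tue", 145),
]

# Clean: drop rows that cannot be analyzed.
clean = [(day, views) for day, views in raw if views is not None]

# Munge: regroup by weekday to look at the cyclic pattern.
by_day = {}
for day, views in clean:
    by_day.setdefault(day, []).append(views)
day_means = {day: mean(values) for day, values in by_day.items()}


def fit_line(ys):
    """Model: ordinary least-squares line through (0, y0), (1, y1), ..."""
    xs = list(range(len(ys)))
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


slope, intercept = fit_line([views for _, views in clean])
print(day_means)
print(slope, intercept)
```

      Even in this tiny case the trade-offs show up: the weekday averages are easy to explain to the business, while the fitted line says something about the trend but ignores the weekly cycle entirely.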

      Why people say to avoid BI when seeking data science positions: BI is a support function. BI focuses on past data. BI tends not to be focused on exploration (self-service BI is still generally focused on the present and rarely uses statistics 😛 for all those detractors). BI typically deals with reporting numbers. BI typically isn’t rigorous by programming standards or technical enough (that said, I’ve seen former BI analysts turn data scientist). Good avenues into data science: computer programmers with a logical bent, statisticians who are programming savvy, researchers who are programming savvy and technical people who get data and put in the time to get the math.

      Best advice I can offer: learn programming, preferably something like Python, Scala or R (SAS works too). Learn stats and modeling. Understand the basics of databases. Get experience that focuses on business needs or on understanding business processes (an MBA will work here). Teach yourself to combine all of the above (read plenty of books, take online courses).

  2. KK’s Background:

    Data Warehouse Developer (including 2 MPPs)
    Solutions Architect (BI systems)
    Enterprise Information Management Consultant
    Business Intelligence Analyst
    Quality Control/Computer Programmer
    Independent Consultant (Data Modeling)

    Data Science:
    Studying Statistics Modeling and comp sci (graduate school)
    Know R, Python, SQL and VBA
    Going to Big Data/Data Science meet-ups and talks 2-3 years.
    Read about (4-5 books on subject).

    Haven’t been:
    Data Engineer
    Data Scientist

    I talk to two of my friends in data science regularly :). They keep trying to convert me into a data engineer.