Frustrations of the data scientist!
Yes, I am a data scientist and yes, you did read the title correctly, but someone had to say it. We read so many stories about data science being the sexiest job of the 21st century and the attractive sums of money that you can make as a data scientist that it can seem like the absolute dream job. Factor in that the field contains an abundance of highly skilled people geeking out to solve complex problems (yes, it's a positive thing to "geek out"), and there is everything to love about the job.
But the truth is that data scientists typically "spend 1-2 hours a week looking for a new job", as stated in this article by the Financial Times. Furthermore, the article also states that "Machine learning specialists topped its list of developers who said they were looking for a new job, at 14.3 per cent. Data scientists were a close second, at 13.2 per cent." These data were collected by Stack Overflow in its survey of 64,000 developers.
I too have been in that position and have recently switched data science jobs myself.
So why are so many data scientists looking for new jobs?
Before I answer that question I should clarify that I am still a data scientist. On the whole, I love the job and I don't want to discourage others from aspiring to be data scientists because it can be fun, stimulating and rewarding. The aim of this article is to play devil's advocate and expose some of the negative aspects of the job.
From my perspective, here are 4 big reasons why I think many data scientists are dissatisfied with their jobs.
1. Expectation does not match reality
Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it... - Dan Ariely
This quote is so apt. Many junior data scientists I know (myself included) wanted to get into data science because it was all about solving complex problems with cool new machine learning algorithms that make a huge impact on a business. This was a chance to feel like the work we were doing was more important than anything we'd done before. However, this is often not the case.
In my opinion, the fact that expectation does not match reality is the ultimate reason why many data scientists leave. There are many reasons for this and I can't possibly come up with an exhaustive list but this post is essentially a list of some of the reasons that I encountered.
Every company is different so I can't speak for them all, but many companies hire data scientists without a suitable infrastructure in place to start getting value out of AI. This contributes to the cold start problem in AI. Couple this with the fact that these companies fail to hire senior/experienced data practitioners before hiring juniors, and you've got a recipe for a disillusioned and unhappy relationship for both parties. The data scientist likely came in to write smart machine learning algorithms to drive insight but can't do this because their first job is to sort out the data infrastructure and/or create analytic reports. In contrast, the company only wanted a chart that they could present in their board meeting each day. The company then gets frustrated because it doesn't see value being driven quickly enough, and all of this leads to the data scientist being unhappy in their role.
It's important to evaluate how well our aspirations align with the critical path of the environment we are in. Find projects, teams, and companies whose critical path best aligns with yours.
This highlights the two-way relationship between the employer and the data scientist. If the company isn't in the right place, or its goals aren't aligned with those of the data scientist, then it'll only be a matter of time before the data scientist finds something else.
For those who are interested, Samson Hu has a fantastic series on how the analytics team was built at Wish, which I also found very insightful.
Another reason that data scientists are disillusioned is similar to the reason I was disillusioned with academia: I believed that I would be able to make a huge impact on people everywhere, not just within the company. In reality, if the company's core business is not machine learning (my previous employer is a media publishing company), it's likely that the data science that you do is only going to provide small incremental gains. These can add up to something very significant, or you may be lucky enough to stumble on a gold mine project, but this is less common.
2. Politics reigns supreme
When I was waking up at 6 AM to study Support Vector Machines I thought: "This is really tough! But, hey, at least I will become very valuable for my future employer!". If I could get the DeLorean, I would go back in time and call "Bulls**t!" on myself.
If you seriously think that knowing lots of machine learning algorithms will make you the most valuable data scientist then go back to my first point above: expectation does not match reality.
The truth is the people in the business with the most clout need to have a good perception of you. That may mean that you have to constantly do ad hoc work such as getting numbers from a database to give to the right people at the right time, doing simple projects just so that the right people have the right perception of you. I had to do this a lot in my previous place. As frustrating as it can feel, it was a necessary part of the job.
3. You're the go-to person about anything data
Following on from doing anything to please the right people, those very same people with all of the clout often don't understand what is meant by "data scientist". This means that you'll be the analytics expert as well as the go-to reporting guy and let's not forget that you'll be the database expert too.
It isn't just non-technical executives that make too many assumptions about your skills. Other colleagues in technology assume you know everything data related. You know your way around Spark, Hadoop, Hive, Pig, SQL, Neo4J, MySQL, Python, R, Scala, Tensorflow, A/B Testing, NLP, anything machine learning, and anything else data related that you can think of. (BTW, if you see a job specification with all of these written on it, stay well clear. It reeks of a job spec from a company that has no idea what its data strategy is and will hire anyone because it thinks that hiring any data person will fix all of its data problems.)
But it doesn't stop there. Because you know all of this and you obviously have access to ALL of the data, you are expected to have the answers to ALL of the questions by... well, it should've landed in the relevant person's inbox 5 minutes ago.
Trying to tell everyone what you actually know and have control of can be hard. Not because anyone will actually think any less of you, but because as a junior data scientist with little industry experience you'll worry that people will think less of you. This can be quite a difficult situation.
4. Working in an isolated team
When we see successful data products we often see expertly designed user interfaces with intelligent capabilities and most importantly, a useful output which, at the very least, is perceived by the users to solve a pertinent problem. Now if a data scientist spends their time only learning how to write and execute machine learning algorithms, then they can only be a small (albeit necessary) part of a team that leads to the success of a project that produces a valuable product. This means that data science teams that work in isolation will struggle to provide value!
Despite this, many companies still have data science teams that come up with their own projects and write code to try to solve a problem. In some cases this can suffice. For example, if all that's needed is a static spreadsheet that is produced once a quarter then it can provide some value. On the other hand, if the goal is to provide intelligent suggestions in a bespoke website building product then this will involve many different skills which shouldn't be expected of the vast majority of data scientists (only the true data science unicorn can solve this one). So if the project is taken on by an isolated data science team it is most likely to fail (or take a very long time, because organizing isolated teams to work on collaborative projects in large enterprises is not easy).
So to be an effective data scientist in industry it doesn't suffice just to do well in Kaggle competitions and complete some online courses. It (un)fortunately (depending on which way you look at it) involves understanding how hierarchies and politics work in business. Finding a company that is aligned with your critical path should be a key goal when searching for a data science job that will satisfy your needs. However, you may still need to readjust your expectations of a data science role.
If anyone has any additional comments, questions or objections, please feel free to comment because constructive discussion is necessary to help aspiring data scientists make well-informed decisions about their career path.
I hope I haven't put you off the job.
Thank you for reading :)