Here's a bold prediction for you: machine learning is NOT going to take over computer science jobs, but computer science will automate machine learning jobs.
Well, maybe after I explain what I mean it won't seem so (figuratively) bold.
You see, most of what we call applied machine learning today is actually a relatively unglamorous meta-optimization problem. We're trying to explore the space of feature representations, sampling strategies, hyperparameters, model types, and model configurations to get the best performance on held-out data.
In practice, this process can best be described as guesstimation: you try one combination of all these different variables, you see how the model does, then you think "well, the model did poorly on X performance metric, so let's try changing variable Y". And this process basically continues in a loop until you're satisfied with the performance of your model.
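To make that loop concrete, here is a minimal sketch of it written as a random search. Everything in it is an assumption made for illustration: the search space, the option names, and especially the `evaluate` function, which is a made-up stand-in for "train a model and score it" so that the example runs without any machine learning library.

```python
import random

# Hypothetical search space; the names and values are illustrative only.
SEARCH_SPACE = {
    "model_type": ["linear", "tree", "neural_net"],
    "learning_rate": [0.001, 0.01, 0.1],
    "num_features": [10, 50, 100],
}


def evaluate(config):
    """Stand-in for training a model and scoring it on held-out data.

    The formula is invented so the sketch is runnable; a real version
    would fit a model with these settings and return its metric.
    """
    score = {"linear": 0.70, "tree": 0.75, "neural_net": 0.80}[config["model_type"]]
    score -= abs(config["learning_rate"] - 0.01)  # penalize rates far from 0.01
    score += config["num_features"] / 1000        # small bonus for more features
    return score


def random_search(space, score_fn, n_trials=20, seed=0):
    """Try random combinations and keep the best one seen so far."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # "Try one combination of all these different variables..."
        config = {name: rng.choice(values) for name, values in space.items()}
        # "...see how the model does..."
        score = score_fn(config)
        # "...and continue in a loop", remembering the best result.
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = random_search(SEARCH_SPACE, evaluate)
print(best_config, best_score)
```

A human practitioner picks the next combination by intuition rather than at random, but the shape of the loop is the same, and smarter strategies for choosing the next trial can be dropped in without changing it.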
In some ways, the process is so well defined that it practically begs to be automated. And already we've seen a lot of progress on this front through tools like AutoML that allow people with little to no machine learning expertise to build complex machine learning models. So, already in the span of a few years we've made significant progress in "democratizing" or "automating" the machine learning process, and yet in decades and decades of effort we've done little to move the needle on automating software development. Hmm...
Now, this isn't to say that there aren't significant challenges in solving a real-world problem with machine learning, but in large part those challenges are orthogonal to the actual machine learning modeling process I described above. The hardest things about machine learning in industry are (1) figuring out what the right data is to solve the problem, and (2) figuring out how to integrate a working model with a production system. Both of those require domain expertise, and the solutions are specific to the individual problem being solved. In other words, there's no easy path towards automation. Instead, they require talented data scientists (for the former) and talented software engineers (for the latter) to solve.
So despite whatever Mark Cuban or anyone else is saying these days, software engineering and computer science are here to stay. However, don't be surprised if knowing how to code an LSTM in TensorFlow isn't as hot a skill in a few years.