Highly In-Demand Programming Frameworks & Tools for Machine Learning Engineers

By Kimberly Cook | Feb 22, 2019

If you're wondering which of the growing suite of programming language libraries and tools are a good choice for implementing machine-learning models, help is at hand.

More than 1,300 people, working mainly in the tech, finance, and healthcare sectors, revealed in a new O'Reilly survey which machine-learning technologies they use at their firms.

The list is a mix of software frameworks and libraries for data science favorite Python, big data platforms, and cloud-based services that handle each stage of the machine-learning pipeline.

Most firms are still at the evaluation stage when it comes to using machine learning, or AI as the report refers to it, and the most common tools being implemented were those for 'model visualization' and 'automated model search and hyperparameter tuning'.

Unsurprisingly, the most common form of ML in use is supervised learning, where a machine-learning model is trained using large amounts of labeled data. For instance, a computer-vision model tasked with spotting people in video might be trained on images annotated to indicate whether they contain a person.
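To make the idea of learning from labeled data concrete, here is a minimal sketch of supervised learning in plain Python. It is not taken from the survey or from any of the libraries listed below: the two-number feature vectors, the "person" and "no_person" labels, and the nearest-centroid method are all assumptions chosen for illustration. A real computer-vision model would learn from image pixels rather than hand-made points.

```python
from collections import defaultdict
from math import dist

# Hypothetical labeled training data: (features, label) pairs.
# The features are made-up 2-D points, not real image data.
TRAINING_DATA = [
    ((1.0, 1.2), "no_person"),
    ((0.8, 1.0), "no_person"),
    ((1.2, 0.9), "no_person"),
    ((4.0, 4.2), "person"),
    ((4.1, 3.8), "person"),
    ((3.9, 4.0), "person"),
]


def train(examples):
    """Compute one centroid (mean feature vector) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    centroids = {}
    for label, points in grouped.items():
        n = len(points)
        centroids[label] = tuple(sum(axis) / n for axis in zip(*points))
    return centroids


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


model = train(TRAINING_DATA)
print(predict(model, (3.7, 4.1)))  # person
print(predict(model, (1.1, 1.1)))  # no_person
```

The labels do all the work here: the model is nothing more than a summary of which feature values went with which annotation, which is why supervised learning depends so heavily on well-annotated training data.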

Here are the libraries, frameworks, big data platforms, and cloud services that businesses say they're using for machine learning.

Software libraries and frameworks
TensorFlow
Google's widely used machine-learning framework, designed to handle the numerical computation demanded when training machine learning models and able to split calculations between CPUs, GPUs and specialized chips such as Google's Tensor Processing Units (TPUs).

scikit-learn
A popular Python library for data mining and data analysis that implements a wide range of machine-learning algorithms.

PyTorch
An open-source, deep learning framework that has a reputation for being easier to learn than some competing frameworks like TensorFlow and that is designed to be used at each stage of the machine-learning pipeline.

Keras
A deep-learning framework for working with neural networks, the brain-inspired mathematical models that underpin deep learning, that is designed to be simpler for people to work with than competing frameworks.

Written in Python, it is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit (CNTK), and the Python library Theano.

Cloud suites
Microsoft Azure ML Studio
This suite of services is designed to help firms build, train, and deploy machine-learning models, both on Microsoft's Azure cloud and on computing devices close to the edge of the network. Its tools help automate the process of identifying and tuning an appropriate machine-learning model, and help scale the underlying compute resources to match demand.

Google Cloud ML Engine
Similar to Azure ML Studio, Google Cloud ML Engine also provides tools for training, evaluating, tuning, and deploying machine-learning models.

Amazon SageMaker
Amazon SageMaker similarly offers services for building, training, and deploying machine-learning models, with a view to making it possible to get models to production more rapidly and at a lower cost.

Big data platform tools
H2O
An open-source, in-memory platform that can scale machine-learning workloads across distributed systems.

The platform is designed to support the most widely used statistical and machine-learning algorithms and also offers a degree of automation to help data scientists identify and tune appropriate machine-learning models.
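As a rough illustration of what automated model tuning involves, the sketch below runs a small grid search in plain Python. It does not use H2O's API or that of any other product named in this article. The one-dimensional dataset, the k-nearest-neighbors classifier, and the candidate values of k are all hypothetical, picked only to show the loop of "try each setting, score it on held-out data, keep the best".

```python
from collections import Counter

# Hypothetical 1-D dataset of (feature, label) pairs.
# The training point at 2.3 is deliberate label noise.
TRAIN = [(1.0, 0), (1.5, 0), (2.0, 0), (2.3, 1), (5.0, 1), (5.5, 1), (6.0, 1)]
VALIDATION = [(1.2, 0), (2.1, 0), (2.4, 0), (5.2, 1), (5.8, 1)]


def knn_predict(train, x, k):
    """Predict by majority vote among the k training points nearest to x."""
    neighbors = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, validation, k):
    """Fraction of validation points that are classified correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in validation)
    return correct / len(validation)


def grid_search(train, validation, candidates):
    """Score every candidate k and return the first best one with all scores."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


best_k, scores = grid_search(TRAIN, VALIDATION, [1, 3, 5])
print(scores)  # {1: 0.8, 3: 1.0, 5: 1.0}
print(best_k)  # 3
```

With k=1 the noisy training point misleads the classifier on one validation example, while larger values of k outvote it. Automated tuning tools apply the same idea across far larger search spaces and many model types.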

Prodigy
Designed to streamline the process of training and evaluating machine-learning models, Prodigy is a tool for helping data scientists annotate training datasets appropriately.

Spark NLP
Spark NLP provides a Natural Language Processing (NLP) library designed to work with distributed systems running the in-memory, big-data platform Apache Spark.

OpenAI Gym
Described as a toolkit for developing and comparing algorithms for reinforcement learning, a type of machine learning in which software agents learn how to perform tasks by being rewarded for actions that result in the desired outcome.
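The reward-driven learning described above can be sketched without any toolkit at all. The example below does not use OpenAI Gym. It defines a hypothetical five-cell corridor environment and applies tabular Q-learning, one common reinforcement-learning algorithm, with assumed values for the learning rate and discount factor. For simplicity the agent explores by picking actions uniformly at random, which Q-learning tolerates because it is an off-policy method.

```python
import random

N_STATES = 5          # corridor cells 0..4; the agent starts in cell 0
GOAL = N_STATES - 1   # reaching the last cell ends the episode
ACTIONS = (-1, +1)    # move left or move right
ALPHA = 0.5           # learning rate (assumed value)
GAMMA = 0.9           # discount factor (assumed value)


def step(state, action):
    """Hypothetical environment: move, stay inside the corridor, reward 1.0 at the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    """Learn a table of action values (Q-values) from random exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(ACTIONS)  # explore uniformly at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            target = reward + GAMMA * best_next
            q[(state, action)] += ALPHA * (target - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """For each non-goal cell, pick the action with the highest learned value."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]


q_table = train()
print(greedy_policy(q_table))  # [1, 1, 1, 1], i.e. always move right
```

The agent is never told that moving right is correct. It only sees a reward on reaching the last cell, and the value of that reward propagates backward through the table until moving right looks best in every cell.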

Analytics Zoo
Analytics Zoo brings together a series of big data and machine-learning technologies into what it describes as a unified analytics and AI platform.

The platform integrates Spark, TensorFlow, Keras, and the deep learning library BigDL, and can scale machine-learning models across distributed Hadoop and Spark clusters for training and inference.

AllenNLP
Designed to simplify the process of designing and evaluating new deep-learning models for Natural Language Processing problems.

The library includes reference implementations of high-quality models for both core NLP problems and NLP applications.

RISELab Ray
A framework for running machine learning models across distributed systems, offering both high performance and fault tolerance, while still being scalable.

Source: HOB