Setting up your AI Dev Environment in 5 Minutes

By Kimberly Cook | Aug 14, 2018

Whether you're a novice data science enthusiast setting up TensorFlow for the first time, or a seasoned AI engineer working with terabytes of data, getting your libraries, packages, and frameworks installed is always a struggle.

While containerization tools like Docker have truly revolutionized reproducibility in software, they haven't quite caught on yet in the data science and AI communities, and for good reason! With constantly evolving machine learning frameworks and algorithms, it can be tough to find time to dedicate towards learning another developer tool, especially one that isn't directly linked to the model building process.

In this blog post, I'm going to show you how to use one simple Python package, datmo, to set up your environment for any of the popular data science and AI frameworks in just a few steps. Datmo leverages Docker under the hood and streamlines the process to help you get running quickly and easily, without the steep learning curve.

0. Prerequisites
1. Install datmo
Like any Python package, datmo can be installed from your terminal with pip:

$ pip install datmo
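
Because datmo relies on Docker under the hood, it can help to confirm that both command-line tools are visible on your PATH before moving on. The snippet below is a minimal sketch rather than part of datmo itself: the helper name find_tools is hypothetical, it assumes the Docker executable is named docker, and it assumes Python 3.3 or newer for shutil.which.

```python
import shutil


def find_tools(names=("docker", "datmo")):
    """Return a mapping of executable name -> full path, or None if not on PATH.

    The default names are assumptions: "docker" for the Docker CLI and
    "datmo" for the command used throughout this post.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        status = path if path else "NOT FOUND on PATH"
        print("{}: {}".format(name, status))
```

If either tool is reported as not found, install it (or fix your PATH) before running the remaining steps.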
2. Initialize a datmo project
In your terminal, cd to the folder you want to start building models in. Then, enter the following command:

$ datmo init
You'll then be asked for a name and description for your project -- feel free to name it whatever you'd like!

3. Start environment setup
After you enter a name and description, datmo will ask if you'd like to set up your environment -- type y and press enter.

4. Select System Drivers (CPU or GPU)
The CLI will then ask which system drivers you'd like for your environment. If you don't plan on using a GPU, choose cpu.

(1) gpu
(2) cpu
Please select one of the above environment type (e.g. 1 or gpu):

5. Select an environment
Next, choose one of the pre-packaged environments. Respond to the prompt with the number or ID of the environment you want to use.

(1) data-analytics : has libraries such as xgboost, lightgbm, sklearn etc.
(2) mxnet : has libraries for mxnet(v1.1.0) along with sklearn, opencv etc.
(3) caffe2 : has libraries for caffe2(v0.8.0) along with sklearn, opencv etc.
(4) keras-tensorflow : has libraries for keras(v2.1.6) and tensorflow(v1.9.0) along with sklearn, opencv etc.
(5) kaggle : has the environment provided by kaggle
(6) pytorch : has libraries for pytorch(v0.4.0) along with sklearn, opencv etc.
(7) python-base : has base python image with no libraries installed
(8) r-base : has base R image with no libraries installed. Use this environment for rstudio workspace

Please select one of the above environments (e.g. 1 or data-analytics):

6. Select a language version (if applicable)
Many of the environments above come in several variants, depending on the language and language version you plan to use.

For example, after selecting the keras-tensorflow environment, I'd be faced with the following prompt asking whether I want to use Python 2.7 or Python 3.5.

(1) py27
(2) py35
Please select one of the above environment language (e.g. py27):

7. Launch your workspace
Now that you've selected your environment, it's time to launch your workspace. Choose the workspace you'd like to use and enter its command in your terminal.

Jupyter Notebook -- $ datmo notebook
JupyterLab -- $ datmo jupyterlab
RStudio -- $ datmo rstudio (available in the r-base environment)
Terminal -- $ datmo terminal
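
If you would rather launch a workspace from a script than type the command by hand, the list above can be wrapped in a small helper. This is only a sketch: the names WORKSPACE_COMMANDS, workspace_command, and launch are hypothetical and not part of datmo, and the helper simply shells out to the four commands listed above.

```python
import subprocess

# Workspace names mapped to the CLI commands listed in this post.
WORKSPACE_COMMANDS = {
    "notebook": ["datmo", "notebook"],
    "jupyterlab": ["datmo", "jupyterlab"],
    "rstudio": ["datmo", "rstudio"],  # only in the r-base environment
    "terminal": ["datmo", "terminal"],
}


def workspace_command(workspace):
    """Return the argument list for a workspace, or raise ValueError."""
    if workspace not in WORKSPACE_COMMANDS:
        raise ValueError(
            "unknown workspace {!r}; expected one of {}".format(
                workspace, sorted(WORKSPACE_COMMANDS)
            )
        )
    # Return a copy so callers cannot mutate the shared table.
    return list(WORKSPACE_COMMANDS[workspace])


def launch(workspace):
    """Run the datmo command for the workspace and return its exit code."""
    return subprocess.call(workspace_command(workspace))
```

Calling launch("notebook") from inside your project folder would then behave like typing datmo notebook in the terminal.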

You're set! The first time you launch a workspace for a new environment, it will take a while because datmo needs to fetch all of the resources, but subsequent launches will be significantly faster.

Once your workspace launches, you're ready to start importing the packages and frameworks included in the environment you chose! For example, if you selected the keras-tensorflow environment, then import tensorflow will work out of the box in your Jupyter Notebook!
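
A quick way to confirm this is to try importing the libraries you expect and print their versions. The snippet below is a minimal sketch assuming the keras-tensorflow environment; check_imports is a hypothetical helper name, and sklearn and cv2 are the import names for scikit-learn and OpenCV. Adjust the module list to match the environment you selected.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and report its version or the import error."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            results[name] = "missing ({})".format(exc)
        else:
            # Not every package exposes __version__, so fall back gracefully.
            results[name] = getattr(module, "__version__", "version unknown")
    return results


if __name__ == "__main__":
    # Module list is an assumption based on the keras-tensorflow description.
    expected = ["tensorflow", "keras", "sklearn", "cv2"]
    for name, status in check_imports(expected).items():
        print("{}: {}".format(name, status))
```

Run it in a notebook cell or from the workspace terminal; any line that starts with "missing" points to a library that is not available in the current environment.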

If you're using TensorFlow, you can try this example from our docs for running your first TensorFlow graph.

If you'd like to help contribute, report issues, or request features, you can find us on GitHub here!

Bio: Nick Walsh is a developer evangelist and software engineer at Datmo, building developer tools to help make data scientists more efficient. He also mentors at student hackathons across the country as a coach for Major League Hacking.

The article was originally published here

Source: HOB