Successful DevOps practices generate a huge amount of data, so it is unsurprising that this data can be put to work streamlining workflows, monitoring applications in production, and diagnosing bugs.
The problem is too much data. Server logs alone can take up several hundred megabytes a week. If the group is using a monitoring tool, megabytes or even gigabytes more data can be generated in a short period of time. And too much data has a predictable result: Teams don't look directly at the data, but rather set thresholds whereby a particular level of activity is believed to be problematic. In other words, even mature DevOps teams are looking for exceptions rather than diving deeply into the data they've collected. That shouldn't be a surprise. Even with modern analytic tools, you have to know what you're looking for before you can start to make sense of the data.
Here are five ways to improve your DevOps practices by leveraging machine learning:
1. Stop looking at thresholds and start analyzing your data
Because there is so much data, DevOps teams rarely view and analyze the entire data set. Instead, they set thresholds, such as "X measures above a defined watermark," as a condition for action.
In effect, they are throwing out the vast majority of data they collect and focusing on outliers. The problem with that approach is that outliers may alert, but they don't inform. Machine learning applications can do more. You can train them on all of the data, and once in production, those applications can evaluate everything that's coming in and draw conclusions from the full data set. This opens the door to predictive analytics.
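To make the contrast concrete, here is a minimal sketch of scoring every incoming data point against its recent history instead of comparing it to a single fixed threshold. It uses a simple sliding-window z-score as a stand-in for a trained model; the function name `anomaly_scores` and the window size are illustrative assumptions, not a specific tool's API.

```python
from statistics import mean, stdev

def anomaly_scores(series, window=20):
    """Score every observation by how far it deviates from the recent
    window, instead of discarding everything below a fixed threshold.
    Returns one score per input point (higher = more anomalous)."""
    scores = []
    for i, value in enumerate(series):
        history = series[max(0, i - window):i]
        if len(history) < 2:
            scores.append(0.0)  # not enough history to judge yet
            continue
        mu, sigma = mean(history), stdev(history)
        scores.append(0.0 if sigma == 0 else abs(value - mu) / sigma)
    return scores
```

Because every point is scored, the output can feed a ranking or a predictive model rather than a binary "above/below watermark" check.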
2. Look for trends rather than faults
This follows from the first point. If you train on all of the data, your machine learning system can do more than flag problems that have already occurred. By examining readings that sit below alert thresholds, it can surface trends over time that may prove significant.
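A trend below the threshold can be as simple as a persistent upward slope. As a hedged sketch (a least-squares fit over evenly spaced samples, not any particular monitoring product's feature), the following shows how a metric that never trips an alert can still be visibly drifting toward one:

```python
def slope(series):
    """Least-squares slope of evenly spaced samples. A positive slope
    means the metric is trending up even if every individual point
    is still below the alert threshold."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den
```

For example, response times of 100, 102, 104, ... ms might all sit comfortably under a 500 ms alert level, yet the slope of 2 ms per sample tells you when you will cross it.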
3. Analyze and correlate across data sets when appropriate
Much of your data is time-series in nature, and it's easy to look at a single variable over time. But many trends come from the interactions of multiple measures. For example, response time may degrade only when many transactions are doing the same thing at the same time.
These trends are virtually impossible to spot with the naked eye, or with traditional analytics. But properly trained machine learning applications are likely to tease out correlations and trends that you will never find using traditional methods.
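A first step toward this kind of cross-data-set analysis is a plain correlation coefficient between two aligned time series. This is a minimal stdlib sketch (Pearson correlation; the variable names are illustrative), far simpler than what a trained model would do, but it shows the mechanics of relating two measures:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two aligned series:
    +1 = move together, -1 = move oppositely, ~0 = unrelated."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Run against, say, concurrent-transaction counts and response times, a coefficient near 1 would support the suspicion that load, not a fault, drives the slowdown.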
4. Look at your development metrics in a new way
In all likelihood, you are collecting data on your delivery velocity, bug fix metrics, plus data generated from your continuous integration system. You might be curious, for example, to see if the number of integrations correlates with bugs found. The possibilities for looking at any combination of data are tremendous.
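Before any correlation can be computed, metrics from different systems have to be aligned on a common axis, such as the week they occurred in. This sketch (the function `weekly_pairs` and its inputs are hypothetical, assuming each event is tagged with a week number) pairs integration counts with bug counts per week:

```python
from collections import Counter

def weekly_pairs(integration_weeks, bug_weeks):
    """Align two event streams, each given as a list of week numbers,
    into (integrations, bugs) count pairs per week, ready to feed
    into any correlation or regression routine."""
    ints = Counter(integration_weeks)
    bugs = Counter(bug_weeks)
    weeks = sorted(set(ints) | set(bugs))
    return [(ints[w], bugs[w]) for w in weeks]
```

The same alignment step works for delivery velocity, bug-fix time, or any other pair of development metrics you want to examine together.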
5. Provide a historical context for data
One of the biggest problems with DevOps is that we don't seem to learn from our mistakes. Even if we have an ongoing feedback strategy, we likely don't have much more than a wiki that describes problems we've encountered, and what we did to investigate them. All too often, the answer is that we rebooted our servers or restarted the application.
A machine learning system can dissect the data to show clearly what happened over the last day, week, month, or year. It can surface seasonal or daily trends and give us a picture of our application at any given moment.
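The simplest form of that historical context is a baseline profile: what does this metric normally look like at this hour of the day? The sketch below (an assumed input of `(hour, value)` samples; a real system would also break out day-of-week and season) builds such a profile from historical data:

```python
from collections import defaultdict

def hourly_baseline(samples):
    """samples: iterable of (hour_of_day, value) pairs.
    Returns the historical mean value per hour of day, a simple
    seasonal profile against which the current moment can be judged."""
    buckets = defaultdict(list)
    for hour, value in samples:
        buckets[hour].append(value)
    return {h: sum(vals) / len(vals) for h, vals in sorted(buckets.items())}
```

Comparing the current reading against its hour's baseline turns "is this value high?" into the more useful question "is this value high for 9 a.m. on a weekday?"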