Hi, I am writing blogs for our platform House of Bots on Artificial Intelligence, Machine Learning, Chatbots, Automation, etc. after completing my MBA degree.
How Machine Learning Can Apply to Event Processing
- Optimized pricing: Offer a product, hotel or flight at the best price, i.e. the price that generates the most revenue. It should be lower than the competition, but not too low: if you sell at too low a price, you generate less revenue.
- Fraud detection: Decide if a payment is accepted or rejected, e.g. in a store or online sales, before the fraud has happened.
- Cross selling: Send an offer (e.g. via push message to the mobile device) while the customer is still in the store. If the customer has left the store, he will spend his money somewhere else.
- Rerouting transportation: Act on logistic events that are not foreseeable hours before - for example, traffic congestion due to weather or accidents.
- Customer service: Recommend the best solution for a customer while he is in the line, solve a problem in the appropriate way based on the customer's history, knowledge and loyalty status.
- Proactive maintenance: Stop and replace a part before the machine will (probably) break. This can save a lot of money and effort.
- Restock inventory: Ship inventory to specific shops based on expectations, external influences, and customer behaviour.
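To make the optimized pricing trade-off concrete, here is a minimal sketch in Python. The demand curve, the competitor price and the candidate prices are invented for illustration; in a real project the purchase probability would come from a trained analytic model rather than a hand-written formula.

```python
# Hypothetical numbers, chosen only for illustration.
COMPETITOR_PRICE = 100   # what the competition charges
NO_SALE_PRICE = 120      # assumed price at which nobody buys any more


def purchase_probability(price):
    """Assumed linear demand curve: the higher the price, the less likely a sale."""
    return max(0.0, 1.0 - price / NO_SALE_PRICE)


def expected_revenue(price):
    """Expected revenue per offer = price times the probability of a sale."""
    return price * purchase_probability(price)


# Only consider prices below the competition, as the use case suggests.
candidate_prices = range(10, COMPETITOR_PRICE, 10)
best_price = max(candidate_prices, key=expected_revenue)

print("best price:", best_price)
print("expected revenue per offer:", expected_revenue(best_price))
```

Under these assumptions the best price sits well below the competition but is not the lowest candidate, because a very low price sells often but earns little per sale.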
- Data acquisition: Integrate relevant data sources. These can be CSV files, a relational database, a data warehouse, or Big Data storage such as a Hadoop cluster. Using these data sources should be very easy via graphical out-of-the-box connectors to let businesses focus on discovering business insights.
- Data preparation: Usually, you have to merge different data sets to get valuable insights. The real insights are found by (often unexpected) correlations of different data. The output of data preparation is often just flat files with rows and columns (like a simple CSV file, but often with very large data sets). These files can be used easily for extensive analysis.
- Exploratory data analysis: The business user uses the integrated and prepared data to spot problems and new insights. Ease-of-use of the tool is key for success (e.g. using a recommendation engine that suggests specific visualizations for selected columns, or interactive brush-linking between different tables and charts).
- Analytic model building: One size does not fit all! Many different machine learning algorithms are available, and the number is growing massively these days. A data scientist usually tries out different alternatives and repeats different approaches iteratively to find and create the best analytic model.
- Analytic model validation: This is key to success. Does the model really work well, also with new inputs? After initially training a machine learning algorithm with some historical data, you have to use another part of the historical data to validate the model. Afterwards, you can improve the model by changing variables or formulas, or by changing the complete algorithm. Even if the model is good enough to deploy for real-time event processing, it is still revalidated later with new data to improve it further.
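The model building and validation steps above can be sketched in a few lines of Python. This is only an illustration: the historical data is made up, the "model" is a simple amount threshold rather than a real machine learning algorithm, and the function names are hypothetical. The point is the split: one part of the history trains the model, another part validates it.

```python
import random


def split_holdout(rows, validation_fraction=0.3, seed=42):
    """Shuffle historical rows and hold back a fraction for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    cut = len(shuffled) - n_validation
    return shuffled[:cut], shuffled[cut:]


def accuracy(threshold, rows):
    """Share of rows where 'amount >= threshold' matches the fraud label."""
    correct = sum((amount >= threshold) == is_fraud for amount, is_fraud in rows)
    return correct / len(rows)


def train_threshold_model(train_rows):
    """Toy 'training': pick the threshold with the best accuracy on the training rows."""
    candidates = sorted({amount for amount, _ in train_rows})
    return max(candidates, key=lambda t: accuracy(t, train_rows))


# Made-up history: (payment amount, was it fraud?)
history = [(amount, amount >= 900) for amount in range(100, 1100, 50)]

train, validation = split_holdout(history)
model_threshold = train_threshold_model(train)

print("threshold learned on training data:", model_threshold)
print("accuracy on held-back validation data:", accuracy(model_threshold, validation))
```

If the validation accuracy is not good enough, you go back a step and change the variables or the algorithm, then validate again.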
- R Language: Well known as the most popular free and open-source programming language used by data scientists for modeling. It is developing very rapidly and has a very active community. R comes from the academic world and was not built for enterprise production, scalability or high performance. Therefore, an enterprise product on top of R is often used in production to get better performance, instead of just deploying open-source R code. Another issue with R is the restrictive open-source GPL license, which might be a hindrance if you want to "resell some R code" within a product to other customers.
- Python: The other widespread language for machine learning. It emphasizes ease-of-use, productivity and code reliability. Its syntax is "nicer" and easier to learn than R, but its primary focus - in contrast to R - is not statistics and predictive analytics. Rather, it is a multi-purpose language in which machine learning is just a small part.
- Apache Spark: A general scalable data-processing framework, which includes machine learning, graph processing, SQL support and streaming features. However, the focus in most projects today is especially on analytics using its machine learning library, MLlib. It consists of common learning algorithms and utilities, as well as lower-level optimization primitives and higher-level pipeline APIs.
- H2O: An extensible open-source platform for analytics. It leverages best-of-breed open-source technologies such as Hadoop, Spark, R, and others. Its goal is to offer massively scalable Big Data analytics with very high performance for real-time scoring.
- Microsoft's Revolution Analytics: A statistical software company focused on developing open-source and "open-core" versions (i.e. commercial add-ons and support) of the R language for enterprise, academic and analytics customers. Microsoft - which acquired it in 2015 - is rebranding the software and releasing new products and features.
- KNIME: An open-source data analytics, reporting, and integration platform. It integrates various components for machine learning and data mining through its modular data-pipelining concept. A graphical user interface allows assembly of nodes for data preprocessing for modeling, data analysis and visualization. It is often combined with languages such as R or other tools such as TIBCO Spotfire.
- SAS: Commercial vendor for building analytics models. One of the leaders in the market.
- IBM SPSS: Another leading product in the analytics market, offered by the mega-vendor IBM.
- Amazon Machine Learning: A managed Software-as-a-Service (SaaS) offering for building machine learning models and generating predictions. Integrated into the corresponding cloud ecosystem of Amazon Web Services. It's easy to use, but has a limited feature set and potential latency issues if combined with external data or applications.
- Machine learning offerings are also available from many other cloud vendors, e.g. Microsoft Azure, IBM Watson Analytics, or HP Haven on Demand.
- Machine-to-machine automation: Automated action based on analytic models of history combined with live context and business rules. The challenge: Create, understand, and deploy algorithms and rules that automate key business reactions.
- Human interactions: Human decisions in real time informed by up-to-date information via pushed events. The challenge: Empower operations staff to see and seize key business moments.
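As a rough sketch of how these two kinds of action can sit side by side, the Python example below scores incoming payment events, applies a business rule on top of the score, and then either acts automatically or pushes the event to a person. Everything here is assumed for illustration: the `PaymentEvent` fields, the `fraud_score` stand-in for a deployed model, the thresholds and the loyalty rule.

```python
from dataclasses import dataclass


@dataclass
class PaymentEvent:
    payment_id: str
    amount: float
    customer_is_loyal: bool


def fraud_score(event):
    """Stand-in for a deployed analytic model; returns a score between 0 and 1."""
    return min(event.amount / 1000.0, 1.0)


def decide(event, reject_above=0.9, review_above=0.6):
    """Combine the model score with a business rule and pick an action."""
    score = fraud_score(event)
    # Assumed business rule: loyal customers are never rejected automatically.
    if score >= reject_above and not event.customer_is_loyal:
        return "reject"   # machine-to-machine automation
    if score >= review_above:
        return "review"   # human interaction: push the event to operations staff
    return "accept"       # machine-to-machine automation


events = [
    PaymentEvent("p1", 120.0, customer_is_loyal=False),
    PaymentEvent("p2", 950.0, customer_is_loyal=False),
    PaymentEvent("p3", 950.0, customer_is_loyal=True),
    PaymentEvent("p4", 700.0, customer_is_loyal=False),
]

decisions = {event.payment_id: decide(event) for event in events}
for payment_id, action in decisions.items():
    print(payment_id, "->", action)
```

The clear cases are handled without a person in the loop, while the borderline ones, and the ones where a business rule overrides the model, go to a human who can act while the event still matters.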