Nand Kishor Contributor

Nand Kishor is the Product Manager of House of Bots. After finishing his studies in computer science, he ideated and re-launched the Real Estate Business Intelligence Tool, where he created one of the leading business intelligence tools for property price analysis in 2012. He also writes, researches, and shares knowledge about Artificial Intelligence (AI), Machine Learning (ML), Data Science, Big Data, the Python language, and related topics.

10 Most Popular Machine Learning Projects on GitHub

By Nand Kishor | Mar 17, 2018 | 12237 Views

GitHub has become the go-to source for all things open source and hosts a wealth of resources for machine learning practitioners. Here is a list of the 10 machine learning repositories on GitHub with the most stars. Tutorial projects are excluded; the list is restricted to projects and frameworks.

1. TensorFlow
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow also includes TensorBoard, a data visualization toolkit.
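To make the data flow graph idea concrete, here is a minimal sketch in plain Python. It is not the TensorFlow API: the `Node` class and `run` function are hypothetical names invented for illustration, and scalars stand in for tensors to keep the example short. It only shows the general pattern of building a graph of operations first and pushing values along its edges afterwards.

```python
# Conceptual sketch only: this is NOT the TensorFlow API.
# `Node` and `run` are hypothetical names used for illustration.
import operator


class Node:
    """One operation in the graph; `inputs` are the edges feeding it."""

    def __init__(self, name, op=None, inputs=()):
        self.name = name
        self.op = op              # None marks an input fed at run time
        self.inputs = tuple(inputs)


def run(node, feed, cache=None):
    """Evaluate `node` by first evaluating the nodes on its incoming edges."""
    if cache is None:
        cache = {}
    if node.name in cache:
        return cache[node.name]
    if node.op is None:
        value = feed[node.name]   # input node: take the value from `feed`
    else:
        args = [run(parent, feed, cache) for parent in node.inputs]
        value = node.op(*args)
    cache[node.name] = value
    return value


# Build the graph for (a + b) * b; nothing is computed yet.
a = Node("a")
b = Node("b")
total = Node("total", operator.add, (a, b))
result = Node("result", operator.mul, (total, b))

# Values flow along the edges only when the graph is run.
output = run(result, {"a": 2.0, "b": 3.0})
print(output)
```

Because the graph is described separately from its execution, a real framework is free to decide where each operation runs, which is what makes deployment across CPUs and GPUs possible without rewriting the model code.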

TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains as well.

2. scikit-learn
scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.

The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. It is currently maintained by a team of volunteers.

3. Keras
Keras is a high-level neural networks API, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

Use Keras if you need a deep learning library that:
  • Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility).
  • Supports both convolutional networks and recurrent networks, as well as combinations of the two.
  • Runs seamlessly on CPU and GPU.

4. Apache PredictionIO (incubating)
Apache PredictionIO (incubating) is an open source machine learning framework for developers, data scientists, and end users. It supports event collection, algorithm deployment, evaluation, and querying of predictive results via REST APIs. It is built on scalable open source services such as Hadoop, HBase (and other databases), Elasticsearch, and Spark, and implements what is called a Lambda Architecture.

5. Tesseract
Tesseract is an open source optical character recognition (OCR) engine. It has Unicode (UTF-8) support and can recognize more than 100 languages "out of the box". It can be trained to recognize other languages.

Tesseract supports various output formats: plain text, hOCR (HTML), and PDF.

Tesseract was originally developed at Hewlett-Packard Laboratories Bristol and at Hewlett-Packard Co, Greeley, Colorado, between 1985 and 1994, with some more changes made in 1996 to port it to Windows, and some C++izing in 1998.

In 2005 Tesseract was open sourced by HP. Since 2006 it has been developed by Google.

6. MXNet
MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. MXNet is portable and lightweight, scaling effectively to multiple GPUs and multiple machines.

7. WaveFunctionCollapse
This program generates bitmaps that are locally similar to the input bitmap.

Local similarity means that

  • (C1) Each NxN pattern of pixels in the output should occur at least once in the input.
  • (Weak C2) The distribution of NxN patterns in the input should be similar to the distribution of NxN patterns over a sufficiently large number of outputs. In other words, the probability of encountering a particular pattern in the output should be close to the density of such patterns in the input.
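
Condition C1 can be checked mechanically. The sketch below is not taken from the WaveFunctionCollapse repository; the function names `patterns` and `satisfies_c1` are hypothetical, bitmaps are assumed to be lists of equal-length rows, and windows are assumed not to wrap around the edges. It collects every NxN window of the input and verifies that each window of the output appears among them.

```python
from collections import Counter


def patterns(bitmap, n):
    """Count every n x n window of `bitmap` (a list of equal-length rows)."""
    height, width = len(bitmap), len(bitmap[0])
    counts = Counter()
    for y in range(height - n + 1):
        for x in range(width - n + 1):
            # Freeze the window as a tuple of tuples so it can be a dict key.
            window = tuple(tuple(bitmap[y + dy][x:x + n]) for dy in range(n))
            counts[window] += 1
    return counts


def satisfies_c1(input_bitmap, output_bitmap, n):
    """True if every n x n pattern in the output also occurs in the input."""
    seen = patterns(input_bitmap, n)
    return all(p in seen for p in patterns(output_bitmap, n))


# A 4x4 checkerboard used as the input sample.
sample = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
good = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]   # still a checkerboard
bad = [[0, 0], [1, 0]]                     # contains a pattern not in the sample

print(satisfies_c1(sample, good, 2))
print(satisfies_c1(sample, bad, 2))
```

The counts returned by `patterns` could also serve as a starting point for Weak C2, by comparing pattern frequencies in the input with those gathered over many outputs.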

8. Pattern
Pattern is a web mining module for Python. It has tools for:

  • Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM parser
  • Natural Language Processing: part-of-speech taggers, n-gram search, sentiment analysis, WordNet
  • Machine Learning: vector space model, clustering, classification (KNN, SVM, Perceptron)
  • Network Analysis: graph centrality and visualization.

It is well documented and bundled with 50+ examples and 350+ unit tests.

9. Natural Language Toolkit (NLTK)
NLTK - the Natural Language Toolkit - is a suite of open source Python modules, data sets and tutorials supporting research and development in Natural Language Processing.

10. Swift AI
Swift AI is a high-performance machine learning library written entirely in Swift.

Swift AI includes a set of common tools used for machine learning and artificial intelligence. These tools are designed to be flexible, powerful and suitable for a wide range of applications.

Source: AIM