Nand Kishor Contributor

Nand Kishor is the Product Manager of House of Bots. After finishing his studies in computer science, he ideated and re-launched the Real Estate Business Intelligence Tool, creating one of the leading business intelligence tools for property price analysis in 2012. He also writes, researches and shares knowledge about Artificial Intelligence (AI), Machine Learning (ML), Data Science, Big Data, the Python language and related topics.

5 Things You Need to Know about Big Data

By Nand Kishor | Mar 21, 2018

There's a lot of social media and general internet buzz regarding Big Data, but what exactly is it? Here are 5 interesting things to know about Big Data.

1. What is it?
Simply put, Big Data refers to large data sets that are computationally analysed to reveal patterns and trends relating to a certain aspect of the data. There's no minimum amount of data needed for it to be categorised as Big Data, as long as there's enough to draw solid conclusions.
M-Brain explains the different facets of Big Data through the 8 V's.

Fig. 1: M-Brain - Big Data with 8 V's

2. How can I access Big Data?
Big Data is available from a growing number of sources. A simple Google search will turn up a data repository for just about everything, and many people aren't aware of just how much data is already available for access and analysis. KDnuggets maintains an extensive list of datasets for data mining and data science at https://www.kdnuggets.com/datasets/index.html

How you can access and utilise this data can be split into six parts:

Data Extraction
Before anything happens, some data is needed. This can be gained in a number of ways, normally via an API call to a company's web service.
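As a rough sketch of what such an API call can look like, the Python below builds a request URL, fetches a JSON response and pulls out the records. The endpoint https://api.example.com/v1/prices, the `city` parameter and the `results` key are hypothetical placeholders, not a real service.

```python
import json
import urllib.parse
import urllib.request


def build_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_records(payload, key="results"):
    """Decode a JSON response body and return the list stored under `key`."""
    document = json.loads(payload)
    return document.get(key, [])


def fetch_records(base_url, params, timeout=10):
    """Call the web service over HTTP and return its records."""
    url = build_url(base_url, params)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_records(response.read().decode("utf-8"))


# Hypothetical endpoint and parameters, shown for illustration only:
# records = fetch_records("https://api.example.com/v1/prices", {"city": "London"})
```

Real services usually also require an API key and have their own response layout, so the parsing step would need adjusting to match their documentation.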

Data Storage
The main difficulty with Big Data is managing how it will be stored. The right choice depends on the budget and expertise of whoever is responsible for setting up the data storage, as most providers require some programming knowledge to implement. A good provider should give you a safe, straightforward place to store and query your data.
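To illustrate what "store and query" means in practice, here is a minimal sketch using SQLite from Python's standard library. It is a small stand-in for a real Big Data store, and the `prices` table and its figures are made up for the example.

```python
import sqlite3


def store_prices(connection, rows):
    """Create the table if needed and insert (city, price) rows."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS prices (city TEXT NOT NULL, price REAL NOT NULL)"
    )
    connection.executemany("INSERT INTO prices (city, price) VALUES (?, ?)", rows)
    connection.commit()


def average_price_by_city(connection):
    """Return {city: average price}, computed by the database itself."""
    cursor = connection.execute(
        "SELECT city, AVG(price) FROM prices GROUP BY city ORDER BY city"
    )
    return dict(cursor.fetchall())


# ":memory:" keeps the example self-contained; a file path would persist the data.
connection = sqlite3.connect(":memory:")
store_prices(connection, [("Leeds", 200.0), ("London", 500.0), ("London", 700.0)])
averages = average_price_by_city(connection)
```

The same pattern of loading rows and then asking the store to aggregate them carries over to larger systems, although the connection details and query dialect differ.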

Data Cleaning
Like it or not, data sets come in all shapes and sizes. Before you can even think about how the data will be stored, you need to make sure it is in a clean and acceptable format.
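A minimal sketch of a cleaning step in Python might look like the following. The field names `city` and `price` and the sample rows are assumptions made for illustration; real data sets need rules tailored to their own quirks.

```python
def clean_records(raw_records):
    """Return records with tidy city names, numeric prices and no duplicates."""
    cleaned = []
    seen = set()
    for record in raw_records:
        city = str(record.get("city") or "").strip().title()
        value = record.get("price")
        price_text = "" if value is None else str(value).replace(",", "").strip()
        try:
            price = float(price_text)
        except ValueError:
            continue  # skip rows whose price is missing or not a number
        if not city:
            continue  # skip rows with no city
        key = (city, price)
        if key in seen:
            continue  # skip rows that duplicate an earlier one
        seen.add(key)
        cleaned.append({"city": city, "price": price})
    return cleaned


raw = [
    {"city": " london ", "price": "500,000"},
    {"city": "London", "price": "500000"},  # duplicate once cleaned
    {"city": "Leeds", "price": "n/a"},      # unusable price
    {"city": "", "price": "120000"},        # missing city
    {"city": "york", "price": 250000},
]
tidy = clean_records(raw)
```

Here the five raw rows reduce to two usable ones, one for London and one for York.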

Data Mining
Data mining is the process of discovering insights within a database. The aim of this is to provide predictions and make decisions based on the data currently held.
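As one very simple example of prediction from held data, the sketch below fits a straight line to past values by least squares and extends it one step forward. Real data mining uses much richer techniques; the yearly figures here are invented purely for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Made-up yearly averages that happen to rise by 10 each year.
years = [2014, 2015, 2016, 2017]
prices = [200.0, 210.0, 220.0, 230.0]
slope, intercept = fit_line(years, prices)
forecast_2018 = predict(2018, slope, intercept)
```

Because the sample data is perfectly linear, the fitted slope is 10 per year and the 2018 forecast continues the trend to 240.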

Data Analysis
Once all the data has been collected it needs to be analysed to look for interesting patterns and trends. A good data analyst will spot something out of the ordinary, or something that hasn't been reported by anyone else.
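Spotting "something out of the ordinary" can be partly automated. The sketch below flags values that sit far from the mean, measured in standard deviations. The threshold of 2 and the visit counts are assumptions chosen for the example, not a recommended standard.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]


# Invented daily visit counts with one obvious spike.
daily_visits = [100, 102, 98, 101, 99, 100, 103, 97, 250]
unusual = find_outliers(daily_visits)
```

With this sample only the spike of 250 is flagged. An analyst would still need to decide whether it is an error or a genuine event worth reporting.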

Data Visualisation
Perhaps the most important part is the visualisation of the data. This step takes all the prior work and outputs a visualisation that ideally anyone can understand. This can be done using libraries such as Plotly and D3.js or software such as Tableau.
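The idea of turning numbers into a picture can be shown even without a charting library. The sketch below draws a plain-text bar chart in Python; it is only a toy stand-in for the tools named above, and the city figures are invented.

```python
def text_bar_chart(data, width=40, mark="#"):
    """Render {label: value} as horizontal bars scaled to `width` characters."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar_length = round(width * value / largest) if largest else 0
        lines.append(f"{label.ljust(label_width)} | {mark * bar_length} {value}")
    return "\n".join(lines)


# Invented average prices, for illustration only.
average_price = {"Leeds": 200, "York": 240, "London": 600}
print(text_bar_chart(average_price, width=30))
```

Each bar is scaled against the largest value, so the relative sizes are visible at a glance, which is the same principle a full charting tool applies.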

3. Are there careers related to Big Data?
With growing access to Big Data, it should come as no surprise that the number of related careers is on the rise as well. According to Data Motion, a Big Data Engineer earns an average salary of $150,000 a year.

Fig. 2: Top 10 Big Data Jobs

It's worth noting that, according to a Burtch Works study, 88% of Data Scientists have a Master's degree, which makes one close to a standard entry requirement for jobs in this field (https://www.burtchworks.com/files/2014/07/Burtch-Works-Study_DS_final.pdf).

4. Is it a growing industry?
In short, yes. Both general interest in Big Data and access to it are on the rise. This Google Trends chart (https://g.co/trends/pxXJa) shows the increase in popularity of the search term 'Big Data' between 2004 and the present day.

Fig. 3: Google Trends for Big Data, 2004-2018

According to IDC, "Worldwide revenues for big data and business analytics (BDA) will reach $150.8 billion in 2017, an increase of 12.4 percent over 2016". The company goes on to estimate that by 2020, big data revenues could top $210 billion.
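The IDC figures can be turned into two derived numbers with a little arithmetic: the 2016 revenue implied by the 12.4 percent increase, and the yearly growth rate needed to go from $150.8 billion in 2017 to $210 billion in 2020. The short Python sketch below does this. The function names are illustrative, and since IDC says revenues "could top" $210 billion, the growth rate it computes is a lower bound rather than a forecast.

```python
def previous_value(current, growth_rate):
    """Value one period earlier, given the current value and fractional growth."""
    return current / (1 + growth_rate)


def compound_annual_growth_rate(start, end, years):
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1


revenue_2017 = 150.8  # USD billions, from the IDC quote
growth_2017 = 0.124   # 12.4 percent increase over 2016
revenue_2020 = 210.0  # USD billions, the 2020 estimate

implied_2016 = previous_value(revenue_2017, growth_2017)
implied_cagr = compound_annual_growth_rate(revenue_2017, revenue_2020, years=3)
print(f"Implied 2016 revenue: ${implied_2016:.1f} billion")
print(f"Implied 2017-2020 growth: {implied_cagr:.1%} per year")
```

This works out to roughly $134 billion for 2016 and a little under 12 percent growth per year, in line with the 2017 increase.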

5. How do I learn more?
Big Data is a broad subject, so learning it requires knowledge of several areas. Someone looking to work in the field would need a range of skills, including one or more of the following:

  • Knowledge of a programming language used for data analysis, such as R, Python, SAS or SQL
  • A good understanding of Maths and Statistics
  • Experience scraping web pages
  • Basic Excel skills
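For readers unfamiliar with the web scraping skill listed above, here is a minimal Python sketch that extracts every link from a page using only the standard library. The sample HTML is made up; a real scraper would first download the page and should respect the site's terms of use.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html_text):
    """Return the list of link targets found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


# A tiny made-up page standing in for downloaded HTML.
sample_page = """
<html><body>
  <a href="https://www.kdnuggets.com/datasets/index.html">Datasets</a>
  <a name="top">No link here</a>
  <a href="/about">About</a>
</body></html>
"""
links = extract_links(sample_page)
```

The anchor without an `href` attribute is ignored, so `links` holds only the two real link targets.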

Websites such as Coursera (https://www.coursera.org/specializations/big-data) and Simplilearn (https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training) offer online Big Data courses.

If you're looking for a university course, Masters Portal (www.mastersportal.eu/study-options/268927258/data-science-big-data-united-kingdom.html) lists 95 Master's degrees in Data Science & Big Data in the UK. A typical syllabus (www.stir.ac.uk/postgraduate/programme-information/prospectus/computing-science-and-mathematics/bigdata/) might involve:

  • Mathematics for Big Data
  • Python scripting
  • Business and scientific applications of Big Data
  • Big databases and NoSQL including MongoDB, Cassandra and Neo4j
  • Analytics, machine learning and data visualisation using Weka, R and scikit-learn
  • Optimisation and heuristics for big problems
  • Cluster computing with Hadoop, Spark, Hive and MapReduce

Source: KDnuggets