Big Data Analytics
What is Analytics?
The scientific process of transforming data into insight for making better decisions.
~INFORMS
Analytics leverage data in a particular functional process (or application) to enable context-specific insight that is actionable.
~Gartner
SIMPLY PUT->
[Figure: Traditional Data Analytics. Diagram labels: CRM, ERP, Finance; ETL; Data Quality; Normalised Data; Data Warehouse; Business Administrator, Business Analyst, Business User. Source: Wikibon 2011]
BIG DATA
“Big data is the term increasingly used to describe the process of
applying serious computing power—the latest in machine
learning and artificial intelligence—to seriously massive and often
highly complex sets of information.”
Big data opportunities emerge in organizations generating a
median of 300 terabytes of data a week. The most common forms
of data analysed in this way are business transactions stored in
relational databases, followed by documents, e-mail, sensor data,
blogs, and social media.
Every day, we create 2.5 quintillion bytes of data — so much
that 90% of the data in the world today has been created in the
last two years alone. This data comes from everywhere: sensors
used to gather climate information, posts to social media sites,
digital pictures and videos, purchase transaction records, and
cell phone GPS signals to name a few.
This data is “big data.”
Types Of Data
Structured: Fits easily into the rows and columns of a database.
Unstructured: Cannot be easily compiled into older database formats.
Semi-structured: Uses tags to capture elements of data.
Internal Data: From a company’s sales, employee records, etc.
External Data: From third-party providers, social media, etc.
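The difference between the first three types can be made concrete with a minimal Python sketch. The sample order records, field names and the free-text note are hypothetical and only illustrate the definitions above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns, so every record parses into the same fields.
structured = io.StringIO("order_id,amount\n1001,25.50\n1002,13.00\n")
rows = list(csv.DictReader(structured))

# Semi-structured: tags name each element, but records may differ in shape.
semi_structured = "<order id='1003'><amount>8.75</amount><note>gift</note></order>"
order = ET.fromstring(semi_structured)
tagged = {child.tag: child.text for child in order}

# Unstructured: free text with no predefined fields; only generic
# operations such as splitting into words apply without further analysis.
unstructured = "Customer called to say the parcel arrived late but intact."
words = unstructured.split()

print(rows[0]["amount"])
print(tagged)
print(len(words))
```

The structured rows can be loaded into a database table as they are, the tagged record needs its tags interpreted first, and the free text needs further analysis before it yields any fields at all.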
Traditional Data                    Big Data
Gigabytes to terabytes              Petabytes to exabytes
Centralised                         Distributed
Structured                          Semi-structured and unstructured
Stable data model                   Flat schemas
Known complex interrelationships    Few complex interrelationships
1 Zettabyte
= 1,000 Exabytes
= 1,000,000 Petabytes
= 1,000,000,000 Terabytes
= 1,000,000,000,000 Gigabytes
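The unit ladder can be checked with a short Python sketch. It assumes decimal (SI) units, where each step is a factor of 1,000; the `convert` helper is hypothetical and written only for this illustration.

```python
# Decimal (SI) storage units, each 1,000 times the previous one.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes",
         "terabytes", "petabytes", "exabytes", "zettabytes"]

def convert(value, from_unit, to_unit):
    """Convert a quantity between decimal storage units."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        return value * 1000 ** steps
    return value / 1000 ** (-steps)

# One zettabyte expressed in gigabytes.
zettabyte_in_gb = convert(1, "zettabytes", "gigabytes")

# 2.5 quintillion bytes (2.5 * 10**18) expressed in exabytes.
daily_in_eb = convert(2_500_000_000_000_000_000, "bytes", "exabytes")

print(zettabyte_in_gb)
print(daily_in_eb)
```

Under this assumption, the 2.5 quintillion bytes per day quoted earlier corresponds to 2.5 exabytes.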
Big Data Market Forecast (in $US billions)
Year        2012   2013   2014   2015   2016   2017
Forecast    5.1    10.2   16.8   32.1   48     53.4
Sources of Big Data
Source: Wikibon 2011
Big Data
Use of Big Data
Big data refers to enormity in five dimensions:
VOLUME
VARIETY
VELOCITY
VARIABILITY
COMPLEXITY
BIG DATA ANALYTICS
Analysis Type                     Description
Basic Analytics for Insight       Slicing and dicing of data, reporting, simple visualisations, basic monitoring
Advanced Analytics for Insight    More complex data analysis such as predictive modelling and other pattern-matching techniques
Operationalised Analytics         Analytics becomes part of the business process
Monetised Analytics               Analytics used directly to drive revenue
Source: The Economist
Basic Analytics
Basic analytics can be used to explore your data when you are not sure what you have but think something in it is of value.
Slicing and Dicing: Breaking down data into smaller sets of data that are easier to explore.
Basic Monitoring: Monitoring large volumes of data in real time.
Anomaly Identification: Detecting events where the actual observation differs from what is expected.
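Slicing and dicing and anomaly identification can be sketched in a few lines of Python. The sales records, sensor readings and the threshold of two standard deviations are hypothetical choices for illustration, not part of the original slides.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical sales records: (region, product, units sold)
sales = [
    ("North", "A", 120), ("North", "B", 80),
    ("South", "A", 95),  ("South", "B", 110),
    ("North", "A", 130), ("South", "A", 105),
]

def slice_by(records, key_index):
    """Break records into smaller sets keyed by one column and total the units."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)

def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

by_region = slice_by(sales, 0)    # totals per region
by_product = slice_by(sales, 1)   # totals per product

readings = [20, 21, 19, 20, 22, 21, 20, 95]  # hypothetical sensor readings
outliers = find_anomalies(readings)

print(by_region, by_product, outliers)
```

The same records are viewed along two different dimensions, and the final reading is flagged because it differs sharply from what the rest of the series would lead one to expect.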
Advanced Analytics
Advanced analytics can be deployed to find patterns in data and to support prediction, forecasting, and complex event processing.
Advanced analytics provides algorithms for complex analysis of either structured or unstructured data.
It includes sophisticated statistical models, machine learning, neural networks, text analytics and other advanced data-mining techniques.
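As a minimal sketch of forecasting with a simple statistical model, the Python below fits a least-squares trend line and extrapolates one step ahead. The monthly order counts are hypothetical, and a straight line is only one of many possible models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical monthly order counts for months 1 to 5
months = [1, 2, 3, 4, 5]
orders = [100, 120, 140, 160, 180]

slope, intercept = fit_line(months, orders)
forecast_month_6 = slope * 6 + intercept  # extrapolate the trend one month ahead

print(slope, intercept, forecast_month_6)
```

Real data rarely sits on a perfect line, so in practice the fitted trend would be accompanied by an error estimate before anyone acts on the forecast.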
Predictive Analytics: Techniques that can be used on both structured and unstructured data (together or individually) to predict future outcomes.
Text Analytics: The process of analyzing unstructured text, extracting relevant information, and transforming it into structured information.
Other Methods: Advanced forecasting, optimization, cluster analysis for segmentation or even microsegmentation, and affinity analysis.
Data Mining: Exploring and analysing large amounts of data to find patterns in that data.
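Text analytics, as defined above, can be illustrated with a small Python sketch that pulls structured fields out of free text. The support notes, field names and regular expressions are hypothetical; production text analytics typically relies on far richer language processing than pattern matching.

```python
import re

# Hypothetical free-text support notes (unstructured input)
notes = [
    "Order 1001 delayed, customer jane@example.com requested refund of $25.50",
    "Order 1002 delivered; feedback from sam@example.org was positive",
]

ORDER = re.compile(r"Order (\d+)")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
AMOUNT = re.compile(r"\$(\d+(?:\.\d{2})?)")

def extract(note):
    """Turn one free-text note into a structured record; missing fields become None."""
    order = ORDER.search(note)
    email = EMAIL.search(note)
    amount = AMOUNT.search(note)
    return {
        "order_id": int(order.group(1)) if order else None,
        "email": email.group(0) if email else None,
        "amount": float(amount.group(1)) if amount else None,
    }

records = [extract(note) for note in notes]
print(records)
```

Each resulting record has the same keys, so the output could be stored in the rows and columns of a conventional database.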
Overview of the Big Data Market Segment
Source: Wikibon 2011
Big Data Approaches
Hadoop: Hadoop is an open-source framework for storing, processing and analyzing massive amounts of distributed, unstructured data. Rather than banging away at one huge block of data with a single machine, Hadoop breaks up Big Data into multiple parts so that each part can be processed and analyzed at the same time.
NoSQL: NoSQL databases are aimed, for the most part (though there are some important exceptions), at serving up discrete data stored among large volumes of multi-structured data to end-user and automated Big Data applications.
Massively Parallel Analytic Databases: Unlike traditional data warehouses, massively parallel analytic databases are capable of quickly ingesting large amounts of mainly structured data with minimal data modeling required, and can scale out to accommodate multiple terabytes and sometimes petabytes of data.
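The idea of breaking data into parts and processing each part at the same time can be sketched locally in Python. This is not Hadoop's API: the function names are hypothetical, and a thread pool on one machine stands in for a cluster purely to show the split, process and combine structure.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def split(lines, parts):
    """Break the input into roughly equal chunks, one per worker."""
    size = max(1, -(-len(lines) // parts))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def count_words(chunk):
    """Map step: count words within a single chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts

def word_count(lines, parts=3):
    """Process chunks concurrently, then merge the partial counts (reduce step)."""
    chunks = split(lines, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

lines = ["big data is big", "data is distributed", "big clusters process data"]
totals = word_count(lines)
print(totals)
```

Because each chunk is counted independently, the work could in principle be spread over many machines, with only the small partial results travelling back to be merged.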
Big Data Analytics MIS presentation
