Top 8 Data Science Tools | Open Source Tools for Data Scientists | Edureka
The document provides an overview of data science, emphasizing its role in extracting knowledge and insights from data using scientific methods for various applications. It details various data science tools for data manipulation, exploration, storage, visualization, and model building, highlighting their user-friendly features and integration capabilities with systems like Hadoop. These tools are designed to enable efficient data analysis and machine learning without extensive programming knowledge.
INTRODUCTION TO DATA SCIENCE
DATA SCIENCE TOOLS
DATA SCIENCE TOOLS FOR DATA MANIPULATION
DATA SCIENCE TOOLS FOR EDA
www.edureka.co
DATA SCIENCE TOOLS FOR DATA STORAGE
DATA SCIENCE TOOLS FOR DATA VISUALIZATION
Introduction to Data Science
Data Science is the process of extracting knowledge and insights from data by
using scientific methods.
Data Science involves collecting, analysing and modelling data to solve real-world problems. It is
used for fraud detection, disease detection, recommendation engines and so on.
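The collect, analyse and model steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction amounts are made up, and the "model" is a simple threshold rule (flag values more than k standard deviations from the mean) chosen for brevity, not a method taken from the slides.

```python
from statistics import mean, stdev

def collect():
    # Data collection: hypothetical transaction amounts (made-up sample data).
    return [12.5, 9.9, 14.2, 11.0, 10.4, 13.1, 250.0, 12.0]

def explore(amounts):
    # Exploratory analysis: summarise the data before modelling.
    return {"count": len(amounts), "mean": mean(amounts), "stdev": stdev(amounts)}

def flag_outliers(amounts, summary, k=2.0):
    # Modelling (assumed rule): flag amounts more than k sample
    # standard deviations away from the mean as suspicious.
    return [a for a in amounts if abs(a - summary["mean"]) > k * summary["stdev"]]

amounts = collect()
summary = explore(amounts)
suspicious = flag_outliers(amounts, summary)
print(summary["count"], suspicious)
```

A real fraud-detection model would use many more features and a trained classifier; the point here is only the order of the stages.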
Data Science tools come with predefined functions, algorithms, and a user-friendly GUI.
Hence, they can be used to build complex Machine Learning models without writing code in a
programming language.
DATA SCIENCE TOOLS
Data Science
Data Collection
Exploratory Data Analysis
Data Modelling
Data Visualization
Scale and manage massive
amounts of data
Hadoop Distributed File System
(HDFS) for data storage
Integrates with Hadoop
MapReduce and Hadoop YARN
Data processing via Apache
Hadoop and Spark clusters
The default storage system is
Windows Azure Blob
Provides Microsoft R Server
Data integration tool based on
Extract, Transform, Load (ETL) architecture
Extract, Transform, Load tool
to manage data
Support for distributed processing, grid
computing, adaptive load balancing.
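The Extract, Transform, Load pattern mentioned above can be sketched with the Python standard library. This is a generic illustration, not the API of the tool on the slide: the CSV sample, the `payments` table and the validation rule (drop rows whose amount is not numeric) are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a CSV string stands in for a real source file.
RAW_CSV = """id,name,amount
1, alice ,10.50
2,BOB,n/a
3,carol,7
"""

def extract(text):
    # Extract: read raw rows from the source as dictionaries.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Transform: normalise names, cast types, drop rows with invalid amounts.
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that fail validation
        clean.append((int(row["id"]), row["name"].strip().title(), amount))
    return clean

def load(rows, conn):
    # Load: write the cleaned rows into the target store (in-memory SQLite here).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT id, name, amount FROM payments ORDER BY id").fetchall()
print(loaded)
```

Keeping the three stages as separate functions mirrors how ETL tools let each stage be configured, scheduled and scaled independently.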
Data processing, building
Machine Learning models, etc.
Support for integrating Hadoop
framework
Generate predictive models
through automated modelling
Easy to apply Machine Learning
Supports GLM, Boosting ML models
& Deep Learning
Support to integrate with Apache
Hadoop
Supports parallel programming to
perform data analysis, data
modelling, etc.
Trains and tests Machine Learning
models quickly
Makes model evaluation much
easier.
Can visualize massive data sets to find
correlations and patterns
Create customized reports and
dashboards
Support to integrate with Apache
Hadoop
Clear & concise visualizations
Supports in-memory data
processing
Automatically generates data
associations