N-U Sigma U2 Analytics Lab web: www.businessanalyticsr.com
email: umesh@businessanalyticsr.com Ph: +1 408757 0093
Dr. Umesh R Hodeghatta
Application of Machine Learning in Cyber Security
October 9th, 2019
[Figure: Artificial Intelligence (AI), Machine Learning, Deep Learning]
[Figure: CIA triad (Confidentiality, Integrity, Availability)]
Outline
• Information Security
• Applying Machine Learning Techniques
• Cybersecurity Applications
• Machine Learning
• Case Study - Predicting Phishing Attack
• Summary
• Q & A
Security Threats
• Denial of Service
• Loss of Integrity ("Deposit $1000" vs. "Deposit $100" between customer and bank)
• Loss of Privacy (telnet company.org, username: dan; the typed password "m-y-p-a-s-s-w-o-r-d" is exposed)
• Impersonation ("I’m Bob. Send me all corporate correspondence with Cisco.")
Information Security/Cyber Security
[Figure: CIA triad (Confidentiality, Integrity, Availability)]
Implementing Information Security
• Risk Assessment
• Planning & Architecture
• Gap Analysis
• Integration & Deployment
• Operations
• Legal Compliance and Audit
• Crisis Management
• Continuous Monitoring & Learning
DATA (Detection, Correction, Prevention)
• Servers
• Database
• Network Devices
- Firewall/IDS/IPS
- Routers/switches
• Endpoint devices
Machine Learning
Information Security Analytics/Machine Learning
Detection/Monitoring: Descriptive Analytics; Prevention: Predictive Analytics; Correction: Prescriptive Analytics
• Detect Incidents
• Monitor Traffic
• Monitor Events
• Prevent Attacks
• Prevent Incidents
• Isolate Systems
• Predict Attacks
• Predict Risks
• Predict Vulnerabilities
Applying Machine Learning Techniques
• Network Level (Router, Switches, Firewall, IDS/IPS, Cloud)
• Endpoint (server, mobile, desktops, IoT)
• User Level (Authentication, Social behavior, domain)
• Application Level (Web, Applications, Database, ERP)
• Process Level (Industry process and standards)
Machine Learning
Method
• Supervised Machine Learning
• Unsupervised Machine Learning
• Reinforcement Learning
Tasks
• Regression
• Classification
• Clustering
• Association Rule
Other ML Terms/Tasks
• Dimensionality Reduction
• Discriminant Analysis
• Regularization, LASSO
• Boosting
• Generative Models
• Deep Learning
Descriptive Analytics
• Information and Awareness
• Recording Security Breach: how, why and when
• Monitoring
• Provide statistics
- Type of attacks
- Type of breaches
- Regions
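The statistics listed above (type of attacks, type of breaches, regions) amount to simple counts over incident records. A minimal sketch, assuming a hypothetical incident log; the field names and values below are illustrative and do not come from the talk:

```python
from collections import Counter

# Hypothetical incident log; field names and values are illustrative only.
incidents = [
    {"attack": "phishing", "breach": "credential theft", "region": "APAC"},
    {"attack": "ransomware", "breach": "data encryption", "region": "EMEA"},
    {"attack": "phishing", "breach": "credential theft", "region": "EMEA"},
    {"attack": "malware", "breach": "data exfiltration", "region": "APAC"},
    {"attack": "phishing", "breach": "data exfiltration", "region": "AMER"},
]


def summarize(records, field):
    """Count how often each value of `field` occurs in the records."""
    return Counter(record[field] for record in records)


attacks_by_type = summarize(incidents, "attack")
breaches_by_type = summarize(incidents, "breach")
incidents_by_region = summarize(incidents, "region")

# Print each summary, most frequent value first.
for label, counts in [
    ("attack", attacks_by_type),
    ("breach", breaches_by_type),
    ("region", incidents_by_region),
]:
    print(label, counts.most_common())
```

Counts like these describe what has already happened; they involve no model training, which is the distinction the next quiz draws.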
Descriptive Analytics - Examples
[Charts: total ransomware; total malware]
Reference: Symantec 2019 Report (www.Symantec.com reports)
Quiz 1
Is Data Visualization Machine Learning?
A. TRUE
B. FALSE
Predicting Future
Email Classification
• Categories such as malware, spyware and ransomware
[Figure: Machine Learning Model classifying email as SPAM or Not SPAM]
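One common way to build such a SPAM / Not SPAM model is a naive Bayes classifier over word counts. The sketch below is an illustration only: the talk does not say which algorithm sits behind this slide, and the training emails are made-up toy data.

```python
import math
from collections import Counter

# Hypothetical labeled training emails (toy data, for illustration only).
TRAIN = [
    ("win a free prize now", "SPAM"),
    ("claim your free money", "SPAM"),
    ("urgent verify your account now", "SPAM"),
    ("meeting agenda for monday", "Not SPAM"),
    ("please review the project report", "Not SPAM"),
    ("lunch on monday with the team", "Not SPAM"),
]


def train(examples):
    """Collect per-class word counts, class frequencies and the vocabulary."""
    word_counts = {}          # label -> Counter of words seen in that class
    class_counts = Counter()  # label -> number of training emails
    for text, label in examples:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior of the class
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("free prize money", model))           # SPAM
print(classify("project meeting on monday", model))  # Not SPAM
```

Because the model learns from emails that already carry a label, this is a supervised learning task.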
Predicting Fraud
• Determine a probability of fraudulent actions
- Patterns of suspicious transactions
- Suspicious users
- Suspicious locations/hackers
• Predict/classify different types of network attacks
- Spoofing, phishing, TCP policy violations, etc.
Network Behaviour
• Predicting network traffic behaviour
- Source (remote) IP address
- Open TCP port
- Packet content
- Packet size
- Or any of the hundreds of different attributes that network traffic can have
• Predict the next packet parameters
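As a simple stand-in for a learned traffic model, the attributes above can be profiled statistically and new packets compared against that profile. The sketch below applies a z-score check to packet size only. The packet records, the 3-standard-deviation threshold and the function names are assumptions for illustration, not the method used in the talk.

```python
from statistics import mean, stdev

# Hypothetical baseline traffic: (source_ip, tcp_port, packet_size_bytes).
baseline = [
    ("10.0.0.5", 443, 510), ("10.0.0.5", 443, 498), ("10.0.0.5", 443, 505),
    ("10.0.0.5", 443, 512), ("10.0.0.5", 443, 495), ("10.0.0.5", 443, 500),
]


def fit_size_profile(packets):
    """Estimate mean and standard deviation of packet size from baseline traffic."""
    sizes = [size for _, _, size in packets]
    return mean(sizes), stdev(sizes)


def is_anomalous(packet, profile, threshold=3.0):
    """Flag a packet whose size lies more than `threshold` std devs from the mean."""
    mu, sigma = profile
    _, _, size = packet
    return abs(size - mu) > threshold * sigma


profile = fit_size_profile(baseline)
print(is_anomalous(("10.0.0.5", 443, 503), profile))   # False
print(is_anomalous(("10.0.0.5", 443, 9000), profile))  # True
```

A production system would profile many attributes at once and usually per source or per port; a single global size profile is only the smallest useful case.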
End point security
• Predict known types of attacks
- SQLi, XSS, etc.
- DDoS attacks
• Find patterns of user activity
- On social media
- Servers/database/web access
- Authentications
• Detect anomalies in HTTP requests (auth failures, or attempts to bypass proxies or firewalls)
Machine Learning Models
• Classify types of attacks
- Exploits, Reconnaissance, DoS, Policy Violations
• Predict user behavior
- SIEM logs
Unsupervised Machine Learning (Clustering)
• Clustering of threat patterns on a network
• Clustering security risks/security incidents
• Clustering of user activity
• Cluster user groups
• Clustering web traffic data
• Clustering vulnerabilities/segments
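Clustering of user activity can be illustrated with plain k-means (Lloyd's algorithm). The sketch below is a toy example under stated assumptions: the two features (logins per day, MB transferred) and the data points are hypothetical, and the centroids are seeded naively with the first k points rather than with a robust initialization.

```python
import math

# Hypothetical user-activity features: (logins per day, MB transferred).
activity = [(2, 10), (3, 12), (2, 11), (40, 900), (42, 950), (38, 880)]


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm); the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


labels, centroids = kmeans(activity, k=2)
print(labels)     # [0, 0, 0, 1, 1, 1]
print(centroids)
```

No labels are supplied here: the two groups (light and heavy users) emerge from the data alone, which is what makes the method unsupervised. On real data the features would normally be scaled first so that one large-valued attribute does not dominate the distance.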
Quiz 2
Classifying email as SPAM or NOT-SPAM is an example of:
1. Supervised Machine Learning
2. Unsupervised Machine Learning
3. My company machine learning
4. Association of Machine Learning
PREDICTING PHISHING ATTACKS
Case Study
Exploring DATA
Dataset Reference: Canadian Institute for Cybersecurity; https://www.unb.ca/cic/datasets/url-2016.html
Data
• 80 different parameters collected
• 1000 data records
• Response variable (target): BENIGN or PHISHING
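Parameters of this kind are typically numeric features computed from the URL string itself. The sketch below derives a handful of lexical features with the standard library. The feature names are illustrative assumptions and are not the 80 columns of the referenced dataset, and the example URL is made up.

```python
from urllib.parse import urlparse


def lexical_features(url):
    """Compute a few simple lexical features of a URL (illustrative subset)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
    }


# Hypothetical URL used only to show the output shape.
features = lexical_features("http://login.example.com/verify/account123?id=42")
print(features)
```

Each URL becomes one row of such features, and the BENIGN or PHISHING label is the column the model learns to predict.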
Data Science/Machine Learning Framework
[Figure: Requirements, Data Science, Deploy]
Classifying PHISHING/BENIGN
• Applied Neural Network
Results
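A BENIGN/PHISHING classifier is commonly evaluated with a confusion matrix and the metrics derived from it. A minimal sketch follows; the labels are made up to show the calculation and are not the case study's results.

```python
def confusion_counts(actual, predicted, positive="PHISHING"):
    """Return (tp, fp, fn, tn), treating `positive` as the class of interest."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # phishing correctly flagged
            else:
                fp += 1  # benign URL wrongly flagged
        else:
            if a == positive:
                fn += 1  # phishing URL missed
            else:
                tn += 1  # benign URL correctly passed
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical labels, for illustration only.
actual = ["PHISHING", "PHISHING", "PHISHING", "BENIGN",
          "BENIGN", "BENIGN", "BENIGN", "PHISHING"]
predicted = ["PHISHING", "PHISHING", "BENIGN", "BENIGN",
             "BENIGN", "PHISHING", "BENIGN", "PHISHING"]

counts = confusion_counts(actual, predicted)
print(counts)            # (3, 1, 1, 3)
print(metrics(*counts))  # (0.75, 0.75, 0.75)
```

In phishing detection, recall (how many phishing URLs are caught) and precision (how many alerts are real) usually matter more than raw accuracy, since the two kinds of error carry different costs.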
CORRECTION
Prescriptive Analytics
• Automatically assign risk values for new vulnerabilities or misconfigurations
• Automatically close the inbox upon detection of a ransomware attack
• Identify specific threats and create controls to counter them
• Security patches
Summary
• Protecting data is critical to organizational success
• Cybercrime is increasing day by day
• Hackers are becoming smarter
• AI and Machine Learning are new technologies that help prevent fraud by predicting future cyber attacks
NU-Sigma U2 Analytics Labs
• AI and Machine Learning Solutions
• Enable organizations with AI and Machine Learning Technology
• We have implemented projects for retail industry, Telecom, Healthcare and HR organizations
• Conduct workshops: http://www.businessanalyticsr.com
• BrightTalk channel: https://www.brighttalk.com/channel/16781/umesh-hodeghatta
References
• Business Analytics Using R; Dr. Umesh Hodeghatta and Umesha Nayak, Springer Apress, USA, 2016.
• Infosec Handbook: Introduction to Information Security; Dr. Umesh Rao Hodeghatta and Umesha Nayak, Springer Apress, 2014.
• Almseidin, M., Alzubi, M., Kovacs, S., & Alkasassbeh, M. (2017, September). Evaluation of machine learning algorithms for intrusion detection system. In 2017 IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY) (pp. 000277-000282). IEEE.
• Zamani, M., & Movahedi, M. (2013). Machine learning techniques for intrusion detection. arXiv preprint arXiv:1312.2177.
• Juvonen, A., & Sipola, T. (2014). Anomaly Detection Framework Using Rule Extraction for Efficient Intrusion Detection. arXiv preprint arXiv:1410.7709.
• Sun, L., Versteeg, S., Boztas, S., & Rao, A. (2016). Detecting anomalous user behavior using an extended isolation forest algorithm: an enterprise case study. arXiv preprint arXiv:1609.06676.
• Mohammad Saiful Islam Mamun, Mohammad Ahmad Rathore, Arash Habibi Lashkari, Natalia Stakhanova and Ali A. Ghorbani, "Detecting Malicious URLs Using Lexical Analysis", Network and System Security, Springer International Publishing, pp. 467-482, 2016.
• Shah, S. A. R., & Issac, B. (2018). Performance comparison of intrusion detection systems and application of machine learning to Snort system. Future Generation Computer Systems, 80, 157-170.
• Radford, B. J., Richardson, B. D., & Davis, S. E. (2018). Sequence aggregation rules for anomaly detection in computer network traffic. arXiv preprint arXiv:1805.03735.
• Tuor, A., Kaplan, S., Hutchinson, B., Nichols, N., & Robinson, S. (2017, March). Deep learning for unsupervised insider threat detection in structured cybersecurity data streams. In Workshops at the Thirty-First AAAI Conference on Artificial Intelligence.
• Thi, N. N., & Le-Khac, N. A. (2017). One-class collective anomaly detection based on lstm-rnns. In Transactions on Large-Scale Data-and Knowledge-Centered Systems XXXVI (pp. 73-85). Springer, Berlin, Heidelberg.
• Radford, B. J., Apolonio, L. M., Trias, A. J., & Simpson, J. A. (2018). Network traffic anomaly detection using recurrent neural networks. arXiv preprint arXiv:1803.10769.
• Le, Q., Boydell, O., Mac Namee, B., & Scanlon, M. (2018). Deep learning at the shallow end: Malware classification for non-domain experts. Digital Investigation, 26, S118-S126.
• Glander, S. (2017). Autoencoders and anomaly detection with machine learning in fraud analytics. shiring.github.io/machine_learning/2017/05/01/fraud.
• Lotfollahi, M., Siavoshani, M. J., Zade, R. S. H., & Saberian, M. (2017). Deep packet: A novel approach for encrypted traffic classification using deep learning. Soft Computing, 1-14.
THANK YOU
WEB: WWW.BUSINESSANALYTICSR.COM
UMESH@BUSINESSANALYTICSR.COM
PH: +1 408 757 0093
