EFFICIENT HANDLING OF SUBJECT ACCESS
REQUESTS
TODAY’S WEBCAST
Presenters
Tom Gilsenan, Director, Informa
Johannes Scholtes, CSO, ZyLAB
Agenda
 Terminology
 Process Similarities: Public Disclosure &
eDiscovery
 Why Automation is Needed: Challenges of
Public Disclosure
 Automation Best Practices
 Customer Profiles
 Conclusions, more information and
recommendations
WHAT
ARE WE
TALKING
ABOUT
Data Protection Acts 1988 and 2003: allow members of the public to obtain public records
from government (funded) bodies.
“AN ACT TO GIVE EFFECT TO THE CONVENTION FOR THE PROTECTION OF
INDIVIDUALS WITH REGARD TO AUTOMATIC PROCESSING OF PERSONAL DATA
AND FOR THAT PURPOSE TO REGULATE IN ACCORDANCE WITH ITS PROVISIONS
THE COLLECTION, PROCESSING, KEEPING, USE AND DISCLOSURE OF CERTAIN
INFORMATION RELATING TO INDIVIDUALS THAT IS PROCESSED AUTOMATICALLY. ”
Source: https://www.dataprotection.ie/documents/legal/CompendiumAct.pdf
DIFFERENT REGULATIONS
PUBLIC DISCLOSURE AND SUBJECT ACCESS REQUESTS
 Speed and completeness of disclosure
 Satisfying both the Government responsibilities and the rights of
the Requester
 Identifying records that meet exception, confidentiality, and
personal information criteria
 Maintaining transparency—defending exceptions in the court of
public opinion
 Defending the disclosure in court
PROCEDURAL ISSUES
NUMBER OF REQUESTS IS GROWING
“It is now clear that
since 2014 there has
been an unprecedented
surge in the number of
AIE requests made to
Irish public authorities”
Source: http://www.ocei.gov.ie/en/publications/annual-reports/annualreport2016/chapter5.html
 Increasing Volume: Number, size & breadth of requests, e.g.,
all documents mentioning XYZ across all data sources
 Complex Data: Paper, architectural blueprints, un/structured,
audio, video, social media
 Distributed Data: By department, geography, on-premises &
cloud
 High Costs: Personnel manually search, process, review &
disclose; printing; IT infrastructure
 Short Timelines: Responses often required within 20 days
 New privacy and data protection regulations add further
complexity
CHALLENGES OF PUBLIC DISCLOSURE
 What is the definition of a draft? When do
these need to be disclosed?
 Dealing with Personally Identifiable Information
(PII) and Protected Health Information (PHI).
 Handling litigation documents.
 Different deadlines for responding.
 Different cost structures (pay per page or
pay per request).
 Different redaction rules and different ways
to identify redactions in the documents.
 Different disclosure formats and methods.
VARIATIONS IN EXCEPTIONS, EXEMPTIONS, DEADLINES
THEREFORE WE NEED
AUTOMATION
DIY SELF-SERVICE UPLOAD
DIRECTLY COLLECT DATA FROM ITS ORIGINAL LOCATION
Upload extra information of the investigation yourself (PST, disks, USB)
Full-Text index with the ZyLAB
IM Platform:
 File systems
 Legacy email collections (msg)
Collect only full-text query-based
information.
SEARCH BASED COLLECTIONS
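Search-based collection can be illustrated with a toy inverted index: index everything once, then collect only the items that match a full-text query. This is a minimal sketch and not the ZyLAB implementation; the paths, sample texts and the `build_index` and `collect` helpers are hypothetical.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index

def collect(index, terms):
    """Return ids of documents containing every query term (Boolean AND)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

# Hypothetical sources: a file share and a legacy .msg collection.
docs = {
    "share/hr/memo.txt": "Complaint from Jane Doe about parking permit",
    "share/it/log.txt": "Server reboot schedule",
    "mail/0001.msg": "Re: parking permit for Jane Doe approved",
}
index = build_index(docs)
hits = collect(index, ["jane", "permit"])  # only these items are collected
```

Only the two documents that mention both terms are collected; the unrelated server log stays where it is.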
WHAT KIND OF AUTOMATION ARE WE TALKING ABOUT?
Deep Processing
 Support for 700+ file formats
 Support for Compound & Compressed formats
 Embedded Object Extraction
 OCR of Non-searchable content
 Index Audio Content
Analysis
 Email Threading
 Deduplication & Near Duplicate Analysis
 Advanced Entity Extraction
 Automatic Classification (Pre-Tagging)
 AI-Based Topic Modeling
Review Acceleration
 Faceted Navigation & Dashboards
 Advanced Tagging Workflow
 Assisted Review with AI & Machine Translation
 Manual & Automatic Redaction
 Flexible Production Formats
AUTOMATE DEDUPLICATION
• By Custodian
• By Matter
• De-duplication can be done
by hash value which can be
keyed off of different
metadata fields
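Hash-based deduplication scoped by custodian or by matter can be sketched as follows. This is an illustration under assumptions: the document dictionaries, the field names and the `dedup_key` and `deduplicate` helpers are hypothetical, and SHA-256 stands in for whichever hash a given product uses.

```python
import hashlib

def dedup_key(doc, fields):
    """Hash the chosen metadata fields together with the content."""
    h = hashlib.sha256()
    for name in fields:
        h.update(str(doc.get(name, "")).encode("utf-8"))
        h.update(b"\x00")  # separator so field boundaries stay unambiguous
    h.update(doc["content"])
    return h.hexdigest()

def deduplicate(docs, fields=()):
    """Keep the first document seen for each key."""
    seen = {}
    for doc in docs:
        seen.setdefault(dedup_key(doc, fields), doc)
    return list(seen.values())

docs = [
    {"custodian": "alice", "matter": "M1", "content": b"Quarterly report"},
    {"custodian": "bob", "matter": "M1", "content": b"Quarterly report"},
    {"custodian": "alice", "matter": "M1", "content": b"Quarterly report"},
]
per_matter = deduplicate(docs, fields=("matter",))        # one copy for the whole matter
per_custodian = deduplicate(docs, fields=("custodian",))  # one copy per custodian
```

Keying on the matter collapses all three copies into one, while keying on the custodian keeps one copy each for alice and bob.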
NEAR DUPLICATE DETECTION
[Figure: two texts, T1 and T2, compared as near duplicates]
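One common way to score how close two texts T1 and T2 are is the Jaccard similarity of their word shingles. The sketch below uses that technique purely as an illustration; the slides do not say which method ZyLAB uses, and the shingle size and sample texts are assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of the shingle sets of two texts (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)

t1 = "the permit application was approved by the council on monday"
t2 = "the permit application was approved by the council on tuesday"
t3 = "server maintenance is scheduled for the weekend"

near = jaccard(t1, t2)  # 7 shared shingles out of 9 distinct ones
far = jaccard(t1, t3)   # no shared shingles
```

A reviewer would typically treat pairs above some chosen threshold as near duplicates and review them together; the right threshold depends on the collection.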
COLLECTION REPORT
AUTOMATE CATEGORIZATION
Use Auto-Classification & Analytics Scenarios to slice, dice & tag your data
AUTOMATE DATA VISUALIZATION
Predefine Facets for Exemption Codes, Departments, Custodians etc.
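Predefined facets amount to counting documents per value of a metadata field and letting the reviewer drill down by selecting values. A minimal sketch, with hypothetical field names and exemption labels:

```python
from collections import Counter

def facet_counts(docs, facets):
    """Count documents per value for each predefined facet."""
    counts = {f: Counter() for f in facets}
    for doc in docs:
        for f in facets:
            counts[f][doc.get(f, "(none)")] += 1
    return counts

def drill_down(docs, **selected):
    """Keep only documents matching every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in selected.items())]

docs = [
    {"id": 1, "department": "Planning", "custodian": "alice", "exemption": "personal data"},
    {"id": 2, "department": "Planning", "custodian": "bob", "exemption": "legal privilege"},
    {"id": 3, "department": "Finance", "custodian": "alice"},
]
counts = facet_counts(docs, ["department", "exemption"])
subset = drill_down(docs, department="Planning", custodian="bob")
```

The counts feed a dashboard, and each click on a facet value corresponds to one more keyword argument in `drill_down`.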
 Privileged
information:
automatically identify
communications with
our lawyers.
 PII, PHI, and GDPR:
redaction and
pseudonymization
DEAL WITH LEGAL ASPECTS
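The difference between redaction and pseudonymization can be shown in a few lines. This is a simplified sketch under assumptions: the two regular expressions catch only obvious e-mail addresses and phone numbers, the list of names is supplied by hand, and the key and sample text are hypothetical. Production tools rely on much richer entity extraction.

```python
import hashlib
import hmac
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")

def redact(text):
    """Redaction: remove the value entirely and leave a marker."""
    text = EMAIL.sub("[REDACTED EMAIL]", text)
    return PHONE.sub("[REDACTED PHONE]", text)

def pseudonymize(text, names, key):
    """Pseudonymization: replace each name with a stable keyed token."""
    for name in names:
        digest = hmac.new(key, name.lower().encode("utf-8"), hashlib.sha256).hexdigest()
        text = re.sub(re.escape(name), f"PERSON_{digest[:8]}", text, flags=re.IGNORECASE)
    return text

sample = "Jane Doe (jane.doe@example.org, +353 1 234 5678) wrote to Jane Doe's manager."
result = pseudonymize(redact(sample), ["Jane Doe"], key=b"demo-key")
```

Because the token is derived from a keyed hash, the same person gets the same token across documents, so the disclosed set stays readable without revealing the name.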
Structured and unstructured information (and all combinations)
AUTOMATE SEARCH
AUTOMATE TAGGING DECISIONS
Benefit from Reviewing in Full Context of Family Groups, Email Threads, then Bulk Tag
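Bulk tagging after reviewing in context means that one decision is applied to every member of the same email thread or family group. A minimal sketch, assuming each document carries a hypothetical `thread` identifier:

```python
def bulk_tag(docs, doc_id, tag):
    """Apply a tag to the reviewed document and to all documents in its thread."""
    by_id = {d["id"]: d for d in docs}
    group = by_id[doc_id]["thread"]
    tagged = []
    for d in docs:
        if d["thread"] == group:
            d.setdefault("tags", set()).add(tag)
            tagged.append(d["id"])
    return tagged

docs = [
    {"id": "m1", "thread": "t1"},  # original message
    {"id": "m2", "thread": "t1"},  # reply
    {"id": "a1", "thread": "t1"},  # attachment in the same family
    {"id": "m3", "thread": "t2"},  # unrelated message
]
tagged = bulk_tag(docs, "m2", "not relevant")
```

One review decision on the reply tags the whole thread, and the unrelated message is left for separate review.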
AUDIO SEARCH
 Boolean keyword queries are often defined to pick up a wide range
of potentially relevant documents and so avoid the risk of missing relevant
data. This picks up a lot of noise as well, and reviewing all these non-relevant
documents makes the review more expensive than necessary.
 Analysts with many years of experience who master all query options
can reach recall levels of 70-80%, but most investigators do not have
the knowledge to do so. As a result, they often find only part of the
answers.
 In both cases, the reviewer, analyst or investigator does not know exactly
how much they actually found and what is still missing.
By using machine learning we can tackle all the above problems.
BUT DOES EVERYBODY KNOW HOW TO SEARCH …
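The trade-off described above can be made concrete with precision (how much of what was retrieved is relevant) and recall (how much of what is relevant was retrieved). The numbers below are invented for illustration. In a real review the full set of relevant documents is not known in advance, so recall has to be estimated, for example from a reviewed random sample.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = set(range(100))        # 100 truly relevant documents (hypothetical)
retrieved = set(range(25, 425))   # a broad query returns 400 documents, 75 of them relevant
precision, recall = precision_recall(retrieved, relevant)
```

Here the broad query reaches 75% recall, but fewer than one in five retrieved documents is relevant, which is the review noise the slide refers to.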
DEMO: ZYLAB MACHINE LEARNING ON ENRON DATA SET
[Chart: review time in hours, ZyLAB Assisted Review vs. Manual Review; axis from 0 to 1600 hours]
MACHINE LEARNING: SMARTER, BETTER & FASTER
 15-20x faster than manual review
 10-20% more accurate, fully defensible
FLEXIBLE BUT POWERFUL PRODUCTIONS
AUTOMATE SHARING DISCLOSURES WITH THE PUBLIC
https://nepis.epa.gov
 Industry-leading technology made affordable
 Advanced features available to all clients
 Manageable whether you have a large legal and/or litigation-
support department or none
 Scalable to support companies and firms of any size
FOR CASES OF ANY SIZE
 Your environment is ready within a week of signing a contract
 We will work with you to migrate existing matters or set up new
matters.
 Upload data to the matter or ship to ZyLAB Intake.
 Review User Training provided.
 Support and Additional Services available as needed.
HOW TO GET STARTED WITH SAAS
 Using ZyLAB software for the handling of FOIA and
Public Records Requests leads to huge time and
resource savings, less probability for errors and less
risk to damage your citizens, employees or
organization by accidental disclosures. Organizations
using ZyLAB for FOIA have quoted this functionality
to be a Productivity Revolution. - IT Director, City
Government, U.S.
 “At the end of an administration, turning over all email
to NARA could take up to a year. With ZyLAB, the
NSC can now just turn over the server to NARA.
ZyLAB’s XML format is our standard archival format
for e-mails.” - Jason Baron, (Former) Director of
Litigation at the National Archives
 Integrate with open source and 3rd party case management tools.
 Collect and search directly across mailboxes, O365, file shares,
other electronic content repositories and even paper collections to identify
potentially relevant documents.
 Automatic classification of collected documents per department, document
type, custodian, withholding reasons, exemptions and many other relevant
document categories.
 Auto-redact Personally Identifiable Information (PII) and Protected Health
Information (PHI).
 Easy to use review and production functionality, including powerful review
accelerators and reporting functionality developed in the demanding world
of eDiscovery.
 Share a user-friendly, powerful search of past disclosures with the public
to limit the number of new requests.
BENEFITS OF USING ZYLAB FOR SUBJECT ACCESS
REQUESTS
ZYLAB BENEFIT TO IT
 ZyLAB’s direct collection and robust processing automate many of the
manual tasks that legal normally asks IT to perform on very short notice,
often at weekends. This saves IT tremendous effort, resources and
overtime.
 ZyLAB’s scalable and flexible architecture allows IT to scale up and scale
down eDiscovery resources as needed without the need to reinstall the
software or redeploy the data over multiple resources.
 ZyLAB’s ability to run on-premises, in a private cloud, in Azure or in hybrid
environments allows organizations to select the best architecture
for a specific eDiscovery project.
 ZyLAB’s ability to run in Azure allows direct collection from O365 resources
from a nearby computer facility (fast collection times) in the same
jurisdiction as the one O365 runs in. ZyLAB also fits Azure’s ability
to spin machines up and down as needed.
SIGN UP FOR OUR COMMUNITY
MORE READING – WWW.ZYLAB.COM/RESOURCES/EBOOKS/
Q&A
MORE INFORMATION: WWW.ZYLAB.COM
More ZyLAB Webinars and events:
https://zylab.com/company/event-calendar/
