From the course: Security Risks in AI and Machine Learning: Categorizing Attacks and Failure Modes
Dataset threat model
- [Instructor] If you've ever accidentally taught your autocorrect system that a typo is an actual word, and then had to spend months correcting that typo every time your system tried to insert it, you know how frustrating it can be when machines learn the wrong thing. If the data used to train an AI or provide it context isn't accurate, the outcomes won't be either. That's why it's important to vet datasets and implement dataset hygiene policies. If your usage involves training, whether training a new model, retraining an existing one, or fine-tuning a foundation model, thinking through data hygiene, integrity, and protection controls is extremely important, because biased data leads to biased AI. But bias isn't always intentional. Consider an automated faucet that is programmed to turn on when its computer vision system recognizes human hands in front of the faucet. If it is trained on large adult human hands of a particular skin color,…
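One simple integrity control along these lines is recording a cryptographic digest of each vetted dataset and re-checking it before every training or fine-tuning run, so silent tampering is caught early. The sketch below is a minimal illustration of that idea, not the course's own tooling; the file names and manifest structure are hypothetical.

```python
import hashlib

def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of raw dataset bytes."""
    return hashlib.sha256(data).hexdigest()

def find_tampered(datasets: dict, manifest: dict) -> list:
    """Compare each dataset's digest against a trusted manifest
    captured at vetting time; return names whose contents changed."""
    return [
        name
        for name, blob in datasets.items()
        if sha256_digest(blob) != manifest.get(name)
    ]

# Record trusted digests when the dataset is vetted...
# ("hands.csv" and its contents are illustrative placeholders)
trusted = {"hands.csv": sha256_digest(b"hand images, vetted")}

# ...then re-verify before each training run.
tampered = find_tampered({"hands.csv": b"hand images, modified"}, trusted)
print(tampered)  # a modified file shows up here
```

In practice the manifest itself must be protected (for example, signed or stored separately from the data), since an attacker who can alter both the dataset and its recorded digest defeats the check.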