This document introduces the notion of data as used in data mining. It defines data as a collection of objects and their attributes, where each attribute is a property of an object and takes a value. Attributes are classified as nominal, ordinal, interval, or ratio, depending on which properties (distinctness, order, meaningful differences, meaningful ratios) their values have. Data sets may take the form of records, documents, transactions, or graphs. The document also covers data quality problems such as noise, outliers, missing values, and duplicate data, and closes with preprocessing techniques such as aggregation, sampling, dimensionality reduction, and discretization.
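To make the vocabulary above concrete, here is a minimal Python sketch on a made-up record data set. The records, the attribute names, the `ATTRIBUTE_TYPES` mapping, and all helper functions are hypothetical illustrations and not taken from the document. The sketch uses mean imputation and equal-width binning as simple stand-ins for missing-value handling and discretization, and it omits dimensionality reduction, which needs more machinery than fits here.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical toy record data: each dict is an object, each key an attribute.
records = [
    {"id": 1, "city": "Oslo", "size": "small", "temp_c": 12.0, "income": 30000.0},
    {"id": 2, "city": "Oslo", "size": "large", "temp_c": 15.0, "income": 52000.0},
    {"id": 3, "city": "Bergen", "size": "medium", "temp_c": None, "income": 41000.0},
    {"id": 3, "city": "Bergen", "size": "medium", "temp_c": None, "income": 41000.0},
    {"id": 4, "city": "Bergen", "size": "small", "temp_c": 9.0, "income": 28000.0},
]

# Assumed attribute types for this toy schema.
ATTRIBUTE_TYPES = {
    "id": "nominal",       # only equality is meaningful
    "city": "nominal",
    "size": "ordinal",     # small < medium < large
    "temp_c": "interval",  # differences are meaningful, ratios are not
    "income": "ratio",     # both differences and ratios are meaningful
}


def drop_duplicates(rows):
    """Data quality: keep the first copy of each fully identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row[k] for k in sorted(row))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing(rows, attr):
    """Data quality: replace missing values (None) with the mean of known ones."""
    fill = mean(r[attr] for r in rows if r[attr] is not None)
    return [{**r, attr: fill if r[attr] is None else r[attr]} for r in rows]


def aggregate_mean(rows, group_attr, value_attr):
    """Aggregation: collapse objects into one mean value per group."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_attr]].append(r[value_attr])
    return {g: mean(vals) for g, vals in groups.items()}


def equal_width_bins(values, n_bins):
    """Discretization: map each value to one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


clean = fill_missing(drop_duplicates(records), "temp_c")
by_city = aggregate_mean(clean, "city", "income")
bins = equal_width_bins([r["income"] for r in clean], n_bins=3)

# Sampling: simple random sampling without replacement, seeded for repeatability.
sample = random.Random(0).sample(clean, k=2)

print(by_city)  # {'Oslo': 41000.0, 'Bergen': 34500.0}
print(bins)     # [0, 2, 1, 0]
```

In this sketch the attribute type determines which operations make sense: taking a mean is reasonable for `temp_c` and `income`, but it would be meaningless for the nominal `city` attribute, which is why `city` is used only as a grouping key.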