Raw and Tidy Data - Yousef's Notes

#Raw Data

Data in the format in which they were collected, e.g., Word documents, PDFs, data series from an instrument, transcriptions, interviews, etc.

#Tidy Data

Data converted into a format suitable for [[Machine Learning]], i.e., a set of feature vectors. E.g., a spreadsheet (or a matrix or tensor) where each row represents an example and each column represents a feature of the examples.
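
A minimal sketch of the raw-to-tidy step, using only the Python standard library. The `key=value` log format, the feature names `temp` and `humidity`, and the helper `to_feature_vector` are all made up for illustration; real raw data will usually need more involved parsing and cleaning.

```python
# Hypothetical raw data: free-form lines, as they might come from an instrument log.
# Note that the fields do not always appear in the same order.
raw_lines = [
    "sample=a temp=21.5 humidity=40",
    "sample=b humidity=35 temp=19.0",
    "sample=c temp=23.1 humidity=52",
]

# Fixed column order: every feature vector lists the same features
# in the same positions.
feature_names = ["temp", "humidity"]


def to_feature_vector(line, feature_names):
    """Parse one raw 'key=value' line into a list of floats, one per feature."""
    fields = dict(pair.split("=", 1) for pair in line.split())
    return [float(fields[name]) for name in feature_names]


# Tidy data: one row per example, one column per feature.
tidy = [to_feature_vector(line, feature_names) for line in raw_lines]

for row in tidy:
    print(row)
```

Each inner list is one example (a row), and position `i` in every row holds the feature `feature_names[i]` (a column), which is the layout that matrix- or tensor-based tools expect.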