Some basic assumptions
By raw data we mean the original data that have been collected from a source and not yet processed or analysed. Raw data provide the foundation for any downstream analyses. In many cases the captured or collected data are unique and impossible to reproduce, such as weather measurements taken at particular time points, or interviews. For this reason, they should be safeguarded from any possible loss. Moreover, raw data will typically be stored in a lossless format, i.e. one that preserves all of the captured information, such as TIFF for image data, as opposed to a lossy compressed format such as JPEG. Finally, in some cases, raw data may carry additional information that is specific to the brand and/or type of instrument used to capture the data. For example, Leica microscopes use a proprietary data format that also acts as a container for lossless data: the container holds metadata specific to Leica microscopes that allows reading, writing and analysis through Leica software. See also our guide on file formats.
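Lossless does not have to mean uncompressed: what matters is that the original bytes can be recovered exactly. The following is a minimal sketch of that distinction, using Python's standard zlib module as a stand-in for a lossless format. The function names and the bit-dropping "lossy" step are hypothetical, chosen purely for illustration; they are not how TIFF or JPEG work internally.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the result equals the input."""
    return zlib.decompress(zlib.compress(data))


def lossy_round_trip(data: bytes) -> bytes:
    """Illustrative lossy step: discard the four low bits of every byte."""
    return bytes(b & 0xF0 for b in data)


# Hypothetical stand-in for raw instrument data.
sample = bytes(range(256))

print(lossless_round_trip(sample) == sample)  # True: nothing was lost
print(lossy_round_trip(sample) == sample)     # False: detail is gone for good
```

Once data have been through a lossy step, the discarded detail cannot be restored, which is why the raw data should be kept in a lossless form.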
By processed data we mean data that have already undergone some kind of intervention. For instance, the data may have been digitised, compressed, translated, transcribed, cleaned, validated, checked and/or anonymised.
By analysed data we mean processed data that have been analysed and interpreted. Analysed data can take several forms (text, tables, graphs, etc.) that make the data easier to understand and communicate.
In most cases, one can also consider the raw data as the official data, that is, the master copy of any given record (see also golden copy). As well as providing the starting point for derivatives generated downstream, the raw data may be reused in other, separate branches of analysis. Therefore, in a typical workflow, we recommend that you create a copy of the raw data to use as a "working copy". The original data should then be archived in an appropriate manner for long-term preservation. The working copy can then be processed and analysed without any risk of overwriting the original.
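The working-copy step could be sketched as follows. This is a minimal example under stated assumptions, not a prescribed procedure: the function names (sha256_of, make_working_copy) and the directory layout are hypothetical, and write-protecting the original is only a first safeguard, not a substitute for proper archiving.

```python
import hashlib
import shutil
import stat
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(raw_file: Path, working_dir: Path) -> Path:
    """Copy raw_file into working_dir, verify it, and write-protect the original."""
    working_dir.mkdir(parents=True, exist_ok=True)
    working_file = working_dir / raw_file.name

    # copyfile copies the contents only, so the working copy stays writable
    # even if the original has already been made read-only.
    shutil.copyfile(raw_file, working_file)

    # Check that the copy is byte-for-byte identical to the original.
    if sha256_of(raw_file) != sha256_of(working_file):
        raise OSError(f"Copy of {raw_file} does not match the original")

    # Make the original read-only to guard against accidental overwriting.
    raw_file.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return working_file
```

A call such as make_working_copy(Path("raw/measurements.csv"), Path("working")) (both paths hypothetical) returns the path of the copy to process. Recording the checksum alongside the archived original also makes it possible to confirm later that the master copy has not changed.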
For more information on data formats please see this guide.