Data integrity

When collecting your data, ensure that there is a sound process in place to verify and validate it. You should consider the following.

  • Relevance of the data - the data should actually measure the characteristic it claims to measure. This is particularly important if you're using a data set that has been created for another purpose. If using secondary data, you need to rigorously review your data specifications to ensure that the data is appropriate for your purpose.
  • Data availability and collection - ideally, data should be collected in a form that is tailored to your analysis requirements. This won't always be possible, particularly where secondary data is used. If using secondary data, you need to explore the underlying data properties to understand any limitations they may impose on your analysis.
  • Consistency of data - you must look for data that is defined and collected in the same way over time. This allows for repeated observations, so you can check whether the effects of your strategies are sustained. If this isn't possible, look for ways to mitigate the limitations of the data in the longer term.
  • Timing of data - where data is sourced from external databases or reflects external market trends, you must make sure there are no issues with differing time frames when making comparisons. If there are any differences, you must account for the time lag in your analysis.
  • Expertise of data collectors - data quality can be impacted by the skills and expertise of the people involved in gathering the information. You may need to seek the advice of experts to ensure your data-collection techniques are appropriate.
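
As an illustration of the timing point above, the following is a minimal sketch of one way to account for a time lag before comparing two data sets. It assumes both data sets are monthly and that each internal period should be compared with the external period a fixed number of months earlier. The function names (shift_months, align_with_lag), the one-month lag and all figures are hypothetical and for illustration only; your own data may need a different approach.

```python
from datetime import date


def shift_months(d, months):
    """Return the first day of the month that is `months` after d.

    A negative value for `months` moves earlier in time.
    """
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def align_with_lag(internal, external, lag_months):
    """Pair each internal period with the external period it reflects.

    internal, external: dicts mapping a first-of-month date to a value.
    lag_months: assumed number of months by which the external data
        lags the internal data (hypothetical; confirm for your source).
    Returns (pairs, unmatched), where unmatched lists the internal
    periods that have no corresponding external observation.
    """
    pairs = {}
    unmatched = []
    for period in sorted(internal):
        source_period = shift_months(period, -lag_months)
        if source_period in external:
            pairs[period] = (internal[period], external[source_period])
        else:
            unmatched.append(period)
    return pairs, unmatched


# Illustrative figures only, not real data.
internal = {
    date(2014, 1, 1): 100,
    date(2014, 2, 1): 110,
    date(2014, 3, 1): 120,
}
external = {
    date(2013, 12, 1): 5.0,
    date(2014, 1, 1): 5.5,
}

pairs, unmatched = align_with_lag(internal, external, lag_months=1)
for period, (own_value, external_value) in pairs.items():
    print(period.isoformat(), own_value, external_value)
print("No external match for:", [p.isoformat() for p in unmatched])
```

Periods reported as unmatched are kept separate rather than silently dropped, so you can decide whether to wait for the external data, exclude those periods, or note the gap as a limitation of your analysis.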
    Last modified: 13 Jan 2015 QC 25789