  • How we measure tax gaps

    Our estimates aim to quantify the level of non-compliance across the four pillars of compliance – registration, lodgment, reporting and payment obligations.

    Where possible, we also estimate the amount of revenue not collected from those who fail to register or lodge. Penalties and interest are not included in gap estimates.

    We have two measures of the tax gap – the gross gap and the net gap.

    The gross gap is the difference between:

    • the amount voluntarily reported to the ATO
    • the amount we would have collected if every taxpayer was fully compliant with tax law (that is, the theoretical tax liability).

    The net gap is the difference between:

    • the amount voluntarily reported to the ATO plus amendments as a result of compliance activities and voluntary disclosures
    • the amount we would have collected if every taxpayer was fully compliant with tax law.
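
    The two definitions above can be sketched as a short calculation. This is a minimal illustration only: the function name, parameter names and dollar figures are assumptions made for the example, not ATO code or published estimates.

```python
def tax_gaps(theoretical_liability, voluntarily_reported, amendments):
    """Return (gross_gap, net_gap) from the three amounts defined above.

    amendments: amounts added through compliance activities and
    voluntary disclosures (hypothetical parameter name).
    """
    gross_gap = theoretical_liability - voluntarily_reported
    net_gap = theoretical_liability - (voluntarily_reported + amendments)
    return gross_gap, net_gap


# Illustrative figures only (for example, in $ billion).
gross, net = tax_gaps(theoretical_liability=100.0,
                      voluntarily_reported=90.0,
                      amendments=4.0)
print(gross, net)  # 10.0 6.0
```

    The net gap is always the gross gap less the amendments, so it can never exceed the gross gap when amendments are non-negative.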

    Tax gap concepts

    We estimate gaps for the year the economic activity occurred. They are historical and based on the law and the administrative approaches at the time they are calculated.

    Figure 1 shows the components of the tax gap, including the net gap, the gross gap, the amount reported and theoretical tax liability.

    Figure 1: Components of tax gap

    Figure 1: This diagram shows the tax gap concepts. We look at the amount voluntarily reported, amendments due to compliance activities and voluntary disclosure, and the amount not paid, against the theoretical tax liability. The amount not paid is the net gap. The amount not paid plus the amendments is the gross gap.

    Tax gap methodology

    Our tax gap estimates are derived from a wide range of sources, including publicly available information and ATO administrative data. Broadly, we apply either a top-down or bottom-up approach to estimating each gap (see Figure 2).

    • Top-down approaches use externally provided aggregated data sources to estimate the size of the tax base, from which we estimate the theoretical tax liability. The difference between the theoretical tax liability and the amount we receive is the estimated tax gap. A top-down approach is typically used for indirect taxes.
    • Bottom-up approaches involve a detailed examination of data sources, such as tax returns, audit results (including random enquiry programs), risk registers or third-party data-matching information. We then extrapolate the results to determine the extent of non-compliance across the whole population, from which we estimate the tax gap. A bottom-up approach is typically used for direct taxes. There are three types (as described further below and shown in Figure 3).

    Figure 2: Our two approaches to estimate tax gaps

    Figure 2: This diagram shows our two approaches to gap estimation: The first approach is suited to indirect and volumetric taxes. It uses external aggregate data (for example, National Accounts data, industry data) and a top-down approach to calculate the theoretical tax liability. The second approach is suited to direct taxes. It uses internal administrative data (for example, random enquiry, compliance and illustrative data) and a bottom-up approach to calculate the gap sub-components, which are then extrapolated to the whole population.

    Choosing the methodology

    We choose the methodology that provides the most reliable estimate for each gap we measure. To do so, we carefully consider the characteristics of each gap, including:

    • the design of the tax or program
    • the characteristics of the population
    • availability and quality of data.

    Assessing these factors helps us decide which methodology is the most appropriate to use. For example, in order to use a top-down method we generally require external data. If we don't have a reliable external data source available, we know we'll need to use a bottom-up method to generate a reliable result.

    We assess our methodologies for reliability, and where possible test them against alternatives to ensure that we are using the most appropriate methodology. We also consult with our independent expert panel on the options available to us, and look to other jurisdictions to see what methodologies they use for similar gaps.

    We continually work to update and improve our gap estimates. Part of this involves assessing the methodology used, to ensure it's still the most appropriate option. This means we can remain confident that our gap estimates are reliable and credible.

    Gap approaches in detail

    This section explains in more detail the top-down and bottom-up approaches we use to estimate tax gaps (summarised in Figure 3).

    Figure 3: Methodological approach for each gap estimate

    Figure 3: This diagram gives an overview of the four main methodological approaches we use to estimate gaps, with each of the published gaps listed under one of them. The gaps listed under the top-down approach are: fuel excise, PAYG withholding, goods and services tax and superannuation guarantee. The gaps listed under the bottom-up random enquiry program approach are: individuals not in business, small business, fuel tax credits and small super funds. The gaps listed under the bottom-up model-based approach are: large corporate groups, large super funds, petroleum resource rent tax, tobacco, fuel tax credits and small super funds. Fuel tax credits and small super funds appear under both because they use a hybrid approach. The gaps listed under the final method, the bottom-up statistical approach, are high wealth and wine equalisation tax.

    Top-down approaches

    A top-down approach essentially looks at a system and breaks it down to understand each of its constituent parts, and how these work individually. Top-down approaches use external information about the system we are constructing an estimate for. This approach does not always provide information on why an outcome occurs, merely that an outcome has occurred.

    An example of this is the GST gap, which uses information collected through the Australian National Accounts data set. This data is collated by the Australian Bureau of Statistics (ABS) and, therefore, sits outside data collected by us – for example, audit data.
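
    As a rough sketch, the top-down calculation reduces to estimating a theoretical liability from an external tax base and subtracting the amount received. The function name, the single flat rate and the figures below are assumptions for illustration; an actual estimate would need many more adjustments to the base (for example, for exempt items) than this shows.

```python
def top_down_gap(tax_base, tax_rate, amount_received):
    """Return (theoretical_liability, gap) for a simple top-down estimate.

    tax_base: taxable activity taken from external aggregate data
              (hypothetical input).
    tax_rate: a single flat rate, assumed here for simplicity.
    amount_received: tax actually collected.
    """
    theoretical_liability = tax_base * tax_rate
    gap = theoretical_liability - amount_received
    return theoretical_liability, gap


# Illustrative figures only (for example, in $ million).
liability, gap = top_down_gap(tax_base=700_000,
                              tax_rate=0.10,
                              amount_received=65_000)
print(round(liability), round(gap))
```

    Note that the result says how large the gap is, not which taxpayers or behaviours produced it, which matches the limitation described above.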

    Bottom-up approaches

    There are three broad types of bottom-up approaches:

    Random enquiry programs

    A random enquiry program (REP) is a process for selecting tax returns for evaluation. As the name suggests, the tax returns are randomly selected – which ensures that all have the same likelihood of being chosen.

    This is unlike operational audit selection processes, which focus on taxpayers considered to have a higher risk of non-compliance with a potentially large amount of tax at risk. Operational audit data is biased towards this 'high risk, high consequence' segment of taxpayers. In contrast, random selection avoids any systematic selection of segments of the population. It is designed to provide an unbiased representation of taxpayer information.
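
    The idea can be sketched in two steps: draw returns with equal probability, then scale the average shortfall found in the sample up to the whole population. The function names and the simple mean-based scaling are assumptions for illustration; a real program may stratify or weight the sample rather than use a plain average.

```python
import random


def draw_random_sample(return_ids, sample_size, seed=None):
    """Select returns so that every return is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(return_ids, sample_size)


def estimate_gap_from_sample(population_size, sample_shortfalls):
    """Scale the mean shortfall per sampled return up to the population.

    sample_shortfalls: for each sampled return, the tax that should have
    been reported minus the tax that was reported (hypothetical input).
    """
    mean_shortfall = sum(sample_shortfalls) / len(sample_shortfalls)
    return mean_shortfall * population_size


# Illustrative use: 10 returns drawn from a population of 100.
selected = draw_random_sample(list(range(100)), sample_size=10, seed=1)

# Hypothetical review results for four sampled returns, in dollars.
estimate = estimate_gap_from_sample(population_size=1000,
                                    sample_shortfalls=[0, 0, 50, 150])
print(estimate)  # 50000.0
```

    Because selection does not depend on risk, the sample average is not skewed towards the 'high risk, high consequence' segment in the way operational audit results are.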

    Statistical-based approaches

    Statistical-based approaches are another bottom-up approach. They use a set of mathematical models to estimate an outcome where it would be impractical to obtain a data set that covers 100% of the population that is being estimated.

    Below is an overview and explanation of the types of statistical-based approaches used within the tax gap program to estimate various tax gaps.

    Regression analysis

    Regression analysis is a standard statistical technique for estimating the relationships between one outcome variable and a series of explanatory variables. The regression can be used to identify the probability or the magnitude of the tax gap using all available taxpayer records and compliance results.

    To produce reliable and credible results when using regression analysis, we need to correct for selection bias. Taxpayers that have undergone ATO compliance activity were selected through a number of risk assessment processes. Performing statistical analysis on these non-randomly selected samples, without accounting for selection bias, can lead to inaccurate conclusions. We consider two methods to account for selection bias:

    • propensity score matching
    • Heckman’s correction.

    Regression analysis is useful in identifying characteristics that help predict whether or not a taxpayer is non-compliant, as well as characteristics that help predict the degree of non-compliance. Based on these characteristics, the size of the tax gap can be estimated for the taxpayers that are modelled to be non-compliant.
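
    To give a feel for how matching can reduce selection bias, the sketch below applies nearest-neighbour matching on a propensity score. It assumes the scores (the modelled likelihood of being selected for compliance activity) have already been estimated elsewhere, and it simply borrows the compliance result of the closest audited taxpayer. The function name, data layout and figures are hypothetical, and this is not a description of the ATO's actual model.

```python
def impute_by_propensity_match(audited, unaudited_scores):
    """Impute an adjustment for each unaudited taxpayer.

    audited: list of (propensity_score, observed_adjustment) pairs for
             taxpayers that underwent compliance activity.
    unaudited_scores: propensity scores for taxpayers with no result.
    Each unaudited taxpayer takes the adjustment of the audited taxpayer
    whose propensity score is closest (nearest-neighbour match).
    """
    imputed = []
    for score in unaudited_scores:
        nearest = min(audited, key=lambda record: abs(record[0] - score))
        imputed.append(nearest[1])
    return imputed


# Hypothetical data: (propensity score, adjustment in dollars).
audited = [(0.9, 5000.0), (0.5, 1200.0), (0.2, 300.0)]
unaudited_scores = [0.25, 0.45, 0.85]

print(impute_by_propensity_match(audited, unaudited_scores))
# [300.0, 1200.0, 5000.0]
```

    Matching on the score means a low-risk unaudited taxpayer is compared with a low-risk audited one, rather than with the average audited taxpayer, who was selected precisely for being higher risk.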

    Extreme value theory

    Extreme value theory is appropriate when the data is characterised by extreme outlier observations – for example, when the data follows the 80/20 rule. That is, a small share of the data points (20%) makes up most of the total value (80%).

    This is typically seen in data related to amendments to income tax returns, both positive and negative – from either taxpayer adjustments or as a result of our compliance activities. The 'extreme values' are identified at the point at which the positive amendments are cancelled out by the negative amendments. Figure 4 provides an example.

    Figure 4: Tax amendments ranked from highest to lowest

    Figure 4: This image is a graph that pictorially demonstrates how an extreme value theory model works. It provides a visual aid to the information contained in the text in this section.

    The values above the point that the values cancel each other out are the extreme values. The relationship between the size of the extreme values and their rank is estimated and applied to the population to inform the final estimate.
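
    One way to read the cut-off rule is sketched below: rank the amendments from highest to lowest, then treat as 'extreme' every value above the point where the remaining lower-ranked amendments sum to zero or less. The log-log least-squares fit of size against rank is an assumed functional form chosen for illustration; the text does not specify which relationship is fitted. All names and figures are hypothetical.

```python
import math


def split_extreme_values(amendments):
    """Return (extreme, remainder) with amendments ranked high to low.

    The cut is the first rank at which the amendments from that rank
    downwards net to zero or less.
    """
    ranked = sorted(amendments, reverse=True)
    remaining = sum(ranked)
    for k, value in enumerate(ranked):
        if remaining <= 0:
            return ranked[:k], ranked[k:]
        remaining -= value
    return ranked, []


def fit_rank_size(extreme):
    """Least-squares fit of log(size) = intercept + slope * log(rank).

    Needs at least two extreme values; they are positive by construction.
    """
    xs = [math.log(rank) for rank in range(1, len(extreme) + 1)]
    ys = [math.log(value) for value in extreme]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical amendments in dollars, positive and negative.
amendments = [400, -30, 1200, 90, -60, 600, 10, 300, 40, -50]
extreme, remainder = split_extreme_values(amendments)
print(extreme)         # [1200, 600, 400, 300]
print(sum(remainder))  # 0
slope, intercept = fit_rank_size(extreme)
```

    In this example the four largest amendments are the extreme values, and the six smaller ones cancel each other out exactly.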

    Model-based approaches

    Where a random enquiry is not suitable, and available data does not match the assumptions required for a statistical approach, we use model-based approaches.

    These approaches identify the key themes, factors or channels that contribute to the gap, which are then used to inform the final estimate. Like all our estimates, they draw on all available data including expert judgment, management information and system data to inform the final estimate.

    These approaches can also be individually referred to as:

    • micro-analytical simulation
    • illustrative
    • channel analysis.

    What they have in common is a three-step process: disaggregation, analysis of known information, then aggregation to a final estimate.
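
    The disaggregate, analyse and aggregate pattern can be sketched as follows. The channel names, the 'amount at risk multiplied by an estimated non-compliance rate' formula and all figures are assumptions made for illustration; in practice the analysis of each channel draws on expert judgment and other information that a single rate does not capture.

```python
def aggregate_channels(channels):
    """Estimate a gap per channel, then sum to a total.

    channels: mapping of channel name to a dict with
      'amount_at_risk' (tax associated with the channel) and
      'noncompliance_rate' (share assessed as not compliant).
    Both keys are hypothetical names for this sketch.
    """
    per_channel = {
        name: detail["amount_at_risk"] * detail["noncompliance_rate"]
        for name, detail in channels.items()
    }
    return per_channel, sum(per_channel.values())


# Hypothetical channels and figures (for example, in $ million).
channels = {
    "channel_a": {"amount_at_risk": 2000.0, "noncompliance_rate": 0.05},
    "channel_b": {"amount_at_risk": 500.0, "noncompliance_rate": 0.20},
    "channel_c": {"amount_at_risk": 1000.0, "noncompliance_rate": 0.01},
}

per_channel, total = aggregate_channels(channels)
print(round(total, 2))
```

    Keeping the per-channel results alongside the total makes it possible to compare each component against other known information before settling on a final estimate.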

    Figure 5: Model disaggregation, analysis and re-aggregation

    Figure 5: This image is a visual representation of the information provided in this section. It shows the three main components of a model-based approach: disaggregating the model, analysing known information, and finally aggregating and comparing the results to obtain a final estimate.

      Last modified: 12 Mar 2020 (QC 53168)