Beyond tax gap – how a better understanding of tax performance changes tax administration

Last updated 29 November 2021

Jeremy Hirschhorn, Second Commissioner, Client Engagement
Speech delivered to the 14th International ATAX Tax Administration Conference
24 November 2021

Introduction

Good afternoon and thank you for the opportunity to speak to you today on an area of tax administration that I am passionate about, what we at the Australian Taxation Office (ATO) are calling ‘beyond tax gap thinking’.

‘Beyond gap’ thinking recognises that the tax gap program is much more than an informational resource for the community: it can be utilised by revenue authorities to drive strategic and operational thinking and prioritisation.

It is the natural next step in using the insights from our tax gap program to make it easier for taxpayers to meet their obligations and to inform future compliance approaches based on our experience of particular market segments, which the Commissioner touched on yesterday in his address.

Today, I will start off talking a bit about why understanding system-health is important and then go on to explore 5 themes:

Measuring our impact, not just our effort
Journey of tax gap
Moving to ‘beyond gap thinking’
Shifting from ‘risk’ to ‘tolerance’
Designing the system around verifiable data.

Hopefully I can also leave you with a few reflections of your own, when considering ‘beyond gap thinking’ in the context of tax systems worldwide.

Why understanding system health is important

Traditionally, revenue authorities have tended to measure themselves in terms of tax performance on the number of audits undertaken and the amount of additional revenue raised. And while this might tell a revenue authority that they are effective tax auditors, if we do not assess our actions and outcomes in the context of the overall system in operation, it cannot tell us how effective we are at managing the overall tax and super system.

Our traditional performance metrics (such as audits and liabilities raised) are important, but they only focus on a part of what we do. On their own, they don’t tell us if they were the right things to do, whether doing them has had the longer-term impact we planned, nor what was driving the non-compliance in the first place.

To draw an analogy with the World Anti-Doping Agency (WADA) at the Olympics – certainly the number of athletes caught doping is a proxy for how hard WADA is working in the moment. However, it is not necessarily a good proxy for what competitors and spectators are hoping for – a clean Olympics (this time and into the future). In fact, a high ‘hit rate’ for WADA might be a signal that doping is increasing or rife (and counterproductive, in that it convinces other athletes that they also need to dope in order to be competitive). By contrast, a low ‘hit rate’ might indicate that the Olympics are pretty much clean – or that WADA is behind the curve of the new generation of drugs.

What is actually important, is not the number caught, but the number missed. And of course, this is very hard to measure with precision. It also requires both an organisation and its stakeholders to move beyond simplistic ‘effort’ metrics (how many things did you do) to more subtle client-centric metrics (what impact you have), which might take time to unfold.

Another way of putting this is: imagine a world where every athlete was clean (or everyone simply just paid their correct amount of tax each year) – the ultimate goal of WADA (revenue authorities). Traditional metrics based around hit rates (audit liabilities) would ironically show that the organisation is failing, just at the moment it is achieving its ultimate success!

Our tax gap program provides us with a broader context and insights in which we can consider our activities and the effect we can have on the system as a whole. This is because it lifts our perspective above the direct influence of individual compliance activities and helps us paint a more comprehensive picture. By trying to quantify the non-compliance behaviours across the whole tax and superannuation system and understand the drivers of the non-compliance, rather than just focusing on the entities the ATO treats, we can more effectively design strategies aimed at closing the gap and improving overall system performance.

But having that information is only half of the equation; the next step is deciding what we do with it. How do we use our increased knowledge to shape the system for the better? And, how do we influence and make it easier for clients in each market to meet their obligations?

Most taxpayers in the Australian system are positively inclined to do the right thing (although they may not be exuberant or enthusiastic about their tax obligations), so the best way to manage or reduce the gaps is to understand what drives client behaviour and make changes that make it as easy as possible to comply. Some of the most effective strategies from a system performance perspective may not relate to our compliance activities at all and instead may come from system, technology or automation changes.

The other challenge is that tax gap insights can help drive macro decisions as to allocation of resources and strategies, but they are too high level and take too long to calculate to measure effectiveness of actions as they are taking place. This means that a range of supporting metrics are required.

Client behaviours are also influenced by the pipeline – that is, their end-to-end experience with us that incorporates all those individual interactions. So, in order to effectively manage the gaps and to have an impact on overall system performance, we need to be creating solutions at a whole-of-ATO level that consider the individual client’s experience in a holistic way. These 'pipeline' metrics ideally ensure that teams are not just thinking about their own 'work-bench' or their own efficiency in isolation of the overall client experience. This is an area deserving its own paper, perhaps for the next conference!

This improved understanding of system health is also critical in the ATO providing advice to Treasury and the government as to potential policy changes or funding of compliance programs – or in public service language, ‘New Policy Proposals’ (or NPPs).

By being able to communicate the size and drivers of particular areas of non-compliance, it can assist in determining the priority of different policy initiatives, as well as informing the potential policy solution. If the potential policy solution is, in whole or part, increased ATO compliance activity, then it also allows a better understanding of the potential ‘administrative dividend’ (the net impact on the Federal budget) in a much more sophisticated way than the traditional ‘pay-off ratio’ approach.

Measuring our impact, not just our effort

‘Measure what’s important, don’t make important what you can measure.’ – Robert McNamara

‘When you can measure what you are speaking about … you know something about it; but when you cannot measure it… your knowledge is of a meagre and unsatisfactory kind …’ – Lord Kelvin

‘What gets measured gets done.’ – management truism

A good audit will resolve a tax issue and might result in a tax liability being raised. A better audit will see that tax liability collected. The best audit will see the taxpayer’s behaviour changed and will result in sustained compliance over time, as well as hopefully indirectly influencing other taxpayers.

As a large government organisation, we measure and report on many things for many purposes, including many specific measures that we are mandated to publish in the Commissioner of Taxation annual report. The focus of this paper is our work on expanding these traditional metrics so that we measure the overall impact we have and use that to drive future improvement.

The measurement of our true overall impact is critical for 2 major reasons.

It provides good insight to other stakeholders as to the health of the Australian taxation system and the effectiveness of the ATO.

By measuring what is important, the organisation will tend to shape itself to focus on what matters most.

We have always been very strong at measuring our efforts and the direct revenue raised. We measure and report on the number of activities, the amount of liabilities raised and any collected audit yield. These are good output-based measures that you might see replicated across most tax and revenue agencies. This is because they can be accurately measured but also because they are easy to measure (and are superficially a good proxy for the impact of a revenue authority on system health).

The challenge we have set ourselves is how we can improve our performance framework by focusing on:

outcome-based metrics that consider overall system performance, as well as
a complementary suite of metrics that are framed around the client – because how a client thinks and feels has an impact on the choices they make in relation to their compliance in the future.

As above, these 'pipeline metrics' are critical, but too big a topic to cover today.

The first incremental layer is to make sure we pay appropriate attention to valuing the effect of our early interventions. It is a truism that prevention is better than correction (both in the moment and in terms of future years’ tax morale). But if you don’t measure the true effect of the prevention, then it is likely your organisation will do too little of it (and you won’t be communicating to stakeholders your true impact).

As such, and while more challenging to measure with the accuracy of audit yield, our focus is increasingly shifting to measuring the impact we have in improving voluntary compliance up front, be it to lodge on time, to report the right amounts or to pay on time. And when things aren’t right, our new focus is on measuring where we can correct this quickly and with lower impact on the client.

Data sharing and transparency, public advice and guidance, pre-fill, prompter letters and text messages are all designed to ensure the taxpayer gets it right when they lodge and pay the right amount of tax and superannuation. While it is more challenging to accurately measure the impact that these upfront or reminder activities have on improving compliance, we know the more we can help to ensure taxpayers lodge, get it right up front, and make timely payments, the better that is for the system overall. Importantly, this ensures maintainable compliance and will sustainably reduce the gross tax gap.

Because we know that not everyone gets it right and taxpayers prefer faster identification and resolution of issues (taxpayer preferences and experience being important in their own right but also very relevant for tax morale in future years), where possible we are also focused on increasing our activities aimed at correcting issues post-lodgment but before issuing a notice of assessment. This includes some of our risk reviews and our operational analytics models that use third-party data to automatically correct the client’s return. Measuring the impact of this work is important, more so as we drive to make sure taxpayers get it right, so they never need to be subject to an audit or debt recovery. We have been measuring the impacts of these efforts and including the direct impact as part of the $6.8 billion of audit yield reported this year.

Another important layer is measuring the enduring effect of our efforts into future years. We know that when we undertake an audit and work with the taxpayer to help them better understand their obligations, we have an effect on voluntary compliance in future years (although importantly, an audit intervention which is perceived as unfair has the potential for adverse effects on future compliance). Similarly, when we disrupt tax avoidance schemes or stop incorrect refunds, we can prevent further tax losses from arising into the future. Along with the impact of helping taxpayers get it right up front, the impacts from sustaining voluntary compliance form part of wider revenue effects, and last year they totalled $3.5 billion.

This year we’ve also included for the first time $1.2 billion of compliance outcomes from stimulus programs, the largest component of which is our work in stopping on-going claims for JobKeeper. While implementing the stimulus packages and the subsequent compliance work is different from our normal tax compliance activities, the concepts of performance, gap thinking and tolerance helped us in developing appropriate measurement methodologies for this work.

When we combine audit yield, stimulus and wider revenue effects, we estimate that ATO action has resulted in $11.5 billion of additional collections last year. We know we have a bigger impact than that but, as yet, we haven’t been able to fully quantify what that impact is.

It is also important to note that some interventions may be able to be estimated, but not at a level of reliability that supports reporting in our annual report. However, these estimates remain fundamentally important, because they can be communicated at team level and used for internal recognition purposes.

Table 1: Total revenue effects 2020–21
Term	Meaning	Amount
Audit yield	The amount of tax we collect from audits and enforcement activities	$6.8 billion
Stimulus program impacts	The reduction and collection of overpayments from stimulus programs	$1.2 billion
Wider revenue effects	The additional revenue received from taxpayers. This is typically through improved voluntary compliance	$3.5 billion
Total revenue effects	The total impact the ATO activities have on improving taxpayer compliance	$11.5 billion
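As a quick check of Table 1, the three components can be added up in a few lines. This is an illustrative sketch only: the variable names are assumptions made for the example, and the figures are those shown in the table, in billions of dollars.

```python
from decimal import Decimal

# Components of total revenue effects for 2020-21, in $ billions (Table 1).
# Decimal avoids binary floating-point rounding noise when adding currency.
components = {
    "audit_yield": Decimal("6.8"),
    "stimulus_program_impacts": Decimal("1.2"),
    "wider_revenue_effects": Decimal("3.5"),
}

total_revenue_effects = sum(components.values(), Decimal("0"))

for name, amount in components.items():
    share = amount / total_revenue_effects
    print(f"{name}: ${amount} billion ({share:.0%} of total)")

print(f"Total revenue effects: ${total_revenue_effects} billion")
```

On these figures, audit yield is a little under 60% of the measured total, with the balance coming from stimulus program impacts and wider revenue effects.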

Thinking about this another way, the vast majority of revenue is collected voluntarily, which includes both voluntary payments, and additional payments resulting from ATO actions in the past that have improved voluntary compliance. Amendments, or audit yield, make up a very small percentage of total ATO collections – they average 3% over the past 3 years. The tax gap, that is, the amount not collected each year, represents about 7% of the total.

The challenge is to reduce the tax gap in any given year and importantly keep it down in future years. Under this framework, audit yield is the conversion of ‘year one’ tax gap into tax collections. But if we are to sustainably improve the health of the (currently very highly performing) system, first we must focus on maintaining voluntary compliance. Then, when we look at the impact of (all of) our interventions, we want not just to have a ‘year one’ impact, but to lock that effect into future years’ voluntary compliance. There is little point, in a world of constrained resources, in simply ‘harvesting’ year one non-compliance, only to have that non-compliance grow back the next year.

To achieve this, we need to maximise the use of a range of preventative and deterrent strategies that aim to help taxpayers get it right the first time. These include:

public advice and guidance (such as web material, practical compliance guidelines and taxpayer alerts)
other advice and guidance products (such as private rulings and leverage strategies – for example, those directed towards advisors and intermediaries).

At the other end of the spectrum, it includes activities like disrupting syndicated criminal and phoenix activities. Each type of activity needs to be considered separately to determine impact.

Comprehensively measuring the effects of our preventive and deterrent activities is challenging, but critical if we are to best understand the relative impacts of different strategies and interventions and focus on what matters most.

Journey of tax gap

International best practice approaches/comparisons

Measuring tax gaps requires sophisticated analysis, as well as the willingness to commit non-trivial resources. It then requires confidence to publish them, as instead of being able to point to (generally high) audit yield pay-off ratios, a revenue authority is publishing the bits it missed …

In the 1970s, the United States Internal Revenue Service (IRS) pioneered the approach of measuring tax administration performance using a large-scale random audit program which evolved into its tax gap program that we see today. While it continues to be a strong advocate for tax gap research and is an active member of the OECD Tax Gap Community of Practice, the IRS only publishes a new set of tax gap estimates approximately every 7 years, with its latest publication covering the 2011 to 2013 tax years.

Since then, other jurisdictions such as Australia, Canada, Denmark, Sweden, and the United Kingdom have undertaken the measurement and publication of tax gaps to varying degrees and are all now part of the OECD Tax Gap Community of Practice. Like Australia, all of these jurisdictions use a range of approaches to estimate the tax gap. Depending on the tax or market, this includes estimates based on random enquiries, statistical approaches that make use of data generated from risk-based review and audit programs, and the use of expert judgment to ensure assumptions are defensible. Out of all these jurisdictions, the HMRC is perhaps the closest to the ATO in terms of the extensive scope and timeliness of tax gap estimates being refreshed and published.

The UK HMRC started publishing their tax gap estimates in the mid-2000s and have committed to a yearly refresh of a comprehensive suite of tax gap estimates. Like the ATO, they also use a range of methods: top-down, bottom-up and random enquiry program (REP) to generate their estimates. Like us, the UK also use the insights from their REP to inform incidence and types of non-compliance behaviours.

We began our foray into estimating and publishing tax gap estimates in 2012, by releasing the GST and luxury car tax (LCT) gaps. GST represented a sensible starting point for the tax gap program given its broad-based nature and readily available data that could be used to estimate the gap, including household spending data published by the Australian Bureau of Statistics.

Since then, we have released gap estimates across effectively the complete range of taxes and programs. In October this year, we published refreshed gap estimates for all income taxes, transactional taxes and administered programs. Most of these published gaps relate to the 2018–19 year, but for 3 transactional taxes and 1 administered program, our latest estimates are for the 2019–20 year. This year marks the second year we have released estimates for every income and transactional tax, and that allows us to measure the total tax performance of the system.

Our overall estimate of tax performance for 2018–19 is 92.7%. That is, we estimated that we received 92.7% of the total tax revenue that should be reported according to (current) law, which represents $428 billion. Of that, we estimate 91% is voluntarily reported by taxpayers and the additional 1.7% is the result of amendments that include those that directly result from audits.

The gap estimates tell us that there is another $33.5 billion (or 7.3%) that was not collected. That represents the tax gap, that is the amount not correctly reported under the current law. While the different tax legislations and economic structures make direct comparisons of the magnitudes of tax gap estimates across jurisdictions challenging, Australia’s overall tax performance of more than 92.7% is nevertheless still an enviable outcome by OECD standards.
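The relationship between these figures can be sketched as simple arithmetic. The snippet below is an illustration rather than the ATO's estimation method: it assumes the theoretical liability is the amount received plus the net gap, and the function and variable names are hypothetical.

```python
def tax_performance(collected: float, net_gap: float) -> float:
    """Return the share of theoretical liability that was collected.

    Assumption for this sketch: theoretical liability is the amount
    collected plus the net tax gap.
    """
    theoretical_liability = collected + net_gap
    return collected / theoretical_liability


# Figures quoted for 2018-19, in $ billions.
collected_bn = 428.0
net_gap_bn = 33.5

performance = tax_performance(collected_bn, net_gap_bn)
gap_share = 1.0 - performance

print(f"Tax performance: {performance:.1%}")  # Tax performance: 92.7%
print(f"Net tax gap:     {gap_share:.1%}")    # Net tax gap:     7.3%
```

The two percentages are complements of each other, so any movement in the net gap shows up directly as a movement in tax performance.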

When we look at our performance in more detail, we can see some notable variations in the performance of different markets, and indeed, across different types of taxes. When we look firstly at income tax, we see that, after compliance activity, large business performs better than medium businesses who perform better than small businesses. But that doesn’t tell us the whole story.

When we look at the gross performance, we see that large businesses are performing at around 92% at lodgment, which is around the same as medium businesses but markedly lower than individuals not in business. Understanding the difference between gross and net performance is important in understanding what needs to be done to sustainably improve tax performance. Amendments are a much more important driver of performance in the large and complex markets than they are across small businesses and individuals.

When we look outside of income taxes, we find that all of our indirect taxes and administered programs are performing at more than 90%, with the exception of fringe benefits tax (if viewed as a standalone tax, noting that there are good arguments for treating FBT as a component of the PAYG withholding system for wages, akin to an anti-avoidance provision). FBT tax performance continues to run at under 80%.

There are a number of reasons why the performance of FBT lags the rest of the tax system. These include issues around:

not understanding when FBT applies
not understanding or correctly applying the law
employers not engaging the expertise of tax agents in the way they do for income tax and GST issues.

Improving FBT performance requires a longer-term focus and needs us to think about how much of the FBT gap we can close and what we can do to improve performance. This is where beyond gap thinking comes to the fore.

Moving to ‘beyond gap thinking’

Having published the full tax performance program, the logical next steps are to use the insights gained from the program and our operational intelligence to determine whether what we are doing is going to improve the system, and, if not, what changes need to be made to improve tax performance.

Beyond gap thinking is about distilling the insights from our tax performance research to inform resource-allocation decisions at a strategic level. By inspecting the building blocks of each gap closely, we start to get a much more granular picture of how our actions impact on the performance of the tax system. More importantly, we start to get a better sense of how sensitive performance is to our action and our inaction over time.

We can use this to tell us what action we need to take to put upward pressure on tax performance, and also the extent to which not doing something would release that pressure on performance. We become better at predicting the consequences of our actions, so we make much more informed decisions of what we do and when we do it. That said, tax gap is a lagging performance indicator and is best used in the context of longer time frames for planning purposes.

To realise beyond gap thinking, we are developing 4 new concepts.

The first pair, which define the theoretical performance range, are:

maximum tax performance (the addressable gap), and
baseline tax performance (the gap at risk).

The second pair, which define pragmatic resource allocation decisions, are:

aspirational tax performance, and
tolerable tax performance.

To expand on these concepts, maximum tax performance and the addressable gap is not 100% tax performance or a zero-tax gap (much the way that ‘full employment’ does not mean zero unemployment). Rather, it is the theoretical maximum revenue that could be collected within the current legislative framework and plausible levels of resourcing.

On the flip side, we have baseline tax performance or the ‘gap at risk’, which is the performance that would result if the ATO reduced its investment and strategies to a baseline investment. Or, in other words, enough non-discretionary investment to keep the system functioning but discretionary investment – such as post-lodgment compliance staff – would be deployed elsewhere.

Importantly, the difference between baseline performance and current performance at lodgment (gross gap) can be seen as the value of the 'revenue authority in force'. It is the performance due to the current allocation of resources. Even though this is extremely difficult to accurately estimate, it is critical not to overlook this contribution – if a decision is made to redeploy or reduce resources, this gap can quickly open, offsetting any targeted improvements elsewhere.

Aspirational performance/gap is the performance the ATO is targeting through its strategies and resourcing. In a healthy system such as Australia’s, aspirational performance may not be much higher than current performance for some markets or taxes. Tolerable performance/gap is the bare level of performance acceptable to stakeholders in the tax system. If we get below this level, we are out of tolerance and this tells us we need to do something differently to bring the tax gap back into tolerance. This might involve a change in our investment mix (doing more), or a change in strategies (doing something different), or perhaps both.

Sitting beneath this new framework is the concept that the future matters more than today. For a tax agency, this means developing a tax system that focuses on preventative strategies rather than corrective ones, and when we undertake a corrective action, the focus is on sustained, ongoing improvements in compliance – an increase in the wider revenue effects. There is also significant value in actions (often policy measures) which do not of themselves immediately increase current tax performance, but structurally improve baseline tax performance.

A good example to draw out these concepts is the individuals not in business market. While the tax performance for individuals is high relative to some other market segments, cumulative small mistakes over a very large population result in a tax gap that is significant in dollar terms. The latest gap estimate is $8.4 billion for 2018–19. Amendments in this market represent between $600 million and $800 million per year.

Focusing on current performance/gap, the performance of the system is relatively high at more than 94.5% – and particularly high at the voluntary compliance level of 94% (that is, the lowest ‘gross gap’) with only half of a per cent due to amendments.

Looking at ‘baseline performance’, by definition it will be less than the current 94% performance at lodgment. The question is by how much, and how quickly, that would degrade if the ATO reduced or stopped traditional audit activity. Importantly, system controls, such as the PAYG withholding system, as well as third-party data matching, put a floor under the baseline tax performance, such that even if we stopped looking at individuals for twelve months, these inbuilt features of the tax system would prevent significant degradation.

Another important feature which increases baseline tax performance (as well as current tax performance) is how easy it is to interact with the tax system (by removing disincentives to participate). Additional system controls could further increase this baseline tax performance. This could include changes within the current policy framework such as third-party data matching on the deductions of individuals, as well as those outside of current policy like the introduction of a standard deduction for work-related expenses. Measures and initiatives which increase baseline tax performance will often be worthwhile even if there is marginal short-term impact on current tax performance, but of course are particularly valuable when they improve both!

If we now look at maximum performance, the drivers of the gap, being many small errors, mean that there is likely to be a relatively low level of maximum performance based on current policy settings. If current investment in compliance is only reducing the gap by 0.5%, and if it is conservatively assumed that compliance results are linear, to move performance from 94.5% to 98% you would have to have a compliance workforce at least 8 times the current investment, not a plausible position. As such, in the absence of significant new strategies or changes in policy settings, the maximum performance of the individual market in reality might be, say, 96% (and even then would require more sophisticated strategies than simply more audits).
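The 'at least 8 times' figure follows from simple proportional scaling. The sketch below reproduces that arithmetic under the same conservative assumption that compliance results are linear in investment; the function name and parameters are hypothetical and exist only for this illustration.

```python
def required_investment_multiple(
    performance_at_lodgment: float,
    current_uplift: float,
    target_performance: float,
) -> float:
    """Multiple of current compliance investment needed to reach a target.

    Assumes, as in the text, that the uplift from compliance activity
    scales linearly with investment. All inputs are percentage points.
    """
    required_uplift = target_performance - performance_at_lodgment
    return required_uplift / current_uplift


# Individuals not in business: 94% at lodgment, 0.5 points from amendments.
multiple = required_investment_multiple(
    performance_at_lodgment=94.0,
    current_uplift=0.5,
    target_performance=98.0,
)
print(multiple)  # 8.0
```

In practice returns are likely to diminish as the easier cases are exhausted, which is why the linear result is best read as a lower bound on the investment needed.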

Now focusing on aspirational performance, to improve performance, we must look to both the issues and behaviours that contribute to the $8.4 billion gap in individuals not in business and ensure strategies are addressing both underlying tax issues to improve performance immediately and the behaviour to sustainably improve performance. For example, work-related deductions and rental-related deductions are large contributors to the individuals tax gap, but we also know that strategies based on audit activity are not going to dramatically change those contributors without either significant increases in resources or productivity.

This leads to exploring different strategies if we aspire to a reduced gap in this market, for example if we aimed to increase performance from 94.5% to 95%. This might include productivity measures (such as automating routine aspects of an audit process, such as collection and collation of substantiation materials), but also accessing and utilising new ‘large-scale’ third party data (such as rental income data or mortgage data). It might also involve measures which increase perceived detection risk for the small group inclined not to fully comply, but also a focus on education for those who wish to comply but simply make mistakes due to the complexity of the system. Our 'nearest neighbour' and other real time 'nudges' are also a critical part of our strategies to increase performance in this market.

Lastly, consider tolerable performance. At 94.5% performance, the individuals market is performing at a higher level (particularly at lodgment) than other markets. It is also performing well compared with historical trends. As such, it is very likely that the current performance is within tolerance.

If we look at the other end of the spectrum, we see that large businesses now pay around 96% of the amount of tax they should pay, but only after compliance activity (noting that this is 2018–19 data, our hope is that our strategies over the last few years, including justified trust, mean that current performance has further improved). Our current performance is not far off our aspirational target of getting large business performance up to around 98%. We know that this aspirational target isn’t too far below the maximum tax performance, which we estimate at approaching 99%. This could theoretically occur if we could staff the Tax Avoidance Taskforce to move us from a justified trust review of the top 1,000 on a rolling 3-year basis, to a justified trust review of the top 1,000 on a yearly basis. This would not, however, be an optimal allocation of discretionary resources and the resulting investment would not be commensurate with the level of risk we see across the large market (or the tax system as a whole).

On the other hand, if we reduced our investment in the large market, one of the first things we would see is up to $2 billion of amendments disappear and the 96% performance would fall to something closer to 92% (and the gross gap could expand also). Or, in gap terms, the net gap would converge to the gross gap. At that point, we are right at, if not below, our tolerable gap for large business. The 92% performance in that market signals to all markets that the system isn’t operating equitably or fairly.

We have seen these levels before, and they were the impetus for the Government providing additional funding for the Tax Avoidance Taskforce to address some of the key issues with multinational tax avoidance. If we reduced to a baseline investment in the large market, we would see compliance results disappear and also voluntary compliance start to deteriorate. The baseline performance could quickly shift to somewhere between 85% and 90%.

Once again, system controls, like corporate governance (of both companies and their advisers), dividend imputation, corporate tax transparency and the voluntary tax transparency code would put a floor under the baseline tax performance, but this level of performance in the large market would not be acceptable to the government, or the community. We also know that perceptions of performance in the large market affect performance across all other markets, which also leads to our aspiration to improve performance to 96/98, or 96% correct at lodgment and 98% after ATO action.
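To make the four concepts concrete, the large market figures discussed above can be arranged as a single set of reference points. This is a sketch under stated assumptions, not an ATO model: the class and field names are hypothetical, the baseline uses the midpoint of the 85% to 90% range, and the tolerable level is set at 92% purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceBand:
    """Tax performance reference points for one market, in per cent."""

    baseline: float      # performance at baseline investment (gap at risk)
    tolerable: float     # lowest performance acceptable to stakeholders
    current: float       # latest net performance after compliance activity
    aspirational: float  # performance targeted by strategies and resourcing
    maximum: float       # theoretical maximum under current law (addressable gap)

    def within_tolerance(self, performance: float) -> bool:
        """True if the given performance is at or above the tolerable level."""
        return performance >= self.tolerable

    def headroom_to_aspiration(self) -> float:
        """Percentage points between current and aspirational performance."""
        return self.aspirational - self.current


# Large market, using figures from the text where available.
# baseline (midpoint of 85-90) and tolerable (92) are assumptions.
large_market = PerformanceBand(
    baseline=87.5, tolerable=92.0, current=96.0, aspirational=98.0, maximum=99.0
)

print(large_market.within_tolerance(large_market.current))   # True
print(large_market.within_tolerance(large_market.baseline))  # False
print(large_market.headroom_to_aspiration())                 # 2.0
```

Laying the points out this way makes the resourcing question explicit: current performance sits within tolerance, while a fall to baseline would not.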

Beyond gap thinking will help us not only to set the optimal investment across each of the gaps to drive towards our aspirational gap, but also to do so with the best knowledge that, where we (relatively) disinvest, we won’t see significant disruption to the tax system. This will allow us to realise the ambition of sustainably reducing the tax gap without just ‘squeezing the balloon’ and seeing gains made in one area eroded by losses in another.

We can see that the challenge to reduce the income tax gap requires improvements in performance across all segments, but predominantly small business, individuals and large business. Short-term success will see the tax gap component decrease and a commensurate increase in amendments. Long-term success sees a reduction in both the tax gap and amendments, more than offset by an increase in voluntary payments. This represents a system where the correct amount of tax is correctly reported to us at lodgment.

Shifting from ‘risk’ to ‘tolerance’

‘You want a valve that doesn’t leak and you try everything possible to design one. But the real world provides you with a leaky valve. You have to determine how much leakiness you can tolerate.’ – Arthur Rudolph

Tax performance below 100% is not a risk. It’s a certainty. The real question is, as a tax administrator (and on behalf of the community we serve), how much are we willing to tolerate? When we talk about reducing or managing the tax gaps, we’re not actually talking about having zero tax gaps for every market – the system is far too complex for that to ever be a realistic goal. In fact, for some of the gaps, our analysis might lead us to decide that we don’t want to reduce the gap at all – we might decide that the gap is already in a range we’re comfortable with.

Putting this another way, consider the individual tax gap of $8.4 billion. It is a certainty that there will be a large tax gap in this market in nominal terms. Currently sitting at 5.5%, perhaps it could range between 4% and 7% (refer to the discussion about baseline and maximum tax performance). The language of ‘risk’ is not particularly helpful (it’s certain!). It is much more useful to use the concept of tolerance.

When we consider the question of tolerance, we also need to think about community perceptions and expectations. For example, in the USA, they appear to have a much higher tolerance in the larger markets, perhaps due to community views around the role of the Federal Government and the private sector.

The USA provides an interesting contrast when we compare the performance of the system. The latest official estimate of the US tax gap covers 2011–13 and puts it at US$441 billion. At the same time, tax collections in the US were US$2.3 trillion, giving a gross tax gap of around 16%. Recently, the IRS Commissioner told the Senate Finance Committee that the US tax gap could exceed US$1 trillion this year, with new sources of wealth including cryptocurrencies and the rising use of complex, foreign-source income, as well as sophisticated tax avoidance schemes. That is potentially a gap approaching 20%.
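The 16% figure for 2011–13 can be reproduced with simple arithmetic, on the assumption (not spelled out above) that the gap is expressed as a share of total tax theoretically payable, that is, collections plus the gap. The sketch below is illustrative only and the names in it are hypothetical.

```python
def gap_share(gap, collections):
    """Gap as a fraction of theoretical liability (collections + gap).

    Assumption: theoretical liability is approximated as what was
    collected plus what was not, using the rounded figures quoted above.
    """
    return gap / (collections + gap)


us_gap_bn = 441            # US$ billion, 2011-13 estimate
us_collections_bn = 2300   # US$ billion (US$2.3 trillion)

share = gap_share(us_gap_bn, us_collections_bn)
print(f"{share:.1%}")  # prints 16.1%
```

Dividing the gap by collections alone would instead give roughly 19%, so the choice of denominator matters when comparing gap percentages across jurisdictions.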

The US economy, as measured by GDP, is approximately 15 times larger than Australia’s. At US$1 trillion, the gross gap represents almost 5% of the US economy. Australia’s gross tax gap is less than 2% of our economy.

For many years, we have used a risk framework to help us manage risk, one that is consistent with global best practices across the OECD and with the standards published by the International Organization for Standardization. More recently, we have started to expand and evolve our understanding of risk:

first, by developing a 3-tiered approach to understanding tax (non) performance through a behavioural lens, and
second, by understanding and starting to set tolerances for tax non-performance.

The 3-tiered approach to tax (non) performance gives us a greater understanding of our priority investments to not only treat the non-performance, but to influence client behaviour. When we bring together what we know from managing risk with our insights from the tax gap program, we start to develop a very interesting picture of the behaviours we see across the various market segments, as well as how these behaviours impact on the performance of the tax system.

We can look at where we’re investing and see the likely impact investment will have in achieving sustained improvements in tax performance. Some of the investment will, of course, be focused on dealing with non-performance post-lodgment, while other investments will focus on ensuring taxpayers have the right information available to meet their reporting obligations. This includes providing advice and guidance, such as private rulings, but also public advice and guidance like alerts and up-to-date and accurate advice on the ATO website.

Setting tolerances accepts the reality that tax performance will never be 100%. In the reality of finite budget and finite resourcing, we need to understand levels of tolerable tax performance or tolerable tax gap. As we get greater insights into what drives tax performance, we can also start to set tolerance for specific underlying behaviours that drive tax (non) performance. Having set tolerances allows us to look at behaviour across the whole system and see what’s in tolerance and what’s out of tolerance. We can then look at what changes are required to bring something back into tolerance but in a way that doesn’t disrupt another part of the tax system, or risk it drifting out of tolerance.

For example, our initial analysis of small businesses and individuals in business (sole traders) has found that one of the drivers of the small business gap is the omission of income. The amount of omitted income is higher than what the community accepts as tolerable – so much so, that this is one of the key risks that underpins the investment in the shadow (black) economy taskforce.

However, we find that it is not just shadow economy behaviours causing omitted income, although it is the largest behavioural driver. We also see opportunistic omission of income some of the time. But mostly, we see inadvertent under-reporting of income driven by errors and poor record-keeping. In fact, these behaviours account for around 90% of the incidence of error, although only 20% of the under-reported tax. Because this risk is out of tolerance, deliberate strategies such as the Level Playing Field strategy – which is a part of the Shadow (black) economy program – are in place to bring it back into tolerance and improve tax performance of small businesses.

Obviously, there is significant judgment involved in estimating what is the tolerable level of tax non-performance for any given tax in any given market at any given point in time. One signal is to look back at when the government intervened to fund additional activity to address a particular element of the tax system and what was the level of tax performance of that part of the system at that time. This is a good starting point for determining the community’s level of tolerance.

Designing the system around verifiable data – prevention first

Our ongoing focus on holistic system-health will continue to be supported by increased digitisation and access to high-quality data, as we reduce the possibility for taxpayers to make inadvertent errors, recognising they are increasingly relying on natural systems for their tax and super affairs. This also has the effect of increasing the ‘baseline tax performance’ of the system.

Put simply, as more and more of our tax system can be designed around verifiable data, the system works better. If we simply viewed success as a lot of audit liabilities, we would have an interest in at least some people getting their returns wrong! If we view our job as having a good, effective tax system with a sustainably reduced gap, we want to help people to get their affairs right at lodgment.

To better understand the importance of data in the tax system, we often talk about the 3 phases of data evolution in tax systems worldwide.

The first is a ‘pre-data’ world where data was wholly in possession of the taxpayer. Revenue authorities relied on taxpayers to provide us with a bespoke data set (that is, a tax return) whose accuracy relied on the honesty of taxpayers, reinforced by the threat of audit or penalty.

We then moved to the second phase of what we call ‘data testing’ where third-party data became available at scale. As administrators began to access more and more data sets, we could undertake more comprehensive data matching and could cross-check what taxpayers were telling us to identify those tax filings which were most likely to be most incorrect. This in turn led to the identification of anomalies that require further investigation through audit programs.

The third phase, which we call ‘data driving’, is what the ATO and other leading revenue authorities are now in – that is where the system is primarily designed around verifiable data, rather than relying on bringing data to the system. This saves time and minimises the risk of inadvertent errors that have to be addressed later.

At the ATO, we are now starting to think about data on a curve: from ‘not verified’ data, through increasing levels of confidence in the data to ‘fully data-driven policy and system design’.

In Australia this is most evident in our individuals market where we can access and use data at an industrialised scale.

Level 0 is where there is no bulk data set available, such as work-related expense claims where taxpayers keep their receipts.
Level 1 is where we can obtain other data after the event to check that data, but maybe not at scale.
Level 2 is where the data can be sourced to be used as a risk indicator pre- or post-lodgment but it is not of a quality or type that would be productive to expose to taxpayers.
Level 3 is where the data is of a high enough quality that it can be used to assist taxpayers to comply as they lodge.
Level 4 is where the data is very high quality and can be used to pre-fill returns as presumptively correct.
Level 5 is where the data is so reliable that the tax system is actually designed around the data.

The level of data is based on the assessment of 6 attributes: identity, timeliness, standardisation, reliability, visibility and usefulness.

We now know that for the average individual taxpayer, most of their income data is of a good quality (Level 4) and is pre-filled. In practice, about 90% of the boxes pre-filled on their tax return remain unchanged by the taxpayer. Even for more complex items, such as capital gains, partial data can still assist taxpayers to get their affairs right at lodgment (Level 3).

A good example of this is cryptocurrency. We get a lot of data from various cryptocurrency exchanges and we’re generally aware when a taxpayer has cryptocurrency investments and trades. While we will tend not to have sufficiently high-quality data to calculate the exact capital gains (Level 4), we do have enough information to inform the taxpayer that we’re aware of the fact that they have cryptocurrency trades. We can then remind them that these need to be included in the income tax return (Level 3). A more traditional revenue agency approach would be to remain silent until a return was lodged, and then selectively audit returns where we knew of cryptocurrency holdings but no gains were disclosed (Level 2).
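The data levels and the examples given for them can be sketched as a small data structure. This is a minimal illustration in Python, not a description of any ATO system: the class, member and function names are hypothetical, and the sketch assumes the levels are cumulative (data good enough for a higher-level use is also good enough for the lower-level uses), which the description above suggests but does not state.

```python
from enum import IntEnum


class DataLevel(IntEnum):
    """Confidence levels for a data set, following the Level 0-5 list."""
    NO_BULK_DATA = 0        # no bulk data set available
    CHECK_AFTER_EVENT = 1   # other data obtainable after the event, maybe not at scale
    RISK_INDICATOR = 2      # usable as a risk indicator, not exposed to taxpayers
    ASSIST_AT_LODGMENT = 3  # good enough to help taxpayers comply as they lodge
    PRE_FILL = 4            # good enough to pre-fill returns as presumptively correct
    SYSTEM_DESIGN = 5       # reliable enough to design the tax system around


def can_prompt_taxpayer(level):
    """True if the data can be shown to the taxpayer at lodgment (Level 3+)."""
    return level >= DataLevel.ASSIST_AT_LODGMENT


def can_pre_fill(level):
    """True if the data can be pre-filled as presumptively correct (Level 4+)."""
    return level >= DataLevel.PRE_FILL


# Examples taken from the text; the dictionary itself is illustrative.
examples = {
    "work-related expense receipts": DataLevel.NO_BULK_DATA,
    "cryptocurrency exchange data": DataLevel.ASSIST_AT_LODGMENT,
    "typical individual income data": DataLevel.PRE_FILL,
}

for item, level in examples.items():
    print(item, level.name, can_prompt_taxpayer(level), can_pre_fill(level))
```

Under these assumptions, cryptocurrency data is enough to prompt the taxpayer but not to pre-fill a capital gain, which matches the Level 3 treatment described above.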

We are getting towards the limit in Australia under current tax policy settings on what we can achieve for individuals. If you think of a standard deduction being the ultimate Level 5, there are obviously significant policy questions about the possible trade-offs of doing something similar here in Australia.

A good example of a system designed around verifiable data is the economic response to COVID-19, where the ATO’s successful delivery of key stimulus measures – JobKeeper, Cash Flow Boost and early release of super – relied on having existing high-quality systems and data sets in place, that could be used for more than one purpose.

In the early stages of the design of the stimulus package, we worked closely with the Department of Prime Minister and Cabinet and The Treasury to ensure the policy was designed around our existing infrastructure, including:

Australian business numbers for businesses
tax file numbers for employees
previously lodged income tax returns and activity statements
Single Touch Payroll.

For a business claiming cash flow boost, there were no new claiming obligations. They simply had to lodge their normal activity statement and our systems would automatically generate the cash flow boost credit.

For a business claiming JobKeeper, the process was a little more complex, but it was still fully automated and 97% of JobKeeper claims could be paid within 4 business days of the ATO processing their monthly claim. The policy was designed around these data sets to ensure it was almost impossible to access the programs unless the business was a pre-existing, real business with true employees, and with a track record of engagement with the tax system. It also meant it was easy to link the measures to our data and analytics risk engines. The design meant it was almost impossible to make fraudulent claims, as our data also allowed us to check eligibility for the assistance payments upfront and in real-time, or soon after application.

Reflections for the tax community

As revenue authorities, practitioners, academics and taxpayers, I think we can all agree that a contemporary tax system is underpinned by high levels of voluntary compliance.

At the ATO, we are increasingly using beyond tax gap thinking to help us focus our investment on the right risks and client behaviours, to make it easy for taxpayers to comply and to improve overall tax performance.

I hope my reflections today have been useful in helping you gain a better understanding of the ATO’s focus in this area and might prompt a few reflections when considering the health of tax systems worldwide:

What is being measured?
Are the measures transparent to the community?
How are insights from these measures used to improve tax performance outcomes for this year, and the baseline moving forward?
How can these insights inform policy discussion for the future?

Thank you for the opportunity to share this important work with you.
