A confidential, de-identified tax and superannuation dataset produced by the Australian Taxation Office (ATO) based on a random 10 per cent sample of tax returns. Names, tax file numbers and a range of other direct identifiers have been removed, and some indirect identifiers have been generalised. More than 400 variables are available and the dataset is updated annually.