# Large data set guidance and worksheets

The new AS and A-level Mathematics specifications require students to study a large data set during their course. More detail can be found in the DfE content document. Each exam board chooses its own data set, based on Ofqual guidance.

The current AQA large data set is taken from DEFRA’s Family Food Statistics publication and can be found on the AQA website. This data set will be used for AS exams in 2018 and A-level exams in 2019. A new data set will apply for AS exams from 2019 and A-level exams from 2020. The new data set is taken from the Department for Transport (Transport Stock Vehicle Database).

## The large data set in exams

The new AS and A-level Mathematics exams will include questions or tasks that relate to the prescribed large data set, giving a material advantage to students who have studied it.

For example, in the specimen assessment materials, students who had studied the large data set would have gained a material advantage through:

- understanding the categories and sub-categories that the large data set uses
- understanding how values in the large data set are rounded
- knowledge of trends in the data
- knowledge of outliers and other anomalies in the data.

The data set is too large to be taken into an exam but suitable extracts may be used in an exam question.

## Studying the data set

We recommend using the large data set as a classroom tool to support teaching the statistics content of the specification. This will help students build the familiarity with the data set that confers a material advantage in an exam, and will also give them experience of working with and manipulating data.

To help you do this, we have created three tools for use with the data set. These can be found on the coloured tabs of this amended version of the large data set spreadsheet.

The large data set contains time series data in the form of average weekly purchases per person. It is most suited to analysis using time series graphs and scatter diagrams, covered in section L3 of the specification.

The three tools extract data from the large data set and present it in time series graphs or scatter diagrams. They enable you to examine some of the key features of the data in the classroom and set students activities to complete themselves.

The tools are designed to be easy to use and to save you time extracting data from the data set. Minimal knowledge of Excel is assumed.

Choosing what to investigate and finding interesting questions to consider will depend on sharing ideas. You can find some suggestions in the accompanying worksheets, which can be downloaded below.

### Tool 1: Scatter diagram – 1 year

This tool allows you to select two foodstuffs and one year, and plots the data for the nine regions on a scatter diagram (note that there is no point plotted for England).
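For teachers who would like to reproduce this kind of extraction outside Excel, the following is a minimal sketch in Python of the selection step that Tool 1 describes. It is not how the spreadsheet tool itself is built. The dictionary layout, the function name `scatter_points` and every number shown are assumptions made for illustration; the values are invented placeholders, not figures from the large data set.

```python
# Hypothetical layout: (foodstuff, region, year) -> average weekly purchase per person.
# All values are invented placeholders, NOT figures from the large data set.
purchases = {
    ("Milk", "North East", "2014"): 1.5,
    ("Milk", "London", "2014"): 1.2,
    ("Milk", "England", "2014"): 1.4,
    ("Bread", "North East", "2014"): 0.6,
    ("Bread", "London", "2014"): 0.5,
    ("Bread", "England", "2014"): 0.55,
    ("Milk", "London", "2013"): 1.3,
}


def scatter_points(data, food_x, food_y, year):
    """Return {region: (x, y)} for one year, leaving out the England total."""
    regions = sorted(
        region
        for region in {r for (_, r, y) in data if y == year}
        if region != "England"
    )
    return {
        region: (data[(food_x, region, year)], data[(food_y, region, year)])
        for region in regions
        # Keep only regions that have a value for both foodstuffs.
        if (food_x, region, year) in data and (food_y, region, year) in data
    }


points = scatter_points(purchases, "Milk", "Bread", "2014")
for region, (x, y) in points.items():
    print(region, x, y)
```

Each pair returned corresponds to one point on the scatter diagram. England is excluded because it is a total for the whole country rather than a separate region.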

### Tool 2: Scatter diagram – 1 region

This tool allows you to select two foodstuffs and one region (including England) and plots the data for the 14 years on a scatter diagram.

### Tool 3: Time series

This tool allows you to select a foodstuff and produces a time series graph showing how purchases have varied since 2001–2.

You can select which regions are displayed to make your graph easier to interpret.
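The selection that Tool 3 performs can be sketched in Python as follows, for anyone who wants to explore the same idea in code. This is an illustration under stated assumptions, not the spreadsheet's own method: the dictionary layout, the year labels, the function name `time_series` and all values are hypothetical, and the numbers are invented placeholders rather than figures from the large data set.

```python
# Hypothetical layout: (foodstuff, region, year) -> average weekly purchase per person.
# All values are invented placeholders, NOT figures from the large data set.
purchases = {
    ("Milk", "London", "2002-03"): 1.5,
    ("Milk", "London", "2001-02"): 1.6,
    ("Milk", "England", "2001-02"): 1.7,
    ("Milk", "England", "2002-03"): 1.65,
    ("Milk", "North East", "2001-02"): 1.8,
    ("Bread", "London", "2001-02"): 0.7,
}


def time_series(data, food, regions):
    """Return {region: [(year, value), ...]} in year order for the chosen regions."""
    series = {region: [] for region in regions}
    for (item, region, year), value in data.items():
        if item == food and region in series:
            series[region].append((year, value))
    for values in series.values():
        values.sort()  # year labels such as "2001-02" sort chronologically as text
    return series


series = time_series(purchases, "Milk", ["London", "England"])
for region, values in series.items():
    print(region, values)
```

Passing a shorter list of regions gives fewer lines to plot, which is the same idea as switching regions off in the tool to make the graph easier to read.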