# Available statistics format settings

This section lists most of the format settings used for statistics.

## Mean scores, averages and other statistics from values or scores

Two sorts of questions produce statistics when used as the rows of a table.

The first sort is quantities, usually Integer or Weight questions (for example, salary, height or volume). When these questions are used as the rows of a table, you can request either a full list of values or the statistics only.

The second sort is scored questions, usually rating scales to which scores (analysis values) have been assigned. For example:

- Like a lot (2)
- Like a little (1)
- Indifferent (0)
- Dislike a little (-1)
- Dislike a lot (-2)

Tables which show the full list of values may be displayed with or without the individual rows. Format DIS displays the individual rows; format NDIS outputs only the requested statistics (no distributions).

The values used for statistics are usually shown in parentheses, as in the list above (format PSV). The number of decimal places used for these values is controlled by format PSD.

When using these types of entries on your tables the following statistics may be produced:

- Base for statistics (format BST)
- Sum of values (format SUM)
- Sum of squares of values (format SSQ)
- Average or mean score (format AVG)
- Standard deviation (format SDV)
- Standard error (format SER)
- Error variance (format EVR)
- Mean score divided by standard error (format MSE)

The decimal places for AVG are controlled by format DPA. The decimal places for all other statistics are controlled by format DPS.
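As an illustration of how these statistics relate to each other, the following Python sketch computes them from a set of scored rows. The counts are hypothetical, and the formulas are assumptions (sample variance with an n - 1 divisor, and error variance taken as the square of the standard error); they are not confirmed as the exact calculations QPSMR performs.

```python
import math

# Hypothetical counts for the five-point scale above: score -> respondents.
counts = {2: 30, 1: 25, 0: 20, -1: 15, -2: 10}

def score_statistics(counts):
    """Return summary statistics for a {score: count} distribution."""
    base = sum(counts.values())                          # BST
    total = sum(s * n for s, n in counts.items())        # SUM
    sum_sq = sum(s * s * n for s, n in counts.items())   # SSQ
    mean = total / base                                  # AVG
    # Assumption: sample variance with an n - 1 divisor.
    variance = (sum_sq - total * total / base) / (base - 1)
    std_dev = math.sqrt(variance)                        # SDV
    std_err = std_dev / math.sqrt(base)                  # SER
    err_var = std_err ** 2                               # EVR (assumed: SER squared)
    mean_over_se = mean / std_err                        # MSE
    return {
        "BST": base, "SUM": total, "SSQ": sum_sq, "AVG": mean,
        "SDV": std_dev, "SER": std_err, "EVR": err_var, "MSE": mean_over_se,
    }

stats = score_statistics(counts)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Only the base, the sum and the sum of squares are needed to derive every other statistic in the list.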

For tables with all rows listed (not statistics only tables) the following can also be requested:

- Medians (format MED)
- Quartiles to Deciles (format ILE)
- Maximum value (format ILH)
- Minimum value (format ILL)
- Modal value (format MOD)

IMPORTANT: The calculations listed above assume that the values are in ascending order. You should always use format RNA when tabulating quantity questions which list all rows.
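The Python sketch below shows why ascending order matters for these figures: each one is read from a position in the ordered list of values. The distribution is hypothetical, and the linear-interpolation rule used for the percentiles is an assumption; QPSMR may use a different convention.

```python
# Hypothetical quantity rows: (value, number of respondents).
distribution = [(3, 5), (1, 4), (8, 2), (2, 6), (5, 3)]

def expand(distribution):
    """Return one entry per respondent, sorted into ascending order."""
    values = []
    for value, count in sorted(distribution):
        values.extend([value] * count)
    return values

def percentile(sorted_values, fraction):
    """Percentile by linear interpolation (one of several conventions)."""
    position = fraction * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight

values = expand(distribution)
median = percentile(values, 0.5)
lower_quartile = percentile(values, 0.25)
upper_quartile = percentile(values, 0.75)
minimum = values[0]
maximum = values[-1]
# The modal value is the row with the largest count.
mode = max(distribution, key=lambda pair: pair[1])[0]

print(median, lower_quartile, upper_quartile, minimum, maximum, mode)
```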

## F-tests

If format TTF is used with SHG, an F-test is performed on all of the columns within each group. This test establishes whether the group of columns (for example, Area) affects the row mean or average, without examining each individual pair of columns.
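To show the idea behind a test across a whole group of columns, here is a minimal Python sketch of a one-way analysis of variance F statistic, built from the base, sum and sum of squares of each column. The figures and the function name `f_statistic` are hypothetical, and this textbook formula is an assumption rather than a description of the calculation performed by format TTF.

```python
def f_statistic(columns):
    """One-way ANOVA F statistic.

    columns: list of (base, total, sum_sq) tuples, one per column in the group.
    """
    k = len(columns)
    n = sum(base for base, _, _ in columns)
    grand_mean = sum(total for _, total, _ in columns) / n
    # Variation of the column means around the overall mean.
    between = sum(base * (total / base - grand_mean) ** 2
                  for base, total, _ in columns)
    # Variation of the values around their own column mean.
    within = sum(sum_sq - total * total / base
                 for base, total, sum_sq in columns)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical group of three columns (for example, three areas).
group = [(3, 6, 14), (3, 9, 29), (3, 18, 110)]
f_value = f_statistic(group)
print(f"F = {f_value:.2f} with {len(group) - 1} and "
      f"{sum(b for b, _, _ in group) - len(group)} degrees of freedom")
```

A large F value suggests that at least one column mean differs from the others, without identifying which pair.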

## Non-parametric tests

Where the rows of a table are assumed to be in order (for example, Small, Medium and Large) but you do not wish to attach score values to them, two tests can be applied:

- Kolmogorov-Smirnov using format KST
- Mann-Whitney-Wilcoxon using format MWW

For tables with any other distribution down the side, use the Chi-squared test with format CHI.
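The following Python sketch illustrates two of these statistics using their standard textbook definitions: the Kolmogorov-Smirnov statistic for two columns of ordered rows, and the Chi-squared statistic for a table of counts. The data and function names are hypothetical, and the formulas are assumptions rather than confirmed details of formats KST and CHI.

```python
def ks_statistic(column_a, column_b):
    """Largest gap between the cumulative proportions of two columns.

    Each column is a list of counts for rows that are in order.
    """
    total_a, total_b = sum(column_a), sum(column_b)
    cum_a = cum_b = 0
    largest = 0.0
    for count_a, count_b in zip(column_a, column_b):
        cum_a += count_a
        cum_b += count_b
        largest = max(largest, abs(cum_a / total_a - cum_b / total_b))
    return largest

def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts for Small, Medium and Large in two columns.
d_value = ks_statistic([10, 20, 30], [30, 20, 10])
# Hypothetical two-row, two-column table of counts.
chi_value, chi_dof = chi_squared([[10, 20], [20, 10]])
print(d_value, chi_value, chi_dof)
```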

## Significance testing

### Levels tested

Formats SLA, SLB, SLC and SLD control the levels to be tested. A setting of 101 means that the level is not used.

The standard setting for these formats is SLA95/SLB99/SLC101/SLD101.
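As a rough illustration, the Python sketch below reads a setting string of this form, drops any level set to 101, and converts each remaining level into a critical value. The function names are hypothetical, and the conversion assumes a two-sided test against the normal distribution, which is an assumption about how the levels are applied.

```python
from statistics import NormalDist

def active_levels(setting):
    """Return {format: level} for every level that is in use."""
    levels = {}
    for part in setting.split("/"):
        name, value = part[:3], float(part[3:])
        if value != 101:  # 101 means the level is not used
            levels[name] = value
    return levels

def critical_z(level):
    """Two-sided critical value for a confidence level given in percent."""
    return NormalDist().inv_cdf(0.5 + level / 200)

levels = active_levels("SLA95/SLB99/SLC101/SLD101")
for name, level in levels.items():
    print(f"{name}: {level:g}% -> critical value {critical_z(level):.2f}")
```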

If only lower case letters are used as identifiers, an upper case marker means the SLB level.

SLC and SLD add + or ++ to the front of each marker.

You may need to increase CLG if you use significance testing markers.

### Distribution Z test markers

Where a table has a list of independent rows, format SIG can be used to mark significant differences.

There must be a total row for the Z test calculations to work.

Each row of the table is treated separately, and a cell is marked when its proportion differs significantly from the proportion in the same row of the column it is being compared with.

You can choose to use the combined variance (pooled) with SIG1, or separate variances (un-pooled) with SIG2.
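The difference between the two options can be sketched in Python as follows. The counts and the function name `z_proportions` are hypothetical, and the formulas are the standard two-proportion Z test, assumed here rather than confirmed as the exact SIG1 and SIG2 calculations. The bases come from the total row, which is why that row is required.

```python
import math

def z_proportions(count_a, base_a, count_b, base_b, pooled=True):
    """Z value comparing the proportions in one row of two columns."""
    p_a = count_a / base_a
    p_b = count_b / base_b
    if pooled:
        # Combined variance: one proportion estimated from both columns.
        p = (count_a + count_b) / (base_a + base_b)
        variance = p * (1 - p) * (1 / base_a + 1 / base_b)
    else:
        # Separate variances: each column keeps its own proportion.
        variance = p_a * (1 - p_a) / base_a + p_b * (1 - p_b) / base_b
    return (p_a - p_b) / math.sqrt(variance)

# Hypothetical row: 60 of 100 respondents in column a, 40 of 100 in column b.
z_pooled = z_proportions(60, 100, 40, 100, pooled=True)
z_separate = z_proportions(60, 100, 40, 100, pooled=False)
print(f"pooled: {z_pooled:.3f}, separate: {z_separate:.3f}")
```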

### Mean or average t-test markers

Where a table has rows from which a mean score or average is produced, significant differences will be marked.

**TIP:** If you do not wish to include these markers, you should set format options SLA101/SLB101 (in which case the MK marker rows will also be suppressed from the Tables CSV file produced when your tables are run).

You may choose whether to use the combined variance with TTV1 or separate variances with TTV2.
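As a sketch of what this choice changes, the Python example below computes a t value for two column means from their summary statistics, once with the combined variance and once with separate variances. The figures and the function name `t_means` are hypothetical, and the formulas are the standard pooled and unpooled (Welch) forms, assumed rather than confirmed as the TTV1 and TTV2 calculations.

```python
import math

def t_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, pooled=True):
    """t value comparing the means of two columns."""
    if pooled:
        # Combined variance, weighted by each column's degrees of freedom.
        variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
        std_err = math.sqrt(variance * (1 / n_a + 1 / n_b))
    else:
        # Separate variances: each column contributes its own error variance.
        std_err = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / std_err

# Hypothetical columns: mean 3.5 (SD 1.0, base 50) and mean 3.0 (SD 2.0, base 100).
t_pooled = t_means(3.5, 1.0, 50, 3.0, 2.0, 100, pooled=True)
t_separate = t_means(3.5, 1.0, 50, 3.0, 2.0, 100, pooled=False)
print(f"combined: {t_pooled:.3f}, separate: {t_separate:.3f}")
```

When the bases and standard deviations of the two columns differ, as they do here, the two options can give noticeably different t values.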

You can request a complete grid of t-test values, comparing every column with every other column, using format TTT1. This grid can be reduced to testing only within each group of columns, by using format TTT2 combined with format SHG.

If a respondent rates two or more products, you could produce a banked table with the products as the breakdown and use column identifiers to compare them. A more accurate, though less often used, method is to subtract the score for one product from the score for the other. For example, with scores 1 to 5 the relative score will be between –4 and +4. This relative score can then be tabulated, and format MSE will give a t-test value comparing the mean score with the expected fixed value of 0.0. When the expected value for a mean is zero, an MSE value greater than 1.96 is significant at the 95% level, and a value greater than 2.58 is significant at the 99% level.
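The relative score method can be sketched in Python as follows. The ratings are hypothetical and the sample is kept very small for readability; the standard error uses an n - 1 divisor, which is an assumption about the calculation.

```python
import math

# Hypothetical 1-to-5 ratings from the same eight respondents.
product_a = [5, 4, 4, 3, 5, 4, 2, 5]
product_b = [3, 4, 2, 3, 4, 3, 2, 3]

# Relative score per respondent, between -4 and +4.
relative = [a - b for a, b in zip(product_a, product_b)]

n = len(relative)
mean = sum(relative) / n
# Assumption: sample variance with an n - 1 divisor.
variance = sum((x - mean) ** 2 for x in relative) / (n - 1)
std_err = math.sqrt(variance / n)

# Mean divided by standard error, tested against an expected mean of 0.0.
mean_over_se = mean / std_err
significant_95 = abs(mean_over_se) > 1.96
print(f"mean {mean:.2f}, mean/SE {mean_over_se:.3f}, 95% significant: {significant_95}")
```

Because each respondent supplies both ratings, taking the difference removes the variation between respondents, which is what makes this comparison more accurate than treating the products as independent columns.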

## Other software

Data from QPSMR can be output to other software for further statistical tests.