# Appendix: Reliability of the Estimates

Because the figures in this report are based on a sample of the older population, all reported statistics (counts, percentages, and medians) are only estimates of population parameters and may deviate somewhat from their true values—that is, from the values that would have been obtained from a complete census using the same questionnaires, instructions, and interviewers.^{1}

The standard error is primarily a measure of sampling variability; that is, it measures the variations that occur by chance because a sample rather than the entire population is surveyed. As calculated for this report, the standard error also partly measures the effect of response and enumeration errors but does not measure systematic biases in the data. The chances are about 68 out of 100 that an estimate from the sample would differ from a complete census figure by less than the standard error. The chances are about 95 out of 100 that the difference would be less than twice the standard error.
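The 68 and 95 out of 100 figures match what a normal sampling distribution would give for one and two standard errors. The sketch below checks that arithmetic; the normality assumption and the helper name `coverage_within` are illustrative and are not stated in the report itself.

```python
import math


def coverage_within(k: float) -> float:
    """Probability that a normally distributed estimate lies within
    k standard errors of the value it estimates (assumes normality)."""
    return math.erf(k / math.sqrt(2))


# One and two standard errors, expressed as chances out of 100.
print(round(coverage_within(1) * 100))  # 68
print(round(coverage_within(2) * 100))  # 95
```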

## Standard Error of Estimated Percentages

The reliability of an estimated percentage, computed by using sample data for both numerator and denominator, depends on both the size of the percentage and the size of the total on which the percentage is based. The approximate standard error $s_{x,p}$ of an estimated percentage can be obtained using the formula
$${s}_{x,p}=\sqrt{\frac{b}{x}p(100-p)}$$

Here *x* is the total number of persons, families, or households (the base of the percentage), *p* is the percentage, and *b* is the parameter from the following table associated with the characteristic in the numerator of the percentage.

Characteristic | Total or white | Black | Hispanic |
---|---|---|---|
Below poverty level | 3,927 | 3,927 | 3,927 |
All income levels | 2,454 | 2,810 | 2,810 |

Use of this formula in calculating the standard error of a single percentage is illustrated as follows:

An estimated 30 percent of units aged 65 or older had total money income of $30,000 or more in 2000 (Table 3.1). The base of this percentage is approximately 25,230,000, the number of units aged 65 or older, and the *b* parameter for all income levels (total or white) is 2,454, so the standard error of the estimated 30 percent is approximately 0.5 percentage points. The chances are 68 out of 100 that the estimate would have shown a figure differing from a complete census by less than 0.5 percentage points. The chances are 95 out of 100 that the estimate would have shown a figure differing from a complete census by less than 1.0 percentage point; that is, this 95 percent confidence interval would range from 29.0 percent to 31.0 percent.
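The worked example can be reproduced with a short sketch. The function name `percentage_standard_error` and the dictionary layout are assumptions made for illustration; the *b* values are copied from the table above, and the interval uses the standard error rounded to one decimal, as the example does.

```python
import math

# b parameters from the table above, keyed by (characteristic, group).
# The key names are illustrative, not taken from the report.
B_PARAMETERS = {
    ("below_poverty", "total_or_white"): 3927,
    ("below_poverty", "black"): 3927,
    ("below_poverty", "hispanic"): 3927,
    ("all_income", "total_or_white"): 2454,
    ("all_income", "black"): 2810,
    ("all_income", "hispanic"): 2810,
}


def percentage_standard_error(p: float, base: float, b: float) -> float:
    """Approximate standard error, in percentage points, of an estimated
    percentage p (0-100) computed on a base of `base` units."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    if base <= 0:
        raise ValueError("base must be positive")
    return math.sqrt((b / base) * p * (100 - p))


b = B_PARAMETERS[("all_income", "total_or_white")]
se = percentage_standard_error(30, 25_230_000, b)
se_rounded = round(se, 1)
print(se_rounded)  # 0.5

# Two-standard-error (about 95 percent) interval around the 30 percent estimate.
low, high = 30 - 2 * se_rounded, 30 + 2 * se_rounded
print(low, high)  # 29.0 31.0
```

The same call with a base of 4,049,000 and a percentage of 50 gives about 1.2, the standard error quoted for units aged 62-64.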

For a difference between two sample estimates, the standard error is approximately equal to the square root of the sum of the squares of the standard errors of each estimate considered separately. This formula will represent the actual standard error quite accurately for the difference between separate and uncorrelated characteristics. If, however, there is a high positive correlation between the two characteristics, the formula will overestimate the true standard error.

Comparing the percentages of units aged 62-64 and of units aged 65 or older that had total money income of $30,000 or more in 2000 illustrates how to calculate the standard error of a difference between two percentages:

Thirty percent of the 25,230,000 units aged 65 or older and 50 percent of the 4,049,000 units aged 62-64 had total money income of $30,000 or more in 2000, a difference of 20 percentage points. The standard errors of those percentages are 0.5 and 1.2, respectively. The standard error of the estimated difference of 20 percentage points is about
$$\sqrt{{(0.5)}^{2}+{(1.2)}^{2}}=1.3$$

The chances are 68 out of 100 that the difference is between 18.7 and 21.3 percentage points and 95 out of 100 that it is between 17.4 and 22.6 percentage points. Because the 95 percent confidence interval around the difference does not include zero, the difference between the proportions of units aged 62-64 and units aged 65 or older with income of $30,000 or more is statistically significant.
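A minimal sketch of the difference calculation follows. The helper names are hypothetical, and the formula assumes the two estimates are uncorrelated, as discussed earlier in this section.

```python
import math


def difference_standard_error(se_a: float, se_b: float) -> float:
    """Standard error of a difference between two estimates, assuming
    they are uncorrelated: square root of the sum of squared errors."""
    return math.hypot(se_a, se_b)


def confidence_interval(estimate: float, se: float, multiplier: float = 2.0):
    """Interval of `multiplier` standard errors on each side of the estimate."""
    margin = multiplier * se
    return estimate - margin, estimate + margin


se_diff = round(difference_standard_error(0.5, 1.2), 1)
print(se_diff)  # 1.3

low_68, high_68 = confidence_interval(20, se_diff, multiplier=1.0)
low_95, high_95 = confidence_interval(20, se_diff, multiplier=2.0)
print(round(low_68, 1), round(high_68, 1))  # 18.7 21.3
print(round(low_95, 1), round(high_95, 1))  # 17.4 22.6

# The difference is treated as significant when the interval excludes zero.
is_significant = low_95 > 0 or high_95 < 0
print(is_significant)  # True
```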

## Confidence Limits of Medians

The sampling variability of an estimated median depends on the distribution as well as on the size of the base. Confidence limits of a median based on sample data may be estimated as follows: (1) Using the appropriate base, the standard error of a 50 percent characteristic is determined; (2) the standard error determined in step 1 is added to and subtracted from 50 percent; and (3) the confidence interval around the median corresponding to the two points estimated in step 2 is then read from the distribution of the characteristic. A two-standard-error confidence limit may be determined by finding the values corresponding to 50 percent plus and minus twice the standard error. This procedure may be illustrated as follows:

The median total money income of the estimated 25,230,000 units aged 65 or older was $18,778 in 2000 (Table 3.1). The standard error of 50 percent of those units expressed as a percentage is about 0.50 percent. As interest usually centers on the confidence interval for the median at the two-standard-error level, it is necessary to add twice the standard error obtained in step 1 to 50 percent and to subtract it from 50 percent. This procedure yields limits of approximately 49 percent and 51 percent. By interpolation, 49 percent of units 65 or older had total money income below $18,452 and 51 percent had total money income below $19,246. Thus, the chances are about 95 out of 100 that a complete census would have shown the median to be greater than $18,452 but less than $19,246.
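The three steps can be sketched as follows. The function names are hypothetical, and the only distribution points used are the three quoted in the example (49, 50, and 51 percent); a real application would interpolate within the full cumulative income distribution, which is not reproduced here. The standard error is rounded to one decimal to mirror the example.

```python
import math


def interpolate_value(cumulative, target_pct):
    """Linearly interpolate the income below which target_pct percent of
    units fall. `cumulative` is a list of (cumulative_percent, income)
    pairs sorted by cumulative_percent."""
    for (p0, v0), (p1, v1) in zip(cumulative, cumulative[1:]):
        if p0 <= target_pct <= p1:
            if p1 == p0:
                return float(v0)
            return v0 + (target_pct - p0) / (p1 - p0) * (v1 - v0)
    raise ValueError("target percentage lies outside the supplied points")


def median_confidence_limits(cumulative, base, b, multiplier=2.0):
    """Steps 1-3: standard error of a 50 percent characteristic, the
    percentages 50 -/+ multiplier * SE, and the incomes at those points."""
    se = round(math.sqrt((b / base) * 50 * 50), 1)  # step 1
    lower_pct = 50 - multiplier * se                # step 2
    upper_pct = 50 + multiplier * se
    lower = interpolate_value(cumulative, lower_pct)  # step 3
    upper = interpolate_value(cumulative, upper_pct)
    return se, lower, upper


# Points quoted in the example: percent of units with income below each value.
POINTS = [(49.0, 18452), (50.0, 18778), (51.0, 19246)]

se, lower, upper = median_confidence_limits(POINTS, 25_230_000, 2454)
print(se, lower, upper)  # 0.5 18452.0 19246.0
```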

## Notes

1. Most of the discussion of estimation procedures has been excerpted from Current Population Reports, No. 114 (July 1978).