The standard errors shown in tables R and S are not directly applicable to differences between two estimates. These tables are to be applied differently in the three following types of differences:
1. The difference may be one between a sample figure and one based on a complete count, e. g., arising from comparisons between 1950 data and those for 1940 or earlier years. The standard error of a difference of this type is identical with the variability of the 1950 estimate.
2. The difference may be one between two sample estimates, one of which represents a subclass of the other. This case will usually occur when a residual of a distribution is needed. For example, an estimate of the number of persons in the United States who are 14 years of age and are not enrolled in school can be obtained by subtracting the estimate of the number enrolled as shown in table 110 from the sample estimate of the total number 14 years of age. Tables R and S can be used directly for a difference of this type, with the difference considered as a sample estimate.
3. The standard error of any other type of difference will be approximately the square root of the sum of the squares of the standard error of each estimate considered separately. This formula will represent the actual standard error quite accurately for the difference between estimates of the same characteristic in two different areas, or for the difference between separate and uncorrelated characteristics in the same area. If, however, there is a high positive correlation between the two characteristics, the formula will overestimate the true standard error.
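The rule for the third type of difference, and the simpler rule for the first type, can be sketched in a few lines of Python. This is only an illustration: the function name and the standard errors of 300 and 400 are hypothetical and are not taken from tables R or S.

```python
import math


def se_of_difference(se_a: float, se_b: float = 0.0) -> float:
    """Approximate standard error of the difference between two estimates.

    Square root of the sum of the squared standard errors, which assumes the
    two estimates are uncorrelated. With a high positive correlation this
    overstates the true standard error.
    Passing se_b = 0 covers the case of a sample figure compared with a
    complete count, where the result is simply se_a.
    """
    return math.hypot(se_a, se_b)


# Hypothetical standard errors, chosen only to make the arithmetic easy.
se_two_samples = se_of_difference(300.0, 400.0)   # sqrt(300^2 + 400^2)
se_vs_complete_count = se_of_difference(300.0)    # complete count has no sampling error
```

A difference of the second type (a residual of a distribution) needs no such calculation, since tables R and S are read directly with the difference treated as a sample estimate.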
Some of the tables present estimates of medians (e. g., median years of school completed, median income) as well as the corresponding distributions. The sampling variability of estimates of medians depends on the distributions upon which the medians are based (see footnote 14).
It is possible to make an improved estimate of an absolute number (improved in the sense that the standard error is smaller) whenever the class in question forms a part of a larger group for which both a sample estimate and a complete count are available. This alternative estimate is particularly useful when the characteristic being estimated is a substantial part of the larger group; when the proportion is small, the improvement will be relatively minor. The improved estimate (usually referred to as a "ratio estimate") may be obtained by multiplying a percentage based on sample data by the figure which represents the complete count of the base of the percentage.
The effect of using ratio estimates of this type is, in general, to reduce the relative sampling variability from that shown for an estimate of a given size in table R to that shown for the corresponding percentage in table S. Estimates of these types are not being published by the Bureau of the Census because of the much higher cost necessary for their preparation than for the estimates derived by multiplying the sample results by five.
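As a rough illustration of the two kinds of estimate discussed above, the sketch below compares the published-style estimate (sample result multiplied by five) with a ratio estimate (sample percentage applied to a complete count). The function names and all of the counts are hypothetical; they are chosen only to show the arithmetic.

```python
def inflation_estimate(sample_count: int, factor: int = 5) -> int:
    """Estimate from a 20-percent sample: the sample result multiplied by five."""
    return sample_count * factor


def ratio_estimate(sample_class: int, sample_group: int, complete_count_group: int) -> float:
    """Ratio estimate: sample proportion of the class within the larger group,
    multiplied by the complete count of that larger group."""
    if sample_group <= 0:
        raise ValueError("sample_group must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return sample_class * complete_count_group / sample_group


# Hypothetical figures: 1,200 sample persons in the class, 4,000 sample persons
# in the larger group, and a complete count of 20,500 for the larger group.
simple = inflation_estimate(1200)              # 1200 * 5
improved = ratio_estimate(1200, 4000, 20500)   # 30 percent of 20,500
```

In this made-up case the sample represents the larger group as 20,000 persons while the complete count is 20,500, so the ratio estimate is adjusted upward in proportion.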
14 The standard error of a median based on sample data may be estimated as follows: If the estimated total number reporting the characteristic is N, compute the number N/2 - √N. Cumulate the frequencies in the table until the class interval which contains this number is located. By linear interpolation, obtain the value below which N/2 - √N cases lie. In a similar manner, obtain the value below which N/2 + √N cases lie. If information on the characteristic had been obtained from the total population, the chances are about 2 out of 3 that the median would lie between these two values. The chances will be about 19 out of 20 that the median will be in the interval computed similarly but using N/2 ± 2√N, and about 99 in 100 that it will be in the interval obtained by using N/2 ± 2.5√N.
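The procedure in footnote 14 can be sketched as follows. This is a minimal illustration, assuming every class interval has finite lower and upper bounds and that the frequencies are the estimated (inflated) numbers shown in a table. The function names and the four-class distribution are hypothetical.

```python
import math
from typing import Sequence, Tuple


def value_at_cumulative(target: float,
                        bounds: Sequence[Tuple[float, float]],
                        freqs: Sequence[float]) -> float:
    """Value below which `target` cases lie, by linear interpolation
    within the class interval that contains the target."""
    cumulated = 0.0
    for (lower, upper), freq in zip(bounds, freqs):
        if freq > 0 and cumulated + freq >= target:
            return lower + (target - cumulated) / freq * (upper - lower)
        cumulated += freq
    raise ValueError("target exceeds the total frequency")


def median_interval(bounds: Sequence[Tuple[float, float]],
                    freqs: Sequence[float],
                    k: float = 1.0) -> Tuple[float, float]:
    """Interval for the median using N/2 - k*sqrt(N) and N/2 + k*sqrt(N).

    k = 1 gives chances of about 2 in 3, k = 2 about 19 in 20,
    and k = 2.5 about 99 in 100, as stated in the footnote.
    """
    n = sum(freqs)
    spread = k * math.sqrt(n)
    low = value_at_cumulative(n / 2 - spread, bounds, freqs)
    high = value_at_cumulative(n / 2 + spread, bounds, freqs)
    return low, high


# Hypothetical distribution: four classes of width 10, 2,500 persons each (N = 10,000).
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
counts = [2500, 2500, 2500, 2500]
low, high = median_interval(classes, counts)         # uses N/2 -/+ sqrt(N)
low95, high95 = median_interval(classes, counts, 2)  # uses N/2 -/+ 2*sqrt(N)
```

With these made-up counts, N/2 - √N is 4,900 and N/2 + √N is 5,100, so the interpolation falls in the second and third classes respectively. An open-ended class (for example, a top income class with no upper limit) would need separate handling that this sketch does not attempt.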
Table R.-STANDARD ERROR OF ESTIMATED NUMBER FROM 20-PERCENT SAMPLE
1 An area is the smallest complete geographic area to which the estimate under consideration pertains. Thus the area may be the United States, a region, division, State, city, standard metropolitan area, urbanized area, or the urban or rural portion of the United States. The rural-farm or rural-nonfarm population, the nonwhite population, etc., do not represent a complete area.
Table S.-STANDARD ERROR OF ESTIMATED PERCENTAGE FROM 20-PERCENT SAMPLE
Estimated percentage (row labels only; table body not recovered): 2 or 98; 5 or 95; 10 or 90; 25 or 75.
Table 44 Revised.-YEARS OF SCHOOL COMPLETED BY PERSONS 25 YEARS OLD AND OVER, BY COLOR AND BY SEX, FOR THE UNITED STATES, URBAN AND RURAL, 1950, AND FOR THE UNITED STATES, 1940 [Asterisk (*) denotes statistics based on 20-percent sample. For totals of persons 25 years old and over from complete count for 1950, see table 38. Percent not shown where less than 0.1 or where base is less than 500]