
Appendix B. Source and Reliability of the Estimates

SOURCE OF DATA

Most of the estimates in this report are based on data collected in March 1986 and 1987 from the Current Population Survey (CPS) of the Bureau of the Census. Some estimates are based on data obtained from the CPS in earlier years and from earlier decennial censuses. The monthly CPS deals mainly with labor force data for the civilian noninstitutional population. Questions relating to labor force participation are asked about each member in every sample household. In addition, questions are asked each March about educational attainment. In order to obtain more reliable data for the Hispanic population, the March CPS sample was enlarged to include all households from the previous November sample which contained at least one person of Hispanic origin. For this report, persons in the Armed Forces living off post or with their families on post are included.

Current Population Survey (CPS). The present CPS sample was selected from the 1980 census files with coverage in all 50 States and the District of Columbia. The sample is continually updated to reflect new construction. The current CPS sample is located in 729 areas comprising 1,973 counties, independent cities, and minor civil divisions in the Nation. In this sample, approximately 61,500 occupied households were eligible for interview. Of this number, about 3,500 occupied units were visited but interviews were not obtained because the occupants were not found at home after repeated calls or were unavailable for some other reason.

The table below provides a description of some aspects of the CPS sample designs in use during the referenced data collection periods.

Description of the March Current Population Survey

[Table body not recoverable.]

Time period

1986 to 1987
1985
1982 to 1984
1980 to 1981
1977 to 1979
1973 to 1976
1972
1967 to 1971
1963 to 1966
1960 to 1962
1957 to 1959
1954 to 1956
1947 to 1953

1 Does not include supplemental Hispanic households.

2 Three rotation groups were located in 629 areas and five rotation groups in 729 areas.

3 Three sample areas were added in 1960 to represent Alaska and Hawaii after statehood.

CPS Estimation Procedure. The estimation procedure used in this survey involved the inflation of the weighted sample results to independent estimates of the total civilian noninstitutional population of the United States by age, race, sex, and Hispanic/non-Hispanic categories. These independent estimates are based on statistics from the decennial censuses of population; statistics on births, deaths, immigration and emigration; and statistics on the strength of the Armed Forces. The independent population estimates used to obtain data for 1980 and later are based on the 1980 decennial census. In earlier reports in this series, data for 1972 through 1979 were obtained using independent population estimates based on the 1970 decennial census. The estimation procedure for the data from the March supplement involved a further adjustment so that husband and wife of a household received the same weight.

The estimates in this report for the survey years 1985 to 1987 are also based on revised survey weighting procedures for persons of Hispanic origin. In previous years, the estimation procedures used in this survey involved the inflation of weighted sample results to independent estimates of the noninstitutional population by age, sex, and race. There was, therefore, no specific control of the survey estimates for the Hispanic origin population. During the last several years, the Bureau of the Census has developed independent population controls for the Hispanic population by sex and detailed age groups and has adopted revised weighting procedures to incorporate these new controls. It should be noted that the independent population estimates include some, but not all, illegal immigrants.
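The idea of inflating weighted sample results to independent population controls can be sketched as a single ratio adjustment. This is a simplified illustration only, not the Bureau's actual multi-stage procedure: the function name `ratio_adjust`, the cell labels, and all numbers below are hypothetical.

```python
def ratio_adjust(weights, cells, controls):
    """Scale each weight so that the weighted total in every cell
    equals that cell's independent control total."""
    # Sum the current weights within each cell.
    totals = {}
    for w, c in zip(weights, cells):
        totals[c] = totals.get(c, 0.0) + w
    # Multiply each weight by (control total / weighted sample total).
    return [w * controls[c] / totals[c] for w, c in zip(weights, cells)]

# Hypothetical toy data: four sample persons in two age-sex-race cells.
weights = [1000.0, 1200.0, 900.0, 1100.0]
cells = ["cell_A", "cell_A", "cell_B", "cell_B"]
controls = {"cell_A": 2500.0, "cell_B": 1800.0}  # made-up control totals

adjusted = ratio_adjust(weights, cells, controls)
print(adjusted)
```

After the adjustment, the weights in each cell sum to the control total for that cell, which is why the survey estimates agree with the independent estimates for the controlled categories.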

RELIABILITY OF THE ESTIMATES

Since the CPS estimates were based on a sample, they may differ somewhat from the figures that would have been obtained if a complete census had been taken using the same questionnaires, instructions, and enumerators. There are two types of errors possible in an estimate based on a sample survey: sampling and nonsampling. The accuracy of a survey result depends on both types of errors, but the full extent of the nonsampling error is unknown. Consequently, particular care should be exercised in the interpretation of figures based on a relatively small number of cases or on small differences between estimates. The standard errors provided for the CPS estimates primarily indicate the magnitude of the sampling error. They also partially measure the effect of some nonsampling errors in responses and enumeration, but do not measure any systematic biases in the data. (Bias is the difference, averaged over all possible samples, between the sample estimates and the desired value.)

Nonsampling Variability. Nonsampling errors can be attributed to many sources, e.g., inability to obtain information about all cases in the sample, definitional difficulties, differences in the interpretation of questions, inability or unwillingness on the part of respondents to provide correct information, inability to recall information, errors made in data collection such as in recording or coding the data, errors made in processing the data, errors made in estimating values for missing data, and failure to represent all units with the sample (undercoverage).

Undercoverage in the CPS results from missed housing units and missed persons within sample households. Overall undercoverage, as compared to the level of the 1980 Decennial Census, is about 7 percent. It is known that CPS undercoverage varies with age, sex, and race. Generally, undercoverage is larger for males than for females and larger for Blacks and other races combined than for Whites. Ratio estimation to independent age-sex-race-Hispanic population controls, as described previously, partially corrects for the bias due to survey undercoverage. However, biases exist in the estimates to the extent that missed persons in missed households or missed persons in interviewed households have different characteristics from those of interviewed persons in the same age-sex-race-Hispanic group. Further, the independent population controls used have not been adjusted for undercoverage in the 1980 census.

For additional information on nonsampling error including the possible impact on CPS data when known, refer to Statistical Policy Working Paper 3, An Error Profile: Employment as Measured by the Current Population Survey, Office of Federal Statistical Policy and Standards, U.S. Department of Commerce, 1978 and Technical Paper 40, The Current Population Survey: Design and Methodology, Bureau of the Census, U.S. Department of Commerce.

Sampling Variability. The standard errors given in the following tables are primarily measures of sampling variability, that is, of the variations that occurred by chance because a sample rather than the entire population was surveyed. The sample estimate and its standard error enable one to construct a confidence interval, a range that would include the average results of all possible samples with a known probability. For example, if all possible samples were selected, each of these being surveyed under essentially the same general conditions and using the same sample design, and if an estimate and its standard error were calculated from each sample, then approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above the estimate would include the average result of all possible samples.

The average estimate derived from all possible samples is or is not contained in any particular computed interval. However, for a particular sample, one can say with specified confidence that the average estimate derived from all possible samples is included in the confidence interval.

Standard errors may also be used to perform hypothesis testing, a procedure for distinguishing between population parameters using sample estimates. The most common type of hypothesis appearing in this report is that the population parameters are different. An example of this would be comparing the proportion of young male college graduates to young female college graduates. Tests may be performed at various levels of significance, where a level of significance is the probability of concluding that the parameters are different when, in fact, they are identical.


All statements of comparison in the text have passed a hypothesis test at the 0.10 level of significance or better. This means that, for all differences cited in the text, the estimated difference between characteristics is greater than 1.6 times the standard error of the difference.
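The decision rule stated above (a difference must exceed 1.6 times its standard error) can be sketched as a small check. The function name and the example figures are hypothetical; the standard error of the difference is taken as a given input here because its computation is not described in this passage.

```python
def differs_at_10_percent_level(estimate_a, estimate_b, se_difference, multiplier=1.6):
    """Return True when the estimated difference is greater than
    `multiplier` times the standard error of the difference."""
    return abs(estimate_a - estimate_b) > multiplier * se_difference

# Hypothetical percentages and standard error, for illustration only.
print(differs_at_10_percent_level(22.0, 21.0, 0.5))  # 1.0 > 0.8, so True
print(differs_at_10_percent_level(22.0, 21.5, 0.5))  # 0.5 > 0.8 fails, so False
```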

Comparability of Data. Caution should be used when comparing estimates for 1980 and later, which reflect 1980 census-based population controls, to those for 1972 through 1979, which reflect 1970 census-based population controls. This change in population controls had relatively little impact on summary measures such as means, medians, and percent distributions, but did have a significant impact on levels. For example, use of 1980-based population controls results in about a 2-percent increase in the civilian noninstitutional population and in the number of families and households. Thus, estimates of levels for 1980 and later will differ from those for earlier years more than what could be attributed to actual changes in the population, and these differences could be disproportionately greater for certain subpopulation groups than for the total population.

Care must also be taken when comparing Hispanic estimates over time due to the recent change in weighting of the Hispanic population beginning in 1985. Before 1985, there were no independent population control totals for persons of Hispanic origin. See the section entitled "CPS Estimation Procedure."

Also, in using metropolitan and nonmetropolitan data, caution should be used in comparing estimates for 1977 and 1978 to each other or to any other years. Methodological and sample design changes occurred in these years resulting in relatively large differences in the metropolitan and nonmetropolitan area estimates. However, estimates for 1979 and later are comparable as are estimates for 1976 and earlier. Data on metropolitan and nonmetropolitan residence are not available for 1985.

Decennial Census of Population. The decennial census data shown in this report are not strictly comparable to the CPS data. This is due in large part to differences in interviewer training and experience and in survey processes. These differences are an additional component of error not reflected in the standard error tables. Therefore, caution should be used in comparing results between these different sources.

Note When Using Small Estimates. Summary measures (such as medians and percent distributions) are shown only when the base is 75,000 or greater. Because of the large standard errors involved, there is little chance that summary measures would reveal useful information when computed on a smaller base. Estimated numbers are shown, however, even though the relative standard errors of these numbers are larger than those for corresponding percentages. These smaller estimates are provided primarily to permit such combinations of the categories as serve each data user's needs. Also, care must be taken in the interpretation of small differences. For instance, even a small amount of nonsampling error can cause a borderline difference to appear significant or not, thus distorting a seemingly valid hypothesis test.

Standard Error Tables and Their Use. In order to derive standard errors that would be applicable to a large number of estimates and which could be prepared at a moderate cost, a number of approximations were required. Therefore, instead of providing an individual standard error for each estimate, generalized sets of standard errors are provided for various types of characteristics. As a result, the sets of standard errors provided give an indication of the order of magnitude of the standard error of an estimate rather than the precise standard error.

The figures presented in tables B-1 through B-4 are approximations to the standard errors of various estimates for persons in the United States. To obtain the approximate standard error for a specific characteristic, the appropriate standard error in tables B-1 through B-4 must be multiplied by the factor for that characteristic given in table B-5. These factors must be applied to the generalized standard errors in order to adjust for the combined effect of the sample design and the estimating procedure on the value of the characteristic. Standard errors for intermediate values not shown in the generalized tables of standard errors may be approximated by linear interpolation.
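The interpolate-then-multiply procedure described above can be sketched as follows. The two table values (85.0 and 91.7, in thousands) and the factor of 1.0 are the ones quoted in the worked illustration later in this appendix; the function name `interpolate` is an illustrative choice, not part of the report.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation between the points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Generalized standard errors (in thousands) for 4,000,000 and 5,000,000
# estimated persons, as quoted in the illustration in this appendix.
s_thousands = interpolate(4_768_000, 4_000_000, 85.0, 5_000_000, 91.7)

# Factor for the characteristic from table B-5 (1.0 in the illustration).
factor = 1.0
standard_error = factor * s_thousands * 1000

print(round(standard_error, -3))  # 90000.0
```

The interpolated value is about 90.1 thousand, which agrees with the rounded figure of 90,000 used in the illustration.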

The standard errors in tables B-1 through B-4 and the factors in table B-5 were calculated using the b parameters in table B-5. The parameters may be used directly to calculate the standard errors for estimated numbers and percentages. Methods for computation are given in the following sections.

Table B-1. Generalized Standard Errors for Estimated Numbers of Persons: Total or White (Numbers in thousands)

[Table body not recoverable.]

1 These values must be multiplied by the appropriate factor in tables B-5 and/or B-6 to obtain the standard error for a specific characteristic.

NOTE: (i) To estimate the standard errors for years prior to 1956, multiply the above standard errors by 1.4; for years 1956-66, multiply by 1.14; and for years 1967-79, multiply by 0.93.

(ii) The standard errors were calculated using the formula $\sqrt{-(b/T)x^2 + bx}$, where b = 2312 (from table B-5) and T is the total number of persons in an age group.

Standard Errors of Estimated Numbers. The approximate standard error, $s_x$, of an estimated number shown in this report can be obtained in two ways. It may be obtained by use of the formula

$$s_x = f_1 f_2 s \qquad (1)$$

where $f_1$ and $f_2$ are the appropriate factors from tables B-5 and B-6, and $s$ is the standard error of the estimate obtained by interpolation from table B-1 or B-2. Alternatively, it may be approximated by the formula

$$s_x = f_2 \sqrt{-(b/T)x^2 + bx} \qquad (2)$$

Here x is the size of the estimate, T is the total number of persons in a specific age group, b is the parameter in table B-5 associated with the particular characteristic, and $f_2$ is the appropriate factor from table B-6. If T is not known, for Total or White use 100,000,000; for Black and Hispanic use 10,000,000.
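A minimal sketch of formula (2), including the fallback values of T given above, might look like this. The function name `se_number`, the `group` keyword, and the dictionary keys are hypothetical; the b parameter and the example figures are those quoted elsewhere in this appendix.

```python
import math

# Fallback totals stated in the text for use when T is not known.
DEFAULT_T = {"total_or_white": 100_000_000, "black_or_hispanic": 10_000_000}

def se_number(x, b, T=None, f2=1.0, group="total_or_white"):
    """Approximate standard error of an estimated number x using
    s_x = f2 * sqrt(-(b/T) * x**2 + b*x)."""
    if T is None:
        T = DEFAULT_T[group]  # fall back to the stated default
    return f2 * math.sqrt(-(b / T) * x**2 + b * x)

# Figures from the illustration: x = 4,768,000, T = 21,636,000, b = 2312.
se_known_T = se_number(4_768_000, 2312, T=21_636_000)
print(round(se_known_T, -3))  # 93000.0

# Same estimate when T is treated as unknown (uses 100,000,000).
se_default_T = se_number(4_768_000, 2312)
print(se_default_T)
```

The expression under the square root is nonnegative only when x does not exceed T, so the function is meant for estimates no larger than the age-group total.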

Illustration of the Computation of the Standard Error of an Estimated Number. Table 1 of this report shows that in 1987 there were 4,768,000 young adults (ages 25 to 29 years) who were college graduates and 21,636,000 total persons in that age group. Using formula (1) with $f_1$ = 1.0 from table B-5, $f_2$ = 1.0 from table B-6, and s = 90,000 from table B-1, the standard error of 4,768,000 is (1.0)(1.0)(90,000) = 90,000. The value of s (= 90,000) was obtained by linear interpolation in two directions in table B-1. The first interpolation was between 10,000,000 and 25,000,000 total persons for both 4,000,000 and 5,000,000 estimated number of persons. The value for 4,000,000 estimated persons was 85.0 and for 5,000,000 estimated persons was 91.7. The second interpolation was between these two values to get the value corresponding to 4,768,000 persons. Alternatively, using formula (2), with the appropriate b parameter of 2312 from table B-5, the approximate standard error is

$$s_x = (1.0)\sqrt{-\frac{2312}{21{,}636{,}000}(4{,}768{,}000)^2 + (2312)(4{,}768{,}000)} \approx 93{,}000$$

The 90-percent confidence interval for this estimate is from 4,619,000 to 4,917,000 (using 1.6 times the standard error). Therefore, a conclusion that the average estimate derived from all possible samples lies within a range computed in this way would be correct for roughly 90 percent of all possible samples.
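The interval quoted above can be reproduced with a short sketch. It assumes the standard error comes from the square-root formula with b = 2312, x = 4,768,000, and T = 21,636,000, and that intermediate results are rounded to the nearest thousand; the rounding convention is an assumption, as is the function name.

```python
import math

def confidence_interval_90(estimate, standard_error, multiplier=1.6):
    """Interval from 1.6 standard errors below to 1.6 above the estimate."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width

b, x, T = 2312, 4_768_000, 21_636_000
se = math.sqrt(-(b / T) * x**2 + b * x)  # square-root formula with a factor of 1.0
se_rounded = round(se, -3)               # assumed rounding to the nearest thousand

low, high = confidence_interval_90(x, se_rounded)
print(round(low, -3), round(high, -3))   # 4619000.0 4917000.0
```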

Standard Errors of Estimated Percentages. The reliability of an estimated percentage, computed using sample data for both numerator and denominator, depends upon both the size of the percentage and the size of the total upon which this percentage is based. Estimated percentages are relatively more reliable than the corresponding estimates of the numerators of the percentage, particularly if the percentages are 50 percent or more. When the numerator and denominator of the percentage are in different categories,

Table B-2. Generalized Standard Errors for Estimated Numbers of Persons: Black and Other Races (Numbers in thousands)

[Table body not recoverable.]

1 These values must be multiplied by the appropriate factor in tables B-5 and/or B-6 to obtain the standard error for a specific characteristic.

NOTE: (i) To estimate the standard errors for years prior to 1956, multiply the above standard errors by 1.4; for years 1956-66, multiply by 1.14; and for years 1967-79, multiply by 0.93.

(ii) The standard errors were calculated using the formula $\sqrt{-(b/T)x^2 + bx}$, where b = 2600 (from table B-5) and T is the total number of persons in an age group.

Table B-3. Generalized Standard Errors of Estimated Percentages: Total or White

[Table body not recoverable.]

1 These values must be multiplied by the appropriate factor in tables B-5 and/or B-6 to obtain the standard error for a specific characteristic.

NOTE: (i) To estimate the standard errors for years prior to 1956, multiply the above standard errors by 1.4; for years 1956-66, multiply by 1.14; and for years 1967-79, multiply by 0.93.

(ii) The standard errors were calculated using the formula $\sqrt{(b/x)\,p(100 - p)}$, where b = 2312 from table B-5.
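The formula in note (ii) of table B-3 can be sketched as below. It is assumed here that x is the base of the percentage (the total number of persons in the denominator) and that p is expressed in percent; the function name is hypothetical. The example reuses the 1987 figures quoted earlier in this appendix (4,768,000 college graduates among 21,636,000 persons ages 25 to 29).

```python
import math

def se_percentage(p, x, b):
    """Approximate standard error, in percentage points, of an
    estimated percentage p computed on a base of x persons."""
    return math.sqrt((b / x) * p * (100.0 - p))

base = 21_636_000                 # persons ages 25 to 29 in 1987
p = 100.0 * 4_768_000 / base      # percent who were college graduates
se_p = se_percentage(p, base, b=2312)
print(round(p, 1), se_p)
```

Because the base appears in the denominator, the standard error shrinks as the base grows, which matches the statement that reliability depends on both the size of the percentage and the size of its base.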
