Post-Enumeration Survey enumerators were provided with transcriptions of the original data so that after independently doing a reinterview, they reconciled discrepancies between reinterview responses and original responses.

In brief, in this phase of the Post-Enumeration Survey, results obtained by an improved method of interview were expected to provide estimates of bias in census enumeration. Although there were some exceptions, resulting estimates of net error tended to be quite small, even in some situations where other evidence indicated it was not small.

1950 record checks.--Record checks conducted as part of the 1950 evaluation program included comparisons of 1950 census data with data on birth certificates, records of the 1920 census, income tax returns, social security records, alien and naturalization records of the Immigration and Naturalization Service, and records of the Veterans Administration. In general, however, with the procedures followed it was possible to locate check data for only about 50 to 80 percent of the persons in the samples investigated. Provision was not made for field work to identify and reconcile unmatched cases, and to a considerable extent the results were inconclusive.

Scope of the 1960 Content Error Studies

Five studies were included in Project E. Two were reenumerative studies and three were record checks. One of the reenumerative studies was directed primarily toward estimating the error in the statistics of demographic characteristics, and the other, the error in the statistics of housing characteristics. These studies also yielded some information on gross differences, or simple response variance.

Reenumerative Studies of Content Error. --The first reenumerative study of content error employed intensive interviews to measure error in population characteristics. It had some features in common with the 1950 Post-Enumeration Survey. Intensive interviews were conducted at 5,000 households which were in the 25-percent sample in the 1960 census. Specially trained enumerators probed intensively for the best possible answers concerning selected population characteristics. Most of the characteristics chosen for study were of a type that would not change or would change very little with the passage of a few months of time. Also, specified persons were designated to be respondents.

The first phase, covering about 1,500 households, was conducted in July 1960, and for this part of the study enumerators were not given access to the original census schedules. Data collected in the intensive interviews were then matched in the office with the data collected in the original census enumeration, and a review was made to determine which cases had sufficient discrepancies to be sent back to the field for reconciliation.

The second phase of the study covered a different sample of 3,500 households and took place in October 1960. For one-half of the sample, enumerators were given the original census responses and were instructed that any differences between the responses they obtained and the census responses were to be reconciled on the same visit after the independently conducted interview was finished. For the other half, independent, unreconciled interviews were conducted. For both, office examination of the data and careful editing procedures were employed to evaluate reinterview responses. Net differences between original responses and reinterview responses provide estimates of content bias with respect to population characteristics.
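The distinction between net and gross differences can be illustrated with a small sketch. The function name, the category labels, and the matched pairs below are invented for illustration and are not data from the study; the sketch only shows how the two rates are computed for a single category once census and reinterview responses have been matched person by person.

```python
def difference_rates(pairs, category):
    """Return (net, gross) difference rates for one response category.

    pairs: list of (census_response, reinterview_response) tuples,
           one tuple per matched person.
    net:   (census count in category - reinterview count in category) / n
    gross: (persons placed in the category by exactly one source) / n
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("no matched pairs")
    census_only = sum(1 for c, r in pairs if c == category and r != category)
    reinterview_only = sum(1 for c, r in pairs if c != category and r == category)
    net = (census_only - reinterview_only) / n
    gross = (census_only + reinterview_only) / n
    return net, gross


# Hypothetical matched responses (illustrative only).
pairs = (
    [("native", "native")] * 6
    + [("native", "foreign")] * 2
    + [("foreign", "native")] * 1
    + [("foreign", "foreign")] * 1
)

net, gross = difference_rates(pairs, "native")
print(f"net difference rate:   {net:.2f}")
print(f"gross difference rate: {gross:.2f}")
```

Offsetting errors cancel in the net rate but not in the gross rate, which is why a small net difference does not by itself show that individual responses were consistent.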

The second reenumerative study of content error concerned housing characteristics. An intensive interview by specially selected and trained enumerators was carried out for 10,000 housing units, about half of which were sample units in the census. Information was collected on tenure and rent, plumbing, and costs of water and fuel. Detailed questions were asked to attempt to determine the best answers and the factual basis for such answers.

At this writing, a considerable amount of tabulation has been completed for both the population and the housing content error studies.

CPS-Census match. --One study of content error was based on a match of individual returns obtained by the Current Population Survey (CPS) and the 1960 Census of Population. The CPS is the primary source of current data on the labor force and of periodic reports on other demographic data. Most information obtained by the CPS is generally regarded as being of higher quality than that obtained in the census because it is obtained by a permanent staff of highly trained and closely supervised interviewers and because of highly developed survey control methods. Therefore, CPS data were used to provide a standard for measuring the quality of census data on the labor force and other population characteristics.

The CPS is conducted monthly with a partially rotating sample of households; each month one-fourth of the households are dropped from the sample and replaced. It yields a sample of 35,000 interviewed households in any one month. Those households which were in both the 25-percent sample of the population census and the March or April 1960 CPS were included in the study.

The first step in the CPS-Census Match was to examine the census stage I enumeration books to determine whether the CPS households were in the census sample. Those CPS households identified as being in the 25-percent census sample were then matched to the census stage II returns, and procedures were set up to transcribe and code the census information and CPS identification items. Data were then tabulated for identical persons as reported by the census and by the CPS enumerators.

Because the 25-percent sample of the population census was used, this percentage set the upper limit to the percentage of CPS cases that could be matched. Furthermore, various processing problems, timing, and coverage considerations precluded attaining this level. For the comparison of reported labor force status, the number of cases that eventually were matched amounted to 17,337 persons 14 years and over, or 92.9 percent of the possible match universe and 23.2 percent of the CPS panel as compared with the theoretical 25 percent.
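The three figures quoted above can be checked against one another with simple arithmetic. The sketch below uses only the numbers in the text; the variable names are illustrative, and because the published percentages are rounded, the implied totals are approximate.

```python
# Figures quoted in the text.
matched_persons = 17_337
share_of_match_universe = 0.929   # 92.9 percent
share_of_cps_panel = 0.232        # 23.2 percent

# Totals implied by the rounded percentages (approximate).
implied_match_universe = matched_persons / share_of_match_universe
implied_cps_panel = matched_persons / share_of_cps_panel

# The possible match universe as a share of the CPS panel; this should
# sit close to the theoretical 25 percent set by the census sample.
universe_share_of_panel = implied_match_universe / implied_cps_panel

print(f"implied match universe: {implied_match_universe:,.0f} persons")
print(f"implied CPS panel:      {implied_cps_panel:,.0f} persons")
print(f"universe / panel:       {universe_share_of_panel:.1%}")
```

The ratio comes out very near one-fourth, so the shortfall from 25 percent to 23.2 percent is accounted for by the cases in the match universe that could not be matched.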

At this writing, a considerable amount of tabulation has been completed.

Employer Record Check. --This study was designed to obtain information on the comparability of census reports made by respondents concerning their occupation and industry with corresponding information obtained from their employers.

Occupations as reported by employees were matched with occupations as reported by their employers for a sample of employees reported in the census. In addition, the classification of the industry of these employers was identified in records of the Bureau of Old Age and Survivors Insurance, and comparisons were made of these classifications with industry as reported in the population census.

At this writing, some estimates of content error in occupation and industry have been made and are being prepared for publication. The remaining processing for this study is at an advanced stage.

Internal Revenue Service Record Check. --This study was planned to yield an estimate of content error with respect to income. A sample of Internal Revenue Service returns was selected on a probability basis. Only about one-fourth of these, or about 2,500, were to be included in the study because census income data appear only for the stage II (sample) census households. Pertinent information was to be transferred from the census returns to magnetic tape and a similar operation was to be separately performed for the Internal Revenue Service returns. The electronic computer was then to handle the matching operations.

At this writing, this study is in the initial processing stage.

PROJECT F, STUDY OF PROCESSING ERRORS

The word "processing" as used here includes all handling of data beyond the initial recording of a response. The two-stage method of census enumeration required copying or transcription at more than one stage. Responses were then edited and coded. The census documents were filmed and the data transferred to computer tapes.

Three areas of study, described below, were defined as having special importance for the measurement of processing error.

Field Transcription Error

In stage I of the two-stage census, there were two types of transcription: (1) the transfer of data from the Advance Census Report, which had been filled in by the householder, to the 100-percent FOSDIC schedule, and (2) the copying of data from the 100-percent FOSDIC schedule to the sample FOSDIC schedule for sample households. In stage II of the enumeration, a key element in the procedure was the transfer of the sample data from the Household Questionnaires, which had been filled in by the householders, to the sample FOSDIC schedules.

At this writing a study is being planned to review a sample of the Advance Census Reports, Household Questionnaires, and 100-percent and sample FOSDIC schedules. It is planned to make estimates of the extent to which transcription errors contributed to the net and gross errors and also to the correlated response errors.

Coding Error

As a part of Project A, Measurement of Response Variability, estimates are to be derived of the contribution to the correlated response variability arising from coding variability during the general coding and the industry and occupation coding of the sample data.

A separate coding-error study has been conducted, largely as a by-product of the quality control scheme used in the 1960 census, with a sample of 1 in 40 households from the 25-percent sample for whom occupation and industry data were collected. Three different coding clerks with approximately the same training and coding experience all coded independently from the census schedule, but only one, the "census coder," entered his code on the census schedule. The coded results were then matched.

This Project F study has the additional objectives of providing estimates of coding bias and the simple response variance arising from coding. This involved examining the three sets of codes to attempt to establish the correct codes and to measure the extent to which the codes assigned by the census coders differed from the correct codes.
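A comparison of three independent codings can be sketched as follows. The text does not say how the correct code was established; the two-out-of-three majority rule used here is an assumption made purely for illustration, and the code values and function names are hypothetical.

```python
from collections import Counter


def majority_code(codes):
    """Return the code that at least two of the three coders agree on, else None.

    ASSUMPTION: a majority rule stands in for the "correct" code; the source
    does not describe the actual procedure used to establish it.
    """
    code, count = Counter(codes).most_common(1)[0]
    return code if count >= 2 else None


def census_coder_error_rate(records):
    """records: list of (census_code, check_code_1, check_code_2) tuples.

    Returns the share of resolvable cases in which the census coder's
    entry differs from the majority code, or None if no case is resolvable.
    """
    resolved = 0
    errors = 0
    for census, check_1, check_2 in records:
        reference = majority_code([census, check_1, check_2])
        if reference is None:
            continue  # three-way disagreement: no reference under this rule
        resolved += 1
        if census != reference:
            errors += 1
    return errors / resolved if resolved else None


# Hypothetical occupation codes (illustrative only).
records = [
    ("280", "280", "280"),  # all three agree
    ("280", "290", "290"),  # census coder outvoted
    ("310", "310", "305"),  # census coder in the majority
    ("400", "401", "402"),  # no majority, excluded
]

rate = census_coder_error_rate(records)
print(f"census coder error rate among resolved cases: {rate:.3f}")
```

Cases with three different codes are set aside rather than counted as errors, since this rule gives no reference code for them.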

At this writing, a report1 has been written and a considerable number of tabulations have been completed relating to coding errors in occupation and industry. The tabulations are available on coding errors in "general" coding, i.e., all coding except industry and occupation coding.

Editing and Allocation

In the 1960 censuses, the microfilm-FOSDIC-computer complex performed jobs formerly done by editing and coding clerks and punchcard equipment. The close monitoring of this aspect of processing through a comprehensive program of quality control has been described (see chapter 8, Processing the Data). The high reliability of the electronic equipment assured that far fewer errors were made in accomplishing the specified steps in editing and tabulation than were produced by methods used in earlier censuses. Editing of the data was done uniformly, in accordance with rules and instructions supplied to the computer.

At this writing a study is being planned to evaluate the editing rules, particularly the rules for handling missing data, principally by comparison with the results obtained in the reenumerative surveys.

PROJECT G, ANALYTICAL STUDIES

Analytical studies for the evaluation of the census data are to include demographic and actuarial analysis and various comparisons of the census results with data available from noncensus sources. The analytical studies were conceived for the general purpose of overall evaluation of the census data and also to contribute to the understanding of the strong points and limitations of the various measurements of coverage and content error made through other studies in the Evaluation and Research Program.

Some analysis of coverage error has been completed.2 Other analyses are now being worked on as parts of 1960 census monographs.

PROJECT H, POST OFFICE COVERAGE IMPROVEMENT STUDY

Project H involved the use of Post Office resources and personnel to identify households erroneously omitted from the census enumeration within a sample of areas. In addition, this project was directed toward study of the feasibility of carrying out this type of field work by a decentralized census-staff operation. Each of the District Offices in the sample conducted the study in its district with only written instructions, and the post offices in the study also operated on the basis of written instructions, without special training or supervision from Washington personnel.

Within each of the 15 postal regions of the continental United States, a sample area containing from 10,000 to 15,000 housing units was selected. Enumeration districts served wholly or in part by the post offices in the sample areas were identified. Within these ED's, the census enumerators, during the course of the census enumeration, filled out printed address cards, giving the name and address for each enumerated household. The cards, except for a small sample which was withheld, were turned over to the local post offices. There they were sorted like mail to be delivered by carrier route, and given to the postal carriers. The postal carriers were asked to make up new cards for any households on their routes that were not represented. Personnel of the local Census District Offices matched the new cards supplied by the postal carriers against the census schedules. Households that could not be located on the census schedules were visited and data equivalent to stage I information were collected from them for the dual purpose of checking on possible reasons for the enumerator's omission and for collecting information for analysis of the characteristics of missed units.
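The card-matching step amounts to finding carrier-supplied addresses that have no counterpart among the enumerators' cards. The sketch below is a modern illustration of that idea, not a description of the clerical procedure actually used; the normalization rule, the function names, and the addresses are all hypothetical.

```python
def normalize(address):
    """Crude address key: upper-case, drop periods, collapse whitespace.

    ASSUMPTION: real matching was done by clerks against census schedules;
    this key is only a stand-in for their judgment.
    """
    return " ".join(address.upper().replace(".", "").split())


def unmatched_carrier_cards(census_cards, carrier_cards):
    """Return carrier cards whose address does not appear on any census card."""
    census_keys = {normalize(a) for a in census_cards}
    return [a for a in carrier_cards if normalize(a) not in census_keys]


# Hypothetical addresses (illustrative only).
census_cards = ["12 Elm St.", "14 Elm St", "3 Oak Ave"]
carrier_cards = ["12 ELM ST", "16 Elm St.", "3 oak ave."]

to_visit = unmatched_carrier_cards(census_cards, carrier_cards)
print(to_visit)
```

Addresses left over after matching correspond to the households that were visited to collect stage I information.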

Duplication of enumeration or erroneous listing of housing units as separate could result in overcounts as well as undercounts, and provision was made in the study for field investigation of households where overcounts might have occurred.

At this writing, preliminary tabulations designed to throw light on the effectiveness of this procedure have been completed.

1 Fasteau, Herman H.; J. Jack Ingram; and Ruth H. Mills, "Study of the Reliability of Coding of Census Returns," in: American Statistical Association, Proceedings of the Social Statistics Section, 1962, Washington, D.C., 1963, pp. 104-115.

2 Akers, Donald S., "Estimating Net Census Undercount in 1960 Using Analytical Techniques," 1962, 8, 5, 6 pp., processed (presented at the annual meeting of the Population Association of America, May 1962).

A PRELIMINARY EVALUATION OF THE 1960 CENSUS OF POPULATION

Coverage

Considering the evidence now available (May 1963), it appears that a minimum reasonable estimate of the net underenumeration of the population in 1960 is in the range of 1.7 to 2.0 percent of the total as compared to the "minimum reasonable" estimate of 2.4 percent for the 1950 census. This amounts to a net undercount in 1960 of 3,000,000 to 3,500,000 persons. Some modification of these estimates may be made when further evaluation results become available.

An analysis of the 1960 and 1950 counts of the total resident population and of estimates of the components of population change for the period suggests that coverage in the 1960 census was somewhat better than in the 1950 census:

[Table not reproduced in source.]

If this estimated increase in coverage is subtracted from the 1950 net underenumeration as found by the 1950 post-enumeration survey (2,091,000 - 277,000 = 1,814,000), then the net underenumeration of 1.4 percent in 1950 would be reduced to 1.0 percent in 1960.

However, other evidence regarding the 1950 census suggested that the post-enumeration survey had failed to indicate all the underenumeration, and the Bureau of the Census accepted the figure of 2.4 percent or about 3,700,000 as the minimum reasonable estimate of net underenumeration. If the estimated increase of coverage of 277,000 is subtracted from this figure, then a 1960 net underenumeration of 1.9 percent of the 1960 population, or about 3,400,000 persons, is indicated as the minimum reasonable estimate.
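The two carry-forward calculations above can be reproduced in a few lines. The 1960 resident population count used as the denominator (179,323,175) is not given in this section and is supplied here as an assumption; the other figures come from the text, and the names are illustrative.

```python
# ASSUMPTION: published 1960 resident population count, not stated in this section.
CENSUS_1960_COUNT = 179_323_175

COVERAGE_GAIN = 277_000            # estimated improvement in coverage, 1950 to 1960
PES_1950_UNDERCOUNT = 2_091_000    # 1950 post-enumeration survey estimate
MIN_REASONABLE_1950 = 3_700_000    # Bureau's minimum reasonable estimate for 1950


def carry_forward(undercount_1950):
    """Subtract the coverage gain and express the result as a percent of the 1960 count."""
    undercount_1960 = undercount_1950 - COVERAGE_GAIN
    percent_1960 = 100 * undercount_1960 / CENSUS_1960_COUNT
    return undercount_1960, percent_1960


for label, base in [("PES-based", PES_1950_UNDERCOUNT),
                    ("minimum reasonable", MIN_REASONABLE_1950)]:
    persons, percent = carry_forward(base)
    print(f"{label}: {persons:,} persons, {percent:.1f} percent")
```

Under this assumed denominator the results round to the 1.0 and 1.9 percent quoted in the text, with the second figure corresponding to about 3,400,000 persons.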

Errors in the intercensal estimates of births, deaths, and military movement are not likely to be of sufficient magnitude to affect the general picture regarding the relative accuracy of the 1960 and 1950 counts. Estimates of net movement of citizens (exclusive of those moving between Puerto Rico and the Mainland) are subject to a very wide range of error. Two approaches to estimating this movement, neither of which can be regarded as highly reliable, yield estimates varying from a net in-movement of about 280,000 to a net out-movement of about 170,000. In the above estimate this net movement was taken as zero. There is another factor in the 1960 census count which might affect the picture: 776,655 persons were included through computer imputation of population to housing units for which there was evidence of occupancy though there were no FOSDIC-readable persons on the population schedule. There is some evidence that the computer may have "overimputed" persons. The amount of this overcount cannot be accurately determined, but it is estimated to be from 100,000 to 350,000.

A comparison of findings from the 1950 post-enumeration survey and of preliminary results from reenumerative surveys (Project D) conducted as part of the 1960 census evaluation program shows the following estimates of coverage errors, as percents of census total population:

[Table not reproduced in source.]

These estimates are not entirely comparable for 1960 and 1950. As a consequence of weakness detected in the 1950 post-enumeration survey procedures, steps were taken to strengthen corresponding procedures in 1960. Therefore, for 1960, there are higher, and perhaps more reasonable, estimates of numbers of persons missed or erroneously included. These higher estimates may also reflect differences in the coverage of the censuses.

A composite of demographic analytic methods1 gives some estimates of net undercounts for 1960 by sex, age, and color, in percentages:

[Table not reproduced in source.]

The table below shows a comparison of estimates for 1960 and 1950 for undercounts of children under one year of age and under five years, for whites and nonwhites. The results in both tables reflect the accuracy of age, color, and sex reporting as well as completeness of coverage. The figures below are based on estimates of survivors of births, using birth registration data and estimates of underregistration made by the National Office of Vital Statistics. Each of the age and color groups shown indicates considerable improvement in 1960.

[Table not reproduced in source.]

Further evaluation of the 1960 census coverage (as part of Project ) will be possible as final results become available from the entire series of studies described earlier in this chapter.

Content

Nonresponse rates.--The table below provides a comparison of nonresponse rates in the 1960 and 1950 censuses for a few characteristics. Most of the nonresponse rates compared are higher in 1960 than in 1950.

In 1950, after an enumerator had made reasonable but unsuccessful efforts to obtain census information from a responsible member of the household, he was instructed to make inquiries from neighbors. This procedure was followed in 1960 only for the 100-percent items for which it was presumed that neighbors might provide reasonably acceptable information. In 1960, for the sample items, the procedure involving self-enumeration and mail-in may have encouraged some failure to follow up on nonresponse. The enumerators were instructed to follow up as needed, but to obtain population information only from a responsible member of the household.

The 1960 procedures were based on the assumption that allowing information to be obtained from neighbors and other unqualified respondents encouraged poor standards and loose work, and that with a reasonably low nonresponse rate, mechanical imputation yielded more reliable data than inquiry of neighbors. In both 1950 and 1960 there may have been some informal imputation by enumerators. For some items (such as place of birth, mother tongue, occupation, place of work, and means of transportation), nonresponses were not imputed but were tabulated as NA's.

1 Estimates for ages under 25 are based on survivors of births; those for ages 25 to 64 on Coale's iterative method (see bibliography at end of this chapter); those for ages 65 and over on an iterative method using mortality data.
