
Chapter VI-4. METHODOLOGY TO BE FOLLOWED IN FINAL REVIEW


of the latter should be left to paraprofessionals whose role is discussed in the section on staffing for final review (chapter VI-1).

Essential to an effective review is the approach of resolving the problem case when it occurs and not delaying that action. Due to the tight time schedule, telephone calls rather than letters should be employed to obtain missing data or resolve questionable entries. Final review is structured as a single round of corrections with early publication of a limited amount of data on an advance basis and not two rounds of correction and publication. The scope and depth of the final review are, of course, conditioned by the staff resources available to the statistical office. Also, post-tabulation review is dependent on the quality of the processing done earlier; that is, on the number of cases examined and the extent to which errors were found and corrected.

The methodology of the review is to isolate, by a series of tests, a limited number of aggregates to check into and correct if necessary. The tests consist of (a) data relationship checks, (b) zero balance checks, (c) examination of large changes made in pretabulation edit and imputation, (d) comparisons with related data, and (e) the evaluation of general reasonableness of the data using professional judgment. In essence, these checks parallel the edits performed during the pretabulation stage with the difference that they are applied at the total level rather than the individual establishment level.

2. DATA RELATIONSHIP CHECKS

Prior to the tabulation stage, a number of manual and machine tests are applied to check the validity of the establishment data. The tests are designed to validate such relationships as wages per production worker, man-days per production worker, and value added as a percentage of shipments (see exhibit VI-3-1).

For the final review, these relationships are examined in terms of industry, geographic area, size class, or cross-classifications of these categories. Appropriate adjustments are made in the tolerance limits used for pretabulation edit; this usually consists of a narrowing of the parameters due to the statistical impact of a sizeable number of observations in most of the data cells. At the statistical aggregate level (the smallest area being reviewed), the tolerance limits are set wider when the number of establishments in a data cell is small and narrower when it is large.

It is critical to set the tolerances so as to maximize printing out for review those cells that should be corrected or verified and to minimize the examination of cells that are not truly suspect. During the early processing stages, the analysts should examine the results carefully to determine whether the tolerance limits are satisfactory.
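The aggregate-level relationship check described above can be sketched in Python as follows. This is a minimal illustration and not the procedure used in Providencia: the cell records, the reference ratio, the base tolerance of 40 percent, and the rule that the band narrows as 1/sqrt(n) are all assumptions made for the example.

```python
import math

def tolerance_band(reference_ratio, n_establishments, base_pct=0.40):
    """Return (low, high) limits around a reference ratio.

    Assumption: the half-width shrinks as 1/sqrt(n), so cells built from
    many establishments get narrower limits than cells built from few.
    """
    half_width = reference_ratio * base_pct / math.sqrt(n_establishments)
    return reference_ratio - half_width, reference_ratio + half_width

def flag_cells(cells, reference_ratio):
    """Return names of cells whose wages-per-worker ratio falls outside its band."""
    flagged = []
    for cell in cells:
        ratio = cell["wages"] / cell["production_workers"]
        low, high = tolerance_band(reference_ratio, cell["establishments"])
        if not low <= ratio <= high:
            flagged.append(cell["name"])
    return flagged

# Hypothetical industry-by-area cells.
cells = [
    {"name": "food/north", "wages": 520_000, "production_workers": 400, "establishments": 4},
    {"name": "food/south", "wages": 5_200_000, "production_workers": 4_000, "establishments": 100},
    {"name": "food/east", "wages": 900_000, "production_workers": 400, "establishments": 4},
]
print(flag_cells(cells, reference_ratio=1_200.0))
```

In this sketch the north and south cells have the same ratio, but only the south cell is flagged, because its band is narrower with 100 establishments than the north cell's band with 4.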

PROVIDENCIA: A Case Study in Economic Censuses

It is considered that a good balance is achieved in setting tolerance limits if about 50 percent of the rejected cells are revised, as indicated in illustration A.

Illustration A. Acceptable tolerance limits (rejected cells: 50 percent revised)

If only 10 percent of the rejected cells are revised, this indicates that the tolerances resulted in too many cases being rejected and reviewed (see illustration B).

Illustration B. Unacceptable tolerance limits--too many cells rejected (rejected cells: 10 percent revised)

On the other hand, if 90 percent of the rejected cells are revised, then probably not enough cases were rejected and reviewed (see illustration C).

Illustration C. Unacceptable tolerance limits--too few cells rejected (rejected cells: 90 percent revised)
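The rule of thumb in illustrations A to C can be expressed as a small diagnostic on the share of rejected cells that were revised. The sketch below is in Python. The text gives only 50 percent as the target and 10 and 90 percent as examples of poor balance, so the 25 and 75 percent cutoffs are assumptions, as is the suggested remedy of widening or narrowing the limits.

```python
def assess_tolerances(rejected, revised, low=0.25, high=0.75):
    """Classify tolerance limits from the share of rejected cells that were revised.

    The low and high cutoffs are illustrative assumptions, not values
    taken from the census procedures.
    """
    if rejected == 0:
        return "no cells rejected"
    share = revised / rejected
    if share < low:
        return "too many cells rejected: widen the limits"
    if share > high:
        return "too few cells rejected: narrow the limits"
    return "acceptable balance"

# The three situations shown in illustrations A, B, and C.
for revised in (50, 10, 90):
    print(revised, assess_tolerances(rejected=100, revised=revised))
```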

If an individual relationship is significantly different from the average and cannot be explained in terms of the review or by the professional's knowledge of the situation, it should be looked into. For example, if payroll per employee in the food industries in one of the provinces were twice as high as in most other provinces, this would justify an examination of the establishment data to find an explanation or to locate one or a limited number of reports in the tabulations still seriously in error. To implement that review, two edit runs are made. One run flags, by industry, the outlier establishments (that is, those establishments at the extremes of the distributions of specified characteristics) that are above a certain employment size (for example, 20 employees). The other prints out, for each of the suspect geographic cells (province or county), those outlier establishments from the industry run which are located in the particular province or county. The establishment size cutoff directs the analyst's attention to those cases that significantly affect the tabulated aggregate.
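The two edit runs can be sketched in Python as below. The record layout, the industry code, the payroll-per-employee limits, and the function names are hypothetical. Only the 20-employee cutoff comes from the text, where it is given as an example.

```python
from collections import defaultdict

def industry_outliers(establishments, limits, min_employees=20):
    """Run 1: flag establishments outside the ratio limits for their industry."""
    flagged = []
    for est in establishments:
        if est["employees"] < min_employees:
            continue  # too small to move the tabulated aggregate
        ratio = est["payroll"] / est["employees"]
        low, high = limits[est["industry"]]
        if ratio < low or ratio > high:
            flagged.append(est)
    return flagged

def outliers_by_area(flagged, suspect_areas):
    """Run 2: list the run-1 outliers located in each suspect geographic cell."""
    listing = defaultdict(list)
    for est in flagged:
        if est["province"] in suspect_areas:
            listing[est["province"]].append(est["id"])
    return dict(listing)

# Hypothetical establishment records.
establishments = [
    {"id": 101, "industry": "3111", "province": "A", "employees": 250, "payroll": 1_250_000},
    {"id": 102, "industry": "3111", "province": "A", "employees": 12, "payroll": 120_000},
    {"id": 103, "industry": "3111", "province": "B", "employees": 80, "payroll": 96_000},
    {"id": 104, "industry": "3111", "province": "B", "employees": 60, "payroll": 600_000},
]
limits = {"3111": (800.0, 3_000.0)}  # hypothetical payroll-per-employee limits

flagged = industry_outliers(establishments, limits)
print(outliers_by_area(flagged, suspect_areas={"A"}))
```

Establishment 102 has an extreme ratio but is skipped by the size cutoff, which is the behavior the text asks for.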

3. ZERO BALANCE CHECKS

In this phase of the final review, the set of tabulations is subjected to the following test: "Are amounts which should be equal to each other actually equal?" These are equality checks. They are most frequently employed in comparing stub and column detail with totals in statistical tables. Variations of the test range from two to a dozen detailed items that should add to a total, either horizontally or vertically in the table. Sometimes the check consists of assuring that four quarterly figures add to an annual total. The nature of the balance test is determined by the design of the particular table.
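A zero balance check amounts to subtracting the sum of the detail items from the stated total and confirming that the result is zero. The following Python sketch assumes a hypothetical table in which four quarterly figures should add to an annual total. The table contents and function names are illustrative.

```python
def zero_balance(details, total):
    """Return total minus the sum of its detail items; zero means the check passes."""
    return total - sum(details)

def failing_rows(table):
    """Return labels of rows whose detail items do not add to the row total."""
    return [label for label, details, total in table
            if zero_balance(details, total) != 0]

# Hypothetical table: four quarterly figures and the tabulated annual total.
table = [
    ("Industry 3111", [120, 135, 128, 140], 523),
    ("Industry 3112", [40, 42, 45, 41], 186),  # details add to 168, not 186
    ("Industry 3113", [75, 70, 72, 78], 295),
]
print(failing_rows(table))
```

The example uses whole numbers. If the tabulations carry rounded decimal figures, a small allowance for rounding would be needed in place of the exact comparison with zero.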

4. COMPARISONS WITH RELATED DATA

Several data series were mentioned in an earlier chapter (section 5 of chapter VI-3) as prime candidates for this segment of the final review. ... can now be added (for some countries) administrative records such as the number of persons or payrolls covered under social security systems. Such totals from an independent source are very valuable in assessing the completeness of the census tabulations. Another example of related data is information on exports, which can be used to test the reliability of production data on minerals and manufactured products when the bulk of such products is exported.

Certain items of information of the type collected in the census also exist from other sources, as for example, the monthly and annual statistics on sugar refining and coffee roasting in Providencia; these should be compared with the corresponding census figures.

point in the final review, supplementing and weighing the evidence afforded by the sophisticated but still mechanistic tests described above. The trained analyst will treat a battery of statistical tests as a series of simultaneous equations, the solution to which he is uniquely able to achieve. This is not to suggest that judgment can be used as an automatic override over statistical edits or be employed in other than a balanced, restrained manner as one element, although a critical one, in the whole review effort.

In addition to matching against current data sources, the final review should compare the tabulated totals with corresponding census totals for previous periods, if such exist. Such comparisons are made by industry, geography, and size groupings for principal items of data such as total employment and value added. As in the case of internal ratio tests, extraordinary changes from previous periods should be checked. This set of procedures applies especially to countries whose censuses occur fairly close together; for example, no more than 5 years apart. For Providencia, the gap of 12 years between industrial censuses limits the usefulness of this historical comparison check. The overwhelmingly difficult task of matching individual establishments over a great span of years precludes a census-to-census comparison at the micro level.


6.1 Complete establishment records

It is crucial to have in hand the appropriate analytical tools to review problem areas at the post-tabulation stage. In addition to the outlier listings (rare or extreme values) at the aggregate level, it is necessary to have machine listings of individual establishments to check where the statistical aggregates are questionable. The listing should contain a master record of the establishment as finally entered into tabulation. There are two arrangements of the establishment record. The one most frequently used in the final review is organized by industry, by descending size class within industry and by identification number (ID) within size class. For the 1975 Industrial Census of Providencia, the industry is the four-digit ISIC and the size class is in terms of total persons engaged. Such a listing helps greatly, for example, if a question arises as to whether value of shipments per person engaged is too high for the machine tool industry. In that case the reviewer could scan the value of shipments column in the industry ID listing to ascertain whether some establishments appear to have shipment values grossly out of line with their size as indicated by total persons engaged.

The other listing is one in which the establishments are listed within geographic units (province and county). In this arrangement the four-digit ISIC is the second basis for classification, employment size class the third, and ID number the lowest level.
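The two arrangements differ only in their sort keys, which the following Python sketch makes explicit. The field names and sample records are hypothetical, and the size class is assumed to be a numeric code that increases with total persons engaged. The text specifies descending size order only for the industry listing, so using it for the geographic listing as well is an assumption.

```python
def industry_listing(records):
    """Sort by industry, then descending size class, then ID."""
    return sorted(records, key=lambda r: (r["isic"], -r["size_class"], r["id"]))

def geographic_listing(records):
    """Sort by province and county, then industry, size class, and ID."""
    return sorted(
        records,
        key=lambda r: (r["province"], r["county"], r["isic"], -r["size_class"], r["id"]),
    )

# Hypothetical master records as finally entered into tabulation.
records = [
    {"id": 7, "isic": "3823", "size_class": 2, "province": "B", "county": "02"},
    {"id": 3, "isic": "3823", "size_class": 5, "province": "B", "county": "01"},
    {"id": 9, "isic": "3111", "size_class": 1, "province": "A", "county": "01"},
    {"id": 4, "isic": "3823", "size_class": 5, "province": "A", "county": "01"},
]
print([r["id"] for r in industry_listing(records)])
print([r["id"] for r in geographic_listing(records)])
```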

6.2 Log of exceptional cases resolved earlier

The final review will be helped greatly if exceptional situations relating to large establishments that are encountered and verified (or corrected) in various processing steps are logged (written down) and a record of them given to the professional analyst who reviews the data. These conditions might involve a birth, death, change in industry code, or an exceptionally high or low wage rate, materials cost, or output per employee. Access to such a record will complement the other analytical tools available to the analyst and significantly facilitate his final review effort. It would be wasteful indeed to retrace steps that had laboriously proved the accuracy of a reported condition which at first appeared to be unusual or even impossible.

6.3 Check totals

In addition to the above, records of the number of questionnaires should be rigorously maintained at each stage of the processing stream. Once the total number of questionnaires mailed to the skip-list establishments and those obtained through interviews at nonskip-list establishments has been established, that count should be satisfied at each juncture of the processing stream--questionnaires received, coded, edited, and tabulated. The number received should be divided between those with data and those that are blanks; for those on the mail list, there will be a separate figure for postmaster returns (i.e., no adequate address, no longer at that location, etc.).

If there is mechanical editing of the returns, counts should be kept of the number of machine (computer) rejects, their resolution, and re-entry into the processing. Once the number of processable reports has been determined (i.e., reports with sufficient data that can be coded industrially and geographically, edited, and tabulated), that total should be utilized through the final stage of tabulation. The published tables should show the same number, except for counts of cases explicitly removed from the processing for designated reasons, e.g., duplicates, out-of-scope, etc.
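The count control described above can be sketched as a running reconciliation: the count expected at each stage is the control total less the cases explicitly removed up to that point. The Python example below uses hypothetical stage names, counts, and removal reasons.

```python
def reconcile_counts(control_total, stages, removed):
    """Check that each processing stage accounts for the control total.

    stages:  ordered mapping of stage name -> questionnaires counted there
    removed: mapping of stage name -> {reason: count} explicitly removed there
    Returns a list of (stage, expected, actual) for stages that do not balance.
    """
    problems = []
    expected = control_total
    for stage, actual in stages.items():
        expected -= sum(removed.get(stage, {}).values())
        if actual != expected:
            problems.append((stage, expected, actual))
    return problems

# Hypothetical counts; three reports are unaccounted for at tabulation.
stages = {"received": 1_000, "coded": 988, "edited": 988, "tabulated": 985}
removed = {"coded": {"duplicate": 7, "out-of-scope": 5}}
print(reconcile_counts(1_000, stages, removed))
```

Any entry in the returned list points to a stage where reports were lost or added without a designated reason being recorded.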

The success of an industrial census (or more frequent inquiry, for that matter) depends very largely on the care and skill with which reports from the large establishments are handled. A highly skewed distribution of value of output, value added, or employment can result from the fact that large establishments affect the survey totals in a manner disproportionate to their number. The definition of large will vary by country but, for purposes of special handling and separate control on the establishment counts described above, a "rule of thumb" of 50 employees for manufacturing and 20 for minerals is a useful starting point.

Another type of check that is effective in final review is the validation of the totals for key items in the preliminary tabulation. These preliminary figures should be used as an initial point of departure for reviewing the final data. Except for late changes in the preliminary totals which were not carried back to the machine records (and for which adjustments must be made), the figures shown in the final tabulations (before review) should be a mirror image of the published preliminary data. With allowances for corrections arising from final review, a complete reconciliation should be possible between the preliminary and final tabulations of the census.
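The reconciliation between preliminary and final tabulations can be sketched as follows: each final total should equal the preliminary total plus the corrections logged during final review. The item names, figures, and the form of the correction log in this Python example are hypothetical.

```python
def reconcile(preliminary, final, corrections):
    """Return {item: unexplained difference} between final and preliminary totals.

    corrections: mapping of item -> list of signed adjustments made in final review
    """
    unexplained = {}
    for item, prelim_value in preliminary.items():
        expected = prelim_value + sum(corrections.get(item, []))
        difference = final[item] - expected
        if difference != 0:
            unexplained[item] = difference
    return unexplained

# Hypothetical key-item totals and logged corrections.
preliminary = {"employment": 52_300, "value_added": 918_000}
final = {"employment": 52_140, "value_added": 921_500}
corrections = {"employment": [-200, 40], "value_added": [2_000]}
print(reconcile(preliminary, final, corrections))
```

An empty result means the reconciliation is complete. A nonzero entry indicates a change that was made without being logged, or a late change to the preliminary totals that was not carried back to the machine records.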

6.4 Documentation

The methodology and procedures employed in final review of the data should be documented and made a part of the procedural history of Providencia's Industrial Census. That history should cover every phase of the census undertaking from the initial planning and budgeting to final publication of the results. An important feature of such a history is an objective evaluation of the merits and limitations of the procedures followed in the entire census operation.
