
households among those selected to the number of completed households. Since the probabilities of selection are constant within an SSU for 1997, these adjustments were applied at the SSU (ultimate cluster) level.

The NIAF is computed at the SSU level (1,460 cells) and is equal to:

$$\text{NIAF} = \frac{\text{Total Completed Plus Uncompleted Responses in the SSU}}{\text{Completed Responses in the SSU}}$$

If the ratio exceeds 2.0, then the NIAF is set equal to 2.0 and the NIAFs for SSUs in the same PSU and with the same metropolitan status are increased.
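As a rough illustration of the ratio and its cap, the sketch below computes the NIAF for two hypothetical SSUs. The function name, the SSU labels, and the counts are assumptions made for this example. The report does not give the rule for increasing the NIAFs of neighboring SSUs when the cap applies, so that step is deliberately left out.

```python
def niaf(completed: int, uncompleted: int, cap: float = 2.0) -> float:
    """Noninterview adjustment factor for one SSU, capped at `cap`."""
    if completed <= 0:
        raise ValueError("an SSU needs at least one completed response")
    raw = (completed + uncompleted) / completed
    return min(raw, cap)


# Hypothetical SSUs: (completed, uncompleted) response counts.
ssus = {"ssu_a": (8, 2), "ssu_b": (3, 5)}
factors = {name: niaf(c, u) for name, (c, u) in ssus.items()}
# ssu_a: 10 / 8 = 1.25; ssu_b: 8 / 3 exceeds 2.0 and is capped at 2.0.
# The redistribution to other SSUs in the same PSU is not modeled here.
```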

The First-Stage Ratio Adjustment Factors

The primary purpose of the first-stage adjustment factor is to reduce the sampling variation in the estimates of the number of housing units by main space-heating fuel resulting from sampling of PSUs during the first stage of the sample design. The correlation between main space-heating fuel and other important energy-related characteristics implies that this adjustment will also reduce the sampling variation for many important variables collected for the RECS.

In some cases, a single PSU comprising all or part of a large metropolitan area was large enough in population to be a stratum by itself. PSUs of this type are called Self-Representing (SR) PSUs because the sample from each SR PSU represents only that PSU. The first-stage ratio adjustment factor was 1.0 for all observations in SR PSUs.

In other strata, one PSU was selected from among two or more PSUs in the stratum. Each of the PSUs selected from these strata is called a Non-Self-Representing (NSR) PSU because each such PSU represents not only itself; it also represents the unselected PSUs in the stratum.

The 1990 Census data were used to determine the difference between the distribution of the main space-heating fuel in the set of selected NSR PSUs and the distribution in the set of all PSUs (selected and unselected) in the strata from which the NSR PSUs were selected. A fuel is under-represented if the percentage of households using the fuel is lower in the selected NSR PSUs than in the set of all PSUs in the NSR strata. A fuel is over-represented if the opposite occurs. The weights for the responding households in NSR PSUs are adjusted upward when their main space-heating fuel is under-represented and downward when it is over-represented.
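The report does not state the exact formula for the first-stage factor. One plausible form, shown here only as a sketch, is the ratio of a fuel's household share in all PSUs of the NSR strata to its share in the selected NSR PSUs. The function name, fuel labels, and counts are hypothetical, and a production version would presumably weight the selected PSUs by their selection probabilities.

```python
def first_stage_factors(all_psus: dict, selected: dict) -> dict:
    """Share of each fuel in all PSUs divided by its share in the selected PSUs."""
    total_all = sum(all_psus.values())
    total_sel = sum(selected.values())
    factors = {}
    for fuel, count_all in all_psus.items():
        share_all = count_all / total_all
        share_sel = selected[fuel] / total_sel
        factors[fuel] = share_all / share_sel
    return factors


# Hypothetical census household counts by main space-heating fuel.
census_all = {"natural_gas": 500.0, "electricity": 300.0, "fuel_oil": 200.0}
census_selected = {"natural_gas": 60.0, "electricity": 30.0, "fuel_oil": 10.0}
factors = first_stage_factors(census_all, census_selected)
# A factor above 1 marks an under-represented fuel (weights adjusted upward);
# a factor below 1 marks an over-represented fuel (weights adjusted downward).
```

In this made-up example fuel oil is under-represented in the selected PSUs, so households heating with fuel oil would have their weights increased, while natural gas households would have theirs reduced.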

The Second-Stage Ratio Adjustments

The second-stage ratio adjustments use data obtained from the Bureau of the Census as control totals to improve the accuracy of the estimates of the number of households. The RECS can be used to produce an estimate of the number of households in the country, but the Bureau of the Census produces much more accurate estimates. Improving the accuracy of the data on the number of households also improves the accuracy of almost all other estimates obtained from the RECS. The first priority is the accuracy of estimates of the number of households for the nine Census divisions and for the four largest States. The second priority is the accuracy of estimates of the number of households for three demographic cells (multi-person households, single-member female households, and single-member male households).

The ratio adjustment process was carried out in three steps. In step one, the population was divided into 15 geographical cells. (Hawaii and Alaska were treated as separate cells because their climates are different from that of the rest of the country.) Control totals giving the number of households in each cell were derived from Current Population Survey results. A ratio adjustment was created equal to the control total divided by the weighted count, using the weights after the first-stage ratio adjustment. Multiplying the weights after the first-stage ratio adjustment by this ratio yields new weights which, when summed, equal the control totals for the 15 cells. This calculation yielded a weighted total number of households equal to 101,481,000. Refer to Table B1 for estimates for each of the 15 geographical areas.

Energy Information Administration

The second step was similar to the first step. The two differences were the input weights and cells used for control totals. The input weights are those resulting from the first step. The following three categories were used to define the cells:

1. One-person households, male householder

2. One-person households, female householder

3. All other households.

The purpose of this second step was to reduce possible bias in the RECS sample due to undercoverage of one-person households, particularly those consisting of a single male.

The third step is the same as the first step except that the input weights are those resulting from the second step. This produced a set of weights whose sum reproduced the 15 geographic cell control totals and yielded estimates that are quite close to the control totals for the three demographic cells.
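The three steps can be sketched as repeated applications of one ratio adjustment, first over geographic cells, then over demographic cells, then over geographic cells again. The sketch below is a minimal illustration under assumed data: the four households, the cell labels, and the control totals are hypothetical and far smaller than the 15 geographic and three demographic cells used in the survey.

```python
from collections import defaultdict


def ratio_adjust(weights, cells, controls):
    """Scale weights so the weighted count in each cell equals its control total."""
    totals = defaultdict(float)
    for w, c in zip(weights, cells):
        totals[c] += w
    return [w * controls[c] / totals[c] for w, c in zip(weights, cells)]


# Hypothetical inputs: weights after the first-stage ratio adjustment.
weights = [100.0, 100.0, 100.0, 100.0]
geo = ["east", "east", "west", "west"]
demo = ["one_person", "other", "one_person", "other"]
geo_controls = {"east": 250.0, "west": 150.0}
demo_controls = {"one_person": 120.0, "other": 280.0}

step1 = ratio_adjust(weights, geo, geo_controls)   # geographic cells
step2 = ratio_adjust(step1, demo, demo_controls)   # demographic cells
step3 = ratio_adjust(step2, geo, geo_controls)     # geographic cells again
```

Because the last step is geographic, the final weights always reproduce the geographic control totals. The demographic totals are matched exactly after step two and, in general, only approximately after step three.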

Table B1. U.S. Population Estimates Used as Controls in Ratio Adjustment of Sampling in the 1997 RECS

[blocks in formation]

Source: Linear extrapolation from U.S. Bureau of the Census, 1997 Current Population Survey.

Item Nonresponse

Item nonresponse occurs when respondents do not know the answer or refuse to answer a question or when an interviewer does not ask a question or does not record an answer. The incidence of the latter, the interviewer not asking and/or not recording the answer, was greatly reduced by the use of Computer Assisted Personal Interviewing (CAPI). The majority of nonresponse was due to interviewers recording answers of "Don't Know" and "Refused." Some item nonresponse was due to programming problems in the questionnaire.

Adjustments for Item Nonresponse

Hot-deck imputation was the method used most frequently (Table B2). The hot-deck procedure requires sorting the file of households by variables related to the missing item. A household is then selected that has the same values for the related variables, and this "donor" household supplies the value for the variable that is missing in the "donee" household.
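A minimal sketch of the donor and donee idea follows. It groups households by the related variables instead of sorting the file, and it picks a donor at random within the group. Both choices are assumptions for illustration, as are the variable names and the sample records.

```python
import random


def hot_deck(records, item, match_vars, rng):
    """Fill missing `item` values (None) from a random donor sharing `match_vars`."""
    donors = {}
    for r in records:
        if r[item] is not None:
            key = tuple(r[v] for v in match_vars)
            donors.setdefault(key, []).append(r[item])
    imputed = []
    for r in records:
        r = dict(r)  # copy so the input records are left unchanged
        if r[item] is None:
            pool = donors.get(tuple(r[v] for v in match_vars))
            if pool:
                r[item] = rng.choice(pool)
                r["imputed"] = True  # flag the value as imputed
        imputed.append(r)
    return imputed


# Hypothetical household records; None marks item nonresponse.
households = [
    {"region": "South", "fuel": "electricity", "water_heater_age": 5},
    {"region": "South", "fuel": "electricity", "water_heater_age": None},
    {"region": "West", "fuel": "natural_gas", "water_heater_age": 12},
]
filled = hot_deck(households, "water_heater_age", ["region", "fuel"], random.Random(0))
```

Flagging imputed values, as the sketch does, lets later analyses distinguish reported from imputed data. A donee with no matching donor is left missing here.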


Less frequently used imputation methods included random selection from the known values of a variable, deductive procedures, and allocation procedures.

The random-selection procedure was used primarily to impute for continuous numerical values and missing numbers that were conditional on other numbers (e.g., number of ceiling fans used was conditional on the number of rooms in the home).
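The report does not spell out how the conditioning was done. One simple reading, sketched below, is to draw at random from the values reported by households that share the conditioning value (here, the number of rooms). The function name and the reported values are hypothetical.

```python
import random


def impute_random(known, rooms, rng):
    """Draw a ceiling-fan count at random from values reported for homes with `rooms` rooms."""
    pool = known.get(rooms)
    if not pool:
        return None  # no reported values to draw from; leave the item missing
    return rng.choice(pool)


# Hypothetical reported ceiling-fan counts, keyed by number of rooms.
known_fans_by_rooms = {4: [0, 1, 1], 6: [1, 2, 3]}
rng = random.Random(1997)
draw = impute_random(known_fans_by_rooms, 6, rng)
no_pool = impute_random(known_fans_by_rooms, 9, rng)
```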

[blocks in formation]

¹There are an additional 54 questionnaire items for which there were no missing values or for which values were determined by explicit editing rules in the initial stages of questionnaire editing.

Source: Energy Information Administration, Office of Energy Markets and End Use, Form EIA-457 A of the 1997 Residential Energy Consumption Survey (RECS). RECS Public-Use Data Files.

Deductive procedures were used primarily for missing information on fuels used for specific purposes and on methods of payment for fuels. The amount of missing data on these items was generally quite small. Other information available from the questionnaire or from related data sources (utility bills and rental agent survey) provided reasonably accurate assignments for the missing data.

Allocation procedures use explicit rules for assigning values to missing information about a householder, such as age and sex. The procedures are based on information on these variables for the household as a whole.

Table B3 lists the most frequently imputed items in the 1997 RECS. The amount of item imputations for the 181 households receiving mail questionnaires was considerable, since these questionnaires contained only a small subset of questions from the household interview. For the mail questionnaires, a modified hot-deck imputation method was used. A hot-deck matrix was created for mail questionnaires and personal-interview households by using Census region, type of housing unit structure, space-heating fuel, hot-water fuel, and presence and type of air-conditioning. Whenever possible, a donor personal-interview household was chosen for each mail questionnaire household from the same cell of the hot-deck matrix. For 90 percent of the mail questionnaires, donors matched on all hot-deck variables.

Because each cell of the matrix usually contained several possible donors, a donor was chosen from the cell on the basis of how closely it matched the mail questionnaire household on a number of additional variables. These variables were income, number of household members, number of household vehicles, age of householder, tenure, number of rooms, and household structure (married couple, other). The entire set of responses from the donor household was imputed to the mail questionnaire household. This means that all responses for mail questionnaire households are imputed except for the following: weather data, fuel-consumption data acquired from the household's energy suppliers, the geographic location of the mail questionnaire household, and those items in the hot-deck imputation process for which an exact match was obtained.
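The donor choice within a hot-deck cell can be sketched as a closest-match search. The report does not say how closeness was measured, so the sketch below simply counts how many additional variables agree, which is an assumption. The variable names and records are hypothetical, and only a subset of the matching variables is shown.

```python
def closest_donor(recipient, donors, cell_vars, extra_vars):
    """Pick the donor in the recipient's cell that agrees on the most extra variables."""
    cell = tuple(recipient[v] for v in cell_vars)
    same_cell = [d for d in donors if tuple(d[v] for v in cell_vars) == cell]
    if not same_cell:
        return None  # no donor in the same cell of the hot-deck matrix

    def agreement(d):
        # Number of additional variables on which donor and recipient agree.
        return sum(d[v] == recipient[v] for v in extra_vars)

    return max(same_cell, key=agreement)


cell_vars = ["region", "heat_fuel"]
extra_vars = ["tenure", "rooms", "members"]
mail = {"region": "Midwest", "heat_fuel": "natural_gas",
        "tenure": "own", "rooms": 6, "members": 2}
interviews = [
    {"id": 1, "region": "Midwest", "heat_fuel": "natural_gas",
     "tenure": "rent", "rooms": 4, "members": 2},
    {"id": 2, "region": "Midwest", "heat_fuel": "natural_gas",
     "tenure": "own", "rooms": 6, "members": 3},
    {"id": 3, "region": "South", "heat_fuel": "electricity",
     "tenure": "own", "rooms": 6, "members": 2},
]
donor = closest_donor(mail, interviews, cell_vars, extra_vars)
```

Household 3 is excluded because it falls in a different cell. Of the two remaining candidates, household 2 agrees with the mail household on two of the three additional variables and is selected.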

Table B3. Household Questionnaire Items Most Frequently Imputed in the 1997 RECS

[blocks in formation]

¹Mailed interviews are not included in the percentage. To account for these, add three percentage points to the percentages given.

Source: Energy Information Administration, Office of Energy Markets and End Use, Form EIA-457 A of the 1997 Residential Energy Consumption Survey (RECS). RECS Public-Use Data Files.

Nonsampling Error

Nonsampling errors can occur for the following reasons:

Differences between the target population (residential sector) and the population from which the sample is selected (occupied primary residential housing units)

Interviewer errors, respondent misunderstandings, questionnaire-design errors, and data-processing errors


Nonresponse on certain questions from the questionnaire for some respondents (item nonresponse).

"Quality of Specific Data Items" discusses the derivation of some statistical data and reviews some of the nonsampling errors that occur for the second, third, and fourth reasons in the list above. These errors would be expected to occur even if the survey attempted to contact the occupants of every occupied housing unit in the country. (For example, the results of the Decennial Census conducted by the Bureau of the Census are subject to these nonsampling errors.)

Quality of Specific Data Items

The use of the CAPI system dramatically reduced the incidence of item nonresponse, particularly nonresponse due to interviewer error. In 1993, approximately 300 variables were imputed. Of these, approximately 50 variables were missing data for 10 or fewer cases; approximately 40 variables were missing data for more than 100 cases. The vast majority of missing data was due to "No Answer" (meaning either the interviewer did not record a response for that question or the response was determined to be inconsistent in view of other questionnaire responses) rather than "Don't Know" or "Refused." In 1997, approximately 145 variables were imputed. Of these, approximately 80 variables contained missing data for 10 or fewer cases, while only about six variables contained missing data for 100 or more cases. Most of the missing data was due to recorded responses of "Don't Know" or "Refused."

Housing Unit Type

There is a fine line between the definitions of various types of housing units. The distinction between a single-family attached unit and a unit in an apartment building is particularly complex. The collection and editing of the data on housing type changed from the paper-and-pencil questionnaire for the 1993 RECS to the CAPI questionnaire for the 1997 RECS. The change in the data collection and editing procedures may have contributed to changes in the survey results. For example, the estimated number of occupied single-family attached units increased from 7.3 million for the 1993 RECS to 10.0 million for the 1997 RECS. Conversely, the number of occupied housing units in buildings with two to four units decreased from 8.0 million for the 1993 RECS to 5.6 million for the 1997 RECS.

Programmable (Set-Back or Clock) Thermostats

The 1993 and 1997 RECS both contained questions on the presence of a programmable thermostat. In both surveys, the thermostats were referred to as "set-back or clock thermostats," but not programmable thermostats. For the 1993 RECS, the question was placed in the section on conservation measures and usage (following questions on insulation, weather stripping, and caulking). For the 1997 RECS, it was placed in the space-heating section, immediately following the question on the presence of a thermostat. The 1997 RECS also included a question that asked respondents if they programmed the thermostat or used the manual features. Based on the 1993 RECS, an estimated 10.8 million households had programmable thermostats in 1993. Based on the 1997 RECS, an estimated 44.9 million households had programmable thermostats in 1997. Of these 44.9 million, an estimated 11.7 million programmed their thermostats and an estimated 33.2 million used the manual features.

The large increase in the number of housing units with programmable thermostats from 1993 to 1997 is questionable. The change in the placement of the question may have contributed to the large change in the survey results. In addition, the question concerning programmed versus manual use of the thermostats may have changed how the interviewers coded the question on the presence of a programmable thermostat.

