Table of Contents

B        Appendix B:  Methodology

Study Design and Methodology

Sample Design

Data Collection Method

Data Processing

Sample Demographic Profile

Data Weighting and Expansion


 

Appendix B:  Methodology


Study Design and Methodology

The U.S. Postal Service Household Diary Study (HDS), conducted by NuStats on behalf of the Volume and Revenue Forecasting division of the Postal Service’s Finance Department, is a continuously fielded study that measures household mail volumes, mail usage, and attitudes about the mail and advertising.

The HDS uses a two-stage survey design: Stage 1 is an interviewer-mediated household recruitment interview. Stage 2 is a self-completion mail diary [Appendix C contains the survey instruments]. The HDS uses a multi-mode approach to minimize response bias, to improve data accuracy through efficient data checking and household re-contacts, and to provide immediate telephone assistance to participants during their diary week.

Household Recruitment Interview

The household recruitment interview collects information on household and individual demographics, recall of mail sent and received, adoption and use of communications technologies, bill payment behavior, and attitudes towards advertising.

Mail Diary

The mail diary covers a seven-day period (Monday to Sunday) and collects information on the number of mail pieces received and sent, industry source, mail characteristics, and attitudes regarding mail received.

Sample Design

This section describes the household selection process for participation in the HDS. A sample is a representative subset of the survey population used to gain information about the entire population. The population of inference for the HDS is all U.S. households. The probability design ensures each household has an equal chance of selection.

The sample design allows projections of results to all U.S. households. The Postal Service provided an address sample that NuStats matched against known telephone listings. Generally, the study used telephone sampling for household selection and screening, followed by diaries mailed to eligible households and completed by each household. Households without telephones were contacted via U.S. Mail. The design is a systematic sample stratified by stratum (urban/rural location) and Census region, ensuring even coverage across the United States.

A master national sample was specified and drawn by in-house sampling statisticians. The Postal Service drew the household probability sample from the national address database following NuStats specifications. The master list, sorted by ZIP code, was used to draw a systematic stratified sample, which was then tagged with variables indicating each housing unit’s geographic location in terms of Census region and stratum.
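The draw described above can be illustrated with a minimal sketch. It assumes a fixed skip interval with a random start inside the first interval, applied separately within each region/stratum cell; the function names, the `cell_of` and `quotas` parameters, and the record layout are hypothetical, and the report does not document the exact procedure the Postal Service used.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select n records from an ordered frame at a fixed skip interval."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and len(frame)")
    rng = random.Random(seed)
    interval = len(frame) / n          # fractional skip interval
    start = rng.random() * interval    # random start inside the first interval
    last = len(frame) - 1
    return [frame[min(int(start + i * interval), last)] for i in range(n)]


def stratified_systematic_sample(frame, cell_of, quotas, seed=None):
    """Run one systematic draw per cell (e.g. Census region x stratum).

    cell_of: function mapping a record to its cell label (assumed helper).
    quotas:  dict of cell label -> number of records to draw.
    """
    by_cell = {}
    for record in frame:               # keeps the frame's sort order (e.g. ZIP)
        by_cell.setdefault(cell_of(record), []).append(record)
    return {cell: systematic_sample(by_cell[cell], n, seed)
            for cell, n in quotas.items()}
```

Because the frame is sorted by ZIP code before the draw, a fixed interval spreads the selections geographically within each cell.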

Sample was drawn for each of the four quarters based on known proportions of households within a Census region and urban or rural location. Census regions are defined by state. Urban and rural location is defined by county and metropolitan status as defined by the U.S. Census Bureau. The strata are defined by county as follows:

·         Stratum 1: Counties that are part of the 30 largest metropolitan areas in the United States, as defined by population, according to 100 percent counts of the Census 2010.   

·         Stratum 2: Counties that are part of metropolitan areas but are not in Stratum 1.

·         Stratum 3: Counties that are not part of a metropolitan area.

Quarterly sample frames were then derived based on the amount of sample needed for each quarter, and sample was allocated to region and strata cells based on known proportions as indicated by Census 2010 counts of households.
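As an illustration of allocating sample to cells "based on known proportions," the sketch below splits a quarterly target of 1,300 households across the three strata in proportion to household counts. The stratum totals are sums of the population column of Table B.11; the `allocate` function and its largest-remainder rounding are assumptions for illustration, and the actual allocation was made across region-by-stratum cells and may have used a different rounding rule.

```python
def allocate(total_sample, population_counts):
    """Split total_sample across cells in proportion to population counts.

    Largest-remainder rounding keeps the cell sizes summing to total_sample.
    """
    pop_total = sum(population_counts.values())
    exact = {c: total_sample * p / pop_total for c, p in population_counts.items()}
    sizes = {c: int(x) for c, x in exact.items()}      # round down first
    shortfall = total_sample - sum(sizes.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - sizes[c], reverse=True)
    for c in by_remainder[:shortfall]:                  # hand out leftover units
        sizes[c] += 1
    return sizes


# Household counts by stratum, summed from the population column of Table B.11.
stratum_households = {
    "30 Largest Metro Areas": 50_732_809,
    "Other Metro Areas": 46_222_828,
    "Non-Metro Areas": 12_034_499,
}

quarterly_allocation = allocate(1300, stratum_households)
```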

The sample was continuously “fielded” throughout all 52 weeks of the year. Sample was released in a manner designed to recruit equal sample sizes for each diary week, resulting in a sample file of at least 5,200 households. Table B.1 below shows the distribution of recruited and completed households.


Table B.1:
Sample by Postal Quarter

Quarter     Required Sample   Recruited Households   Completed Households
Quarter 1             1,300                  2,015                  1,290
Quarter 2             1,300                  2,036                  1,264
Quarter 3             1,300                  1,946                  1,161
Quarter 4             1,300                  2,258                  1,416
Total                 5,200                  8,255                  5,131
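The counts in Table B.1 can be turned into completion rates (completed households divided by recruited households) with a short calculation. This is a sketch for checking the arithmetic; the variable and function names are illustrative only.

```python
# (recruited households, completed households) by postal quarter, from Table B.1
quarters = {
    "Quarter 1": (2015, 1290),
    "Quarter 2": (2036, 1264),
    "Quarter 3": (1946, 1161),
    "Quarter 4": (2258, 1416),
}


def completion_rate(recruited, completed):
    """Share of recruited households that returned a usable diary."""
    return completed / recruited


rates = {q: completion_rate(r, c) for q, (r, c) in quarters.items()}
total_recruited = sum(r for r, _ in quarters.values())
total_completed = sum(c for _, c in quarters.values())
overall_rate = completion_rate(total_recruited, total_completed)

for quarter, rate in rates.items():
    print(f"{quarter}: {rate:.1%}")
print(f"Overall: {overall_rate:.1%}")
```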

 

Data Collection Method

The study uses a two-stage design in which households are recruited to participate in the diary study in a household interview (Stage 1) and recruited households complete a seven-day diary of mail received and sent (Stage 2).

Stage 1:  Household Recruitment Interview

The main function of the household recruitment interview is to recruit households to participate in the diary study. In addition, the interview collects information on household and person demographics, recall of mail sent and received, adoption and use of communication technologies, bill payment behavior, and attitudes towards advertising.

Households completed the recruitment interview via computer-assisted telephone interviewing (CATI) technology. The FY 2012 household interview consisted of 8,255 completed interviews with an adult member (age 18 or older) in the household. These respondents represented a cross-section of U.S. households by geography. The household interview contained 130 data items and took an average of 25 minutes to administer. The flow of the interview included the following elements:

·         Introduction. Each interview began with an introduction and purpose of the interview. The interviewer also verified the respondent’s address.

·         Technology adoption and use. Questions were asked about ownership and use of personal computers, Internet, and other electronic communication.

·         Mail volume recall. The respondent was asked to summarize how many personal letters, greeting cards, electronic greeting cards, and packages all members of the household have sent in a particular time period.

·         Use of postal services. The use of post offices, post office boxes, and private mailing services was explored.

·         Bill payments. Bill payment volumes, methods, and timing were explored in depth.

·         Periodicals. A summary of magazine and newspaper volumes received by the household was collected.

·         Advertising. Descriptions of advertising received by the household as well as attitudes about the advertising, and orders placed because of it, were elicited.

·         Online shopping. Respondents were asked about their online shopping habits, including questions about shipping methods.

·         Financial accounts and credit cards. Respondents were asked to summarize the total accounts and credit cards held by the household.

·         Household and person demographics. Demographic items included gender, age, marital status, employment status, educational attainment, race/ethnicity, household income, household wage earners, home ownership, residence tenure, and dwelling type.

The completion rate for the FY 2012 study (defined as the proportion of respondents who completed the diary portion relative to all recruited respondents) was 62.2 percent. This represents a decrease of 2.6 percent from 2011. Most recruitment refusals took place prior to hearing who NuStats was and why the firm was calling. Refusal households that were later re-contacted cited time constraints and privacy concerns as reasons for not participating.

Stage 2:  Mail Diary Package

Recruited households were sent mail diaries, instructions, and a toll-free “help” telephone number. The night before the beginning of an assigned diary week, NuStats made reminder calls to households to confirm receipt of the diary package and to answer any questions. If the diary package was not received by this time, NuStats re-confirmed the address, assigned a new diary week, and re-sent the diary package.

The diary package contained a Certificate of Appreciation, an Instruction Booklet, and a photo-based “Quick Start” sheet. The Instruction Booklet provided information about the study, answers to frequently asked questions, instructions for filling out the diary, guidelines for sorting mail, and examples of mail markings.


The diary instrument was composed of two parts:

·         The Question sheets. The Question sheets are color-coded by mail classification (First-Class Mail received, First-Class Mail sent, Standard, Nonprofit, etc.). Information collected about each mail classification included: type of mail piece (e.g., envelope, postcard, catalog), receiver ZIP code, sender ZIP code, mail classification, mail type, sender type, information about advertising enclosed, and receiver reaction or responses to the mail piece.

·         Seven answer booklets, each specific to a day of the week. Each booklet was arranged by mail classification and color-coded to correspond to the question sheets.

Households were instructed to enclose pertinent information from each mail piece received so that NuStats editors could verify or clarify the quantity and classes of mail recorded in the diaries. NuStats used a three-stage editing process to check the accuracy of the diary information recorded by each household. First, returned diary packages were screened to retain those that represented a reasonable attempt to complete the diary. Second, the diary information recorded for each day was checked for sufficient and logical answers and verified against the mail markings returned in the package. The diaries were then scanned using Optical Character Recognition (OCR) software. Third, a verifier re-checked the diary information captured by the OCR software for each day. This final edit functions as a quality control check to ensure data accuracy.

During the editing process, a small number of correction callbacks were made to households to clarify information or to fill in missing information. Overall, about three percent of returned diaries did not pass the edit checking process.

Of the 8,255 households recruited to receive a diary package, 5,131 actually returned acceptable completed diaries (defined as containing data suitable for analysis) to NuStats, for a completion rate of 62.2 percent.

Data Processing

Data Management

Data management entails processing the information resulting from the Household Interview and Mail Diaries, making it available for analysis, storing it, and documenting it. Household interviews were conducted using CATI technology, where the questionnaire and relevant data checks were programmed into a master questionnaire that was used by all interviewers to administer the survey. Recorded data was extracted from the CATI software into a database management file.

Returned diary information was recorded (entered) through optical scanning technology. The diary data, once scanned using Teleform software, was captured in a database management file.

After completion of the data collection, editing, and entry tasks, the survey data were contained in nine data files. One data file contained the Household Interview data. The Mail Diary data were in eight files, one for each mail classification (First-Class Mail received, First-Class Mail sent, etc.). These files were all developed in SAS-PC.

The file variables were identified by variable name. For each file variable, the File Information contains:

·         Label, which is a brief description of the variable;

·         Measurement level, which specifies the level of measurement as scale (numeric data on an interval or ratio scale), ordinal, or nominal. Nominal and ordinal data can be either string (alphanumeric) or numeric;

·         Value formats, which identify the response codes; and

·         Column width and alignment.

Several SAS programming operations were necessary to put the Mail Diary data in the desired form for analysis. The structure for these programs was contained in a separate File Information document that accompanied the data delivery.

Various edit routines were used to check the consistency of the reported data and to identify reporting or entry errors. Routine edit checks were conducted to examine questionnaire responses for reasonableness and consistency across items. Routine checks included such items as:

·         Response code range checks;

·         Checks for proper data skips and patterns of answering questions consistent with prior answers;

·         Checks for realistic responses (e.g., number of online purchases possible in one month); and

·         Checks for high frequency of item non-response (missing data from question refusals).

When conducting these checks, data were compared against the actual survey forms. NuStats identified extreme values that were impossible or unlikely, and corrected inconsistent data when possible. For example, extremely high numbers of computers owned by a household were examined to determine whether or not they were legitimate.

Some extreme/inconsistent data values unable to be corrected or verified were edited to missing values.

In addition, NuStats performed in-depth customized data checks to ensure data within each record of the Household Interview were logically consistent. For example, a respondent should have reported paying bills online only if he/she also reported having Internet access. Customized checks were also used to ensure consistency between the Household Interview and Mail Diary data. For example, an addressee was identified as a child (under 18) in the diary only if the household also reported having a child in the Household Interview.
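The routine and customized checks described above can be sketched as follows. The field names and the range limits are hypothetical (the actual checks were programmed in CATI and SAS and are not reproduced in this appendix); the two cross-item rules are the examples given in the text.

```python
def out_of_range(record, limits):
    """Return fields whose values fall outside their allowed (low, high) range."""
    return [field for field, (low, high) in limits.items()
            if record.get(field) is not None and not low <= record[field] <= high]


def inconsistencies(record):
    """Return labels for the cross-item rules that the record violates."""
    problems = []
    # Online bill payment should be reported only with Internet access.
    if record.get("pays_bills_online") and not record.get("has_internet"):
        problems.append("online bill pay without Internet access")
    # A child addressee in the diary requires a child in the household interview.
    if record.get("diary_child_addressee") and not record.get("children_in_household"):
        problems.append("child addressee but no child reported")
    return problems


def set_missing(record, fields):
    """Copy the record, editing values that could not be verified to missing."""
    return {key: (None if key in fields else value) for key, value in record.items()}
```

In this sketch, flagged values are only candidates for review; consistent with the text, a value would be set to missing only after it could not be corrected or verified against the survey forms.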

Raw variables, derived variables, and analytical programs were documented in a data documentation binder that accompanied the data delivery. Any information that could be directly or indirectly used to identify individual respondents, such as respondent names, addresses, or telephone numbers, was removed to protect respondent confidentiality and privacy. Such information is stored in a locked archival file.


Sample Demographic Profile (All Counts Unweighted), Government Fiscal Year 2012

Table B.2:
Annual Household Income by Recruitment/Retrieval Status

                          Recruited Households
Annual Household Income   Retrieved   Not Retrieved   Total   Sample Percent   Population Percent
Under $10,000                   141             257     398             3.2%                 7.6%
$10,000 - $14,999               179             203     382             4.1%                 5.9%
$15,000 - $19,999               218             194     412             4.9%                 5.7%
$20,000 - $24,999               256             162     418             5.8%                 5.9%
$25,000 - $34,999               394             220     614             8.9%                11.0%
$35,000 - $49,999               567             298     865            12.9%                13.9%
$50,000 - $64,999               616             336     952            14.0%                11.4%
$65,000 - $79,999               553             249     802            12.5%                 9.0%
$80,000 - $99,999               458             227     685            10.4%                 8.6%
$100,000 or more              1,026             482   1,508            23.3%                21.0%
Don't Know                      156             175     331              N/A                  N/A
Refused                         567             321     888              N/A                  N/A
Total                         5,131           3,124   8,255           100.0%               100.0%

Notes: 
Sample Percent based only on retrieved households that provided a response to the Household Income question.
Population percent based on U.S. Census Bureau, Current Population Survey Annual Demographic File (March 2012).
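As the notes state, the Sample Percent column uses only retrieved households that answered the income question. A minimal sketch of that calculation, using the Retrieved counts from Table B.2 (the function and variable names are illustrative):

```python
# Retrieved households by income category, from Table B.2
retrieved = {
    "Under $10,000": 141,
    "$10,000 - $14,999": 179,
    "$15,000 - $19,999": 218,
    "$20,000 - $24,999": 256,
    "$25,000 - $34,999": 394,
    "$35,000 - $49,999": 567,
    "$50,000 - $64,999": 616,
    "$65,000 - $79,999": 553,
    "$80,000 - $99,999": 458,
    "$100,000 or more": 1026,
    "Don't Know": 156,
    "Refused": 567,
}

NONRESPONSE = {"Don't Know", "Refused"}


def sample_percent(counts, exclude=NONRESPONSE):
    """Percent distribution over categories with a valid response."""
    valid = {k: v for k, v in counts.items() if k not in exclude}
    base = sum(valid.values())          # 4,408 households with a valid answer
    return {k: 100 * v / base for k, v in valid.items()}


income_percent = sample_percent(retrieved)
```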


Table B.3:
Number of Adults in Household by Recruitment/Retrieval Status

                                Recruited Households
Number of Adults in Household   Retrieved   Not Retrieved   Total   Sample Percent   Population Percent
One                                 1,239             894   2,133            24.1%                27.4%
Two                                 2,254           1,124   3,378            43.9%                33.8%
Three                                 694             451   1,145            13.5%                15.9%
Four                                  560             378     938            10.9%                13.3%
Five or More                          384             277     661             7.5%                 9.6%
Total                               5,131           3,124   8,255           100.0%               100.0%

Notes:
Sample Percent based only on retrieved households.
Population percent based on U.S. Census Bureau, Current Population Survey Annual Demographic File (March 2012).

Table B.4:
Geographic Region by Recruitment/Retrieval Status

                    Recruited Households
Geographic Region   Retrieved   Not Retrieved   Total   Sample Percent   Population Percent
Northeast                 935             596   1,531            18.2%                16.0%
Midwest                 1,197             693   1,890            23.3%                23.2%
South                   1,850           1,222   3,072            36.1%                38.1%
West                    1,149             613   1,762            22.4%                22.6%
Total                   5,131           3,124   8,255           100.0%               100.0%

Notes: 
Sample Percent based only on retrieved households.
Population percent based on U.S. Census Bureau, Census 2010, Summary File 3, Table H6 (Occupied Housing Units).

Table B.5:
Urban/Rural Location by Recruitment/Retrieval Status

                         Recruited Households
Urban/Rural Location     Retrieved   Not Retrieved   Total   Sample Percent   Population Percent
30 Largest Metro Areas       2,540           1,559   4,099            49.5%                46.6%
Other Metro Areas            1,549             911   2,460            30.2%                42.4%
Non-Metropolitan Areas       1,042             654   1,696            20.3%                11.0%
Total                        5,131           3,124   8,255           100.0%               100.0%

Notes:
Sample Percent based only on retrieved households.
Population percent based on U.S. Census Bureau, Census 2010; Strata based on Metro Area Classification by County.


Table B.6:
Age of Head of Household by Recruitment/Retrieval Status

                           Recruited Households
Age of Head of Household   Retrieved   Not Retrieved   Total   Sample Percent   Population Percent
18 - 24                           68              70     138             1.3%                 5.0%
25 - 44                          943             674   1,617            18.6%                34.0%
45 - 64                        2,214           1,232   3,446            43.6%                38.9%
65+                            1,853           1,112   2,965            36.5%                22.2%
Refused                           53              36      89              N/A                  N/A
Total                          5,131           3,124   8,255           100.0%               100.0%

Notes: 
Sample Percent based only on retrieved households that provided a valid response.
Population percent based on U.S. Census Bureau, Current Population Survey Annual Demographic File (March 2012).

Table B.7:
Educational Attainment of Head of Household by Recruitment/Retrieval Status

                                              Recruited Households
Educational Attainment of Head of Household   Retrieved   Not Retrieved   Total   Sample Percent   Population Percent
8th grade or less                                    48              84     132             0.9%                 4.3%
Some high school                                    143             227     370             2.8%                 7.5%
High school graduate                              1,102             913   2,015            21.6%                28.6%
Some college                                        979             591   1,570            19.2%                18.5%
Technical school graduate                           254             158     412             5.0%                 4.4%
College graduate                                  1,539             685   2,224            30.2%                25.4%
Postgraduate work                                 1,030             423   1,453            20.2%                11.3%
Refused                                              36              43      79              N/A                  N/A
Total                                             5,131           3,124   8,255           100.0%               100.0%

Notes:
Sample Percent based only on retrieved households that provided a valid response.
Population percent based on U.S. Census Bureau, Current Population Survey Annual Demographic File (March 2012).


Data Weighting and Expansion


This section explains the methodology used for creating sampling and expansion weights for the FY 2012 Household Diary Study.

The FY 2012 HDS uses both weighting and expansion factors to 1) adjust the sample data to match population parameters and 2) expand mail volumes exhibited in the diary sample to all U.S. households.

Weighting Procedures, FY 2012 Recruitment Data

Sampling weights were produced separately for the households that participated in the recruitment phase of the FY 2012 HDS, and those that completed and returned a diary. There were two main weighting variables: Geography and Education. FY 2012 recruitment geographic weights were derived from sample households’ strata and region:

Strata: As mentioned previously, there are three strata. A household was classified within strata as residing in the top 30 metropolitan areas nationwide, any other metropolitan area, or a non-metropolitan area.[1] Table B.8 provides unweighted sample counts from FY 2012 recruitment data for strata:

Table B.8:
HDS 2012 Recruitment Data: Urban/Rural Location

Urban/Rural Location     Households   Percent   Cumulative Percent
30 Largest Metro Areas        4,099     49.7%                49.7%
Other Metro Areas             2,460     29.8%                79.5%
Non-Metro Counties            1,696     20.5%               100.0%
Total                         8,255    100.0%


Regions: Households were classified by state. There are four mutually exclusive regions as defined by the U.S. Census Bureau (along with respective states):

Four Census Regions:

Northeast: Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, and Vermont.

Midwest: Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, and Wisconsin.

South: Alabama, Arkansas, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, and West Virginia.

West: Arizona, Alaska, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, and Wyoming.

Table B.9:
HDS 2012 Recruitment Data: Geographic Region

Geographic Region   Households   Percent   Cumulative Percent
Northeast                1,531     18.5%                18.5%
Midwest                  1,890     22.9%                41.4%
South                    3,072     37.2%                78.7%
West                     1,762     21.3%               100.0%
Total                    8,255    100.0%

Strata/Regions: Table B.10 indicates the distribution of households from the FY 2012 recruitment sample within strata and regions.

Population parameters for the intersection of the three strata and four regions were based on 2010 Census counts of households by county. As Table B.10 shows, each county was grouped according to its location within these 12 mutually exclusive and collectively exhaustive geographic categories.

To calculate the weight for each stratum/region cell, the population percentage was divided by the sample percentage. Geography weights appear in the rightmost column of Table B.11.


Table B.10:
Distribution of Households within Strata and Region

                    Stratum (Urban/Rural Location)
Geographic Region   30 Largest Metro Areas   Other Metro Areas   Non-Metro Areas   Total
Northeast                            1,096                 275               160   1,531
Midwest                                951                 508               431   1,890
South                                1,005               1,250               817   3,072
West                                 1,047                 427               288   1,762
Total                                4,099               2,460             1,696   8,255

Table B.11:
HDS 2012 Recruitment Data: Construction of Geographic Weight

Stratum                  Geographic Region   Households (Population)   Population Percent   Households (Sample)   Sample Percent   Weight
30 Largest Metro Areas   Northeast                         8,679,534                7.96%                 1,096            13.3%     0.60
                         Midwest                          11,759,871               10.79%                   951            11.5%     0.94
                         South                            16,492,511               15.13%                 1,005            12.2%     1.24
                         West                             13,800,893               12.66%                 1,047            12.7%     1.00
Other Metro Areas        Northeast                         7,316,645                6.71%                   275             3.3%     2.02
                         Midwest                           9,982,770                9.16%                   508             6.2%     1.49
                         South                            19,849,344               18.21%                 1,250            15.1%     1.20
                         West                              9,074,069                8.33%                   427             5.2%     1.61
Non-Metro Areas          Northeast                         1,485,685                1.36%                   160             1.9%     0.70
                         Midwest                           3,551,875                3.26%                   431             5.2%     0.62
                         South                             5,200,840                4.77%                   817             9.9%     0.48
                         West                              1,796,099                1.65%                   288             3.5%     0.47
Totals                                                   108,990,136               100.0%                 8,255           100.0%     1.00

Source:  Household Population Estimates based on U.S. Census Bureau, 2010 Census.
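The geography weights in Table B.11 follow from dividing each cell's population share by its sample share. The sketch below reproduces that arithmetic from the table's counts; the function and variable names are illustrative, not taken from the study's SAS programs.

```python
# (population households, sample households) per stratum/region cell, Table B.11
cells = {
    ("30 Largest Metro Areas", "Northeast"): (8_679_534, 1_096),
    ("30 Largest Metro Areas", "Midwest"): (11_759_871, 951),
    ("30 Largest Metro Areas", "South"): (16_492_511, 1_005),
    ("30 Largest Metro Areas", "West"): (13_800_893, 1_047),
    ("Other Metro Areas", "Northeast"): (7_316_645, 275),
    ("Other Metro Areas", "Midwest"): (9_982_770, 508),
    ("Other Metro Areas", "South"): (19_849_344, 1_250),
    ("Other Metro Areas", "West"): (9_074_069, 427),
    ("Non-Metro Areas", "Northeast"): (1_485_685, 160),
    ("Non-Metro Areas", "Midwest"): (3_551_875, 431),
    ("Non-Metro Areas", "South"): (5_200_840, 817),
    ("Non-Metro Areas", "West"): (1_796_099, 288),
}


def component_weights(cells):
    """Weight per cell = population share divided by sample share."""
    pop_total = sum(pop for pop, _ in cells.values())
    sample_total = sum(n for _, n in cells.values())
    return {cell: (pop / pop_total) / (n / sample_total)
            for cell, (pop, n) in cells.items()}


geo_weights = component_weights(cells)
for (stratum, region), weight in geo_weights.items():
    print(f"{stratum:<24}{region:<10}{weight:.2f}")
```

Weights above 1 mark cells that are under-represented in the sample relative to the population, and weights below 1 mark over-represented cells; the weighted sample count still sums to 8,255.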



Education: In addition to weighting for differences in geography between the sample and the population, an additional weight was created based on differences in the educational attainment of the head of household. For those households in which either more than one person was identified as the head of household or no individual was identified as the head of household, one was chosen based on the following sequence of criteria: 1) oldest male or 2) oldest female (if no male exists). For cases in which two candidates for the head of the household were of the same age, the respondent on the phone was chosen.
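The tie-breaking sequence for choosing a head of household can be written as a short rule. This is a sketch under stated assumptions: the record layout (`id`, `sex`, `age`), the function name, and the fallback to the first listed person when the respondent is not among the tied candidates are all hypothetical; the report specifies only the oldest-male, oldest-female, and respondent-on-the-phone criteria.

```python
def choose_head(candidates, respondent_id):
    """Pick one head of household from the candidate members.

    candidates: list of dicts with "id", "sex" ("M"/"F"), and "age".
    respondent_id: id of the person who answered the telephone interview.
    """
    males = [p for p in candidates if p["sex"] == "M"]
    # Oldest male first; fall back to females only if no male exists.
    pool = males or [p for p in candidates if p["sex"] == "F"]
    if not pool:
        return None
    top_age = max(p["age"] for p in pool)
    oldest = [p for p in pool if p["age"] == top_age]
    if len(oldest) > 1:
        # Same age: prefer the respondent on the phone.
        for person in oldest:
            if person["id"] == respondent_id:
                return person
    return oldest[0]  # assumed fallback: first listed candidate
```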

Known population parameters were based on weighted proportions derived from the U.S. Census Bureau’s Current Population Survey annual demographic file for March 2012. For cases in which the head of household refused to provide his/her education level, an educational level was imputed based on the average educational level of like cases. There were 23 such cases in 2012; mean levels of educational attainment were based on geography (strata and regions), as well as age and income level, if provided.

 


Table B.12:
HDS 2012 Recruitment Data: Construction of Educational Attainment Weight

Educational Attainment      Households (Population)   Population Percent   Households (Sample)   Sample Percent   Weight
8th grade or less                         5,211,870                 4.3%                   132             1.6%     2.69
Some high school                          9,064,100                 7.5%                   370             4.5%     1.67
High school graduate                     34,674,651                28.6%                 2,015            24.4%     1.17
Some college                             22,419,286                18.5%                 1,591            19.3%     0.96
Technical school graduate                 5,282,518                 4.4%                   468             5.7%     0.77
College graduate                         30,787,154                25.4%                 2,226            27.0%     0.94
Postgraduate work                        13,634,627                11.3%                 1,453            17.6%     0.64
Totals                                  121,074,207               100.0%                 8,255           100.0%     1.00

Note:  Education responses include imputed Don’t Know/Refused answers.

 


Weighting Procedures, FY 2012 Diary Data

As mentioned above, 8,255 households participated in the recruitment phase of the FY 2012 HDS, and 5,131 households completed usable diaries. Balancing weights for the FY 2012 HDS diary data were developed in the same way as for the recruitment data. An additional age weight was derived based on the age of the head of household using the following categories: 18–21, 22–24, 25–34, 35–44, 45–54, 55–64, 65–69, 70–74, and 75 years and older.

Other adjustments to weights used in the diary data included a quarterly adjustment, which accounted for variances in sampling across postal quarters.
All component weights were multiplied together and normalized to ensure that the number of weighted cases equals the number of unweighted cases.
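A minimal sketch of the combine-and-normalize step follows. The function name and the example component values are illustrative only (the diary weights themselves are not tabulated in this appendix); the point is that rescaling by n divided by the sum of raw weights makes the weighted case count equal the unweighted count.

```python
import math


def combine_and_normalize(component_weights):
    """Multiply each household's component weights and rescale to sum to n.

    component_weights: one tuple per household, e.g.
        (geography, education, age, quarter).
    """
    raw = [math.prod(weights) for weights in component_weights]
    scale = len(raw) / sum(raw)        # forces the weights to sum to n
    return [value * scale for value in raw]


# Hypothetical component weights for three households.
example = [
    (0.60, 2.69, 1.10, 1.00),
    (1.24, 0.94, 0.90, 1.05),
    (0.47, 0.64, 1.00, 0.95),
]
final_weights = combine_and_normalize(example)
```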

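Expansion Factor:

$$\text{Expansion factor} = \frac{121{,}074{,}207}{5{,}131} \approx 23{,}596.6$$

Component Weight:

$$w_s = \frac{P_s / P_t}{S_s / S_t}$$

where $P_s$ = population count in cohort, $P_t$ = total population count, $S_s$ = sample count in cohort, and $S_t$ = total sample count.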

 

A final adjustment in the form of an expansion factor was made to expand the sample to the total number of households in the United States at the time of data collection (121.07 million). The number of households in the United States was divided by the number of households that participated in the diary portion of the survey (5,131), and the resulting factor was applied to each household in the survey. The expansion factor was multiplied by the sampling weight and then by 52 (the number of calendar weeks in one year) to derive nationwide annual volume estimates from the sample data.
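The expansion arithmetic can be sketched as below. The constants come from the text; the `annual_volume` function and its input layout are hypothetical, and the sketch leaves out the adjustment factors described in the next section.

```python
US_HOUSEHOLDS = 121_074_207      # households at the time of data collection
DIARY_HOUSEHOLDS = 5_131         # households that returned usable diaries
WEEKS_PER_YEAR = 52

EXPANSION_FACTOR = US_HOUSEHOLDS / DIARY_HOUSEHOLDS   # about 23,596.6


def annual_volume(diary_records):
    """Estimate nationwide annual pieces from one-week diary counts.

    diary_records: iterable of (pieces_in_diary_week, sampling_weight) pairs,
    one per diary household.
    """
    return sum(pieces * weight * EXPANSION_FACTOR * WEEKS_PER_YEAR
               for pieces, weight in diary_records)
```

As a sanity check, if every one of the 5,131 diary households reported one piece with a weight of 1.0, the estimate would equal 121,074,207 households times 52 weeks.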
















Adjustment Factors


In order to account for variations in the reporting of household mail volumes, three types of adjustment factors were used:

1)       Destination adjustment factors;

2)       Household-to-Household adjustment factors; and

3)       Household-to-Non-household adjustment factors.

Destination adjustment factors were based on an average of historical ratios of volumes derived from FY 2012 HDS sample data and mailing volumes reported in the Postal Service’s Revenue, Pieces, and Weight (RPW) report. These destination adjustment factors were applied to First-Class Mail, Standard and Nonprofit Mail, Package and Shipping Services, and Periodicals.

Household-to-household adjustment factors were applied on the logic that mail originating and destinating in households forms a “closed loop”: mail sent by households to households should equal mail received by households from households. (This equality does not necessarily hold within a finite sample, since households may receive mail from households outside the sample.) Household mail sent is therefore adjusted to equal household mail received. This factor (1.19) was applied to personal First-Class Mail.
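Under the closed-loop logic, the adjustment factor is the ratio of household-to-household mail received to household-to-household mail sent. The sketch below shows that ratio with made-up volumes chosen to give 1.19; the function names and input figures are hypothetical, and the report does not publish the underlying volumes.

```python
def closed_loop_factor(received_from_households, sent_to_households):
    """Factor that scales household mail sent up to household mail received."""
    if sent_to_households <= 0:
        raise ValueError("sent_to_households must be positive")
    return received_from_households / sent_to_households


def adjust_sent(sent_to_households, factor):
    """Apply the household-to-household factor to reported mail sent."""
    return sent_to_households * factor


# Hypothetical volumes (any common unit) that yield the reported factor.
factor = closed_loop_factor(119.0, 100.0)
adjusted = adjust_sent(100.0, factor)
```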

Household-to-non-household adjustment factors were applied to account for under-reporting of mail sent by households to non-households. The use of this adjustment factor is based on a comparison between the reported bills paid by households from the recruitment phase of the survey and amounts derived from actual diary data. This factor (1.41) was applied to business First-Class Mail sent by households to non-households.

The following table indicates adjustment factors applied by postal classification.


 

Table B.13:
HDS 2012 Adjustment Factors Utilized by Postal Classification

Postal Classification         Destination Adjustment Factor   Household-to-Household   Household-to-Non-household
First-Class                                            0.92                     1.19                         1.41
Standard Regular                                       0.88                      N/A                          N/A
Standard Nonprofit                                     0.88                      N/A                          N/A
Package & Shipping Services                            0.75                      N/A                          N/A
Expedited                                              0.79                      N/A                          N/A
Periodicals                                            0.83                      N/A                          N/A

 

 



[1] Metropolitan area is defined within the sample according to the official definition used by the U.S. Census Bureau, commonly referred to as Metropolitan Statistical Areas (MSAs). Metropolitan areas are defined as single- or multi-county areas. Non-metropolitan areas are counties that do not belong to a metropolitan area. Each sample county was assigned to a stratum according to its metropolitan status.