National Water-Quality Assessment (NAWQA) Project

Pesticide National Synthesis Project

Analysis of Environmental Data with Censored Observations

Shiping Liu, Jye-Chyi Lu, Dana W. Kolpin, William Q. Meeker


Table 1. Statistical Summary for Concentrations above the 0.05 µg/L Analytical Reporting Limit (total samples = 589).

            Number  Mean   Std Dev  Minimum   Maximum 
Atrazine     101    0.298   0.399    0.050     2.100 
DEA          106    0.211   0.334    0.050     2.300 
DIA           32    0.218   0.230    0.050     1.170
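
The counts in Table 1 can be turned into detection and censoring frequencies, which show how heavily censored the data are. The short Python sketch below does that arithmetic; it assumes the "Number" column counts samples at or above the reporting limit out of the 589 total, and the function name `detection_summary` is hypothetical.

```python
TOTAL_SAMPLES = 589  # total samples, from the Table 1 caption

# "Number" column of Table 1: samples at or above the 0.05 µg/L reporting limit
detections = {"Atrazine": 101, "DEA": 106, "DIA": 32}


def detection_summary(n_detected, n_total):
    """Return (detection frequency, censored fraction) as proportions."""
    freq = n_detected / n_total
    return freq, 1.0 - freq


for compound, n in detections.items():
    freq, censored = detection_summary(n, TOTAL_SAMPLES)
    print(f"{compound}: {freq:.1%} detected, {censored:.1%} censored")
```

Under that reading, roughly 17% of atrazine samples, 18% of DEA samples, and 5% of DIA samples are detections; the rest are censored.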

Table 2. Sample Statistics for Explanatory Factors Used in the Statistical Analysis.

             Variable                                  Mean       Std. Dev. 
Percent of urban residential within 3.2 km (URBAN)     8.661       12.822 
Depth to top of aquifer (DEPTH) (m)                    5.488        5.089 
Median of well open interval (OPEN) (m)               26.956       25.086 
Percent of pasture within 3.2 km (PASTURE)             8.417       13.045 
Percent of forest within 3.2 km (FOREST)              10.265       10.913

       Dummy variables (yes = 1; no = 0)
Irrigation within 3.2 km (IRRID)                       0.301
Chemical facility within 3.2 km (CHEM)                 0.147
Golf course within 3.2 km (GOLF)                       0.130
Primary water use is domestic (USEHD)                  0.501
Primary water use is public supply (USEPD)             0.301
Well is unused (USEUD)                                 0.112
Aquifer class is unconsolidated (CLASSD)               0.653
Aquifer type is unconfined (TYPED)                     0.676
Feedlot within 30 m (FEED1D)                           0.073
Feedlot within 30 m-0.4 km (FEED2D)                    0.227
Feedlot within 0.4-3.2 km (FEED3D)                     0.316
Stream within 30 m (STR1D)                             0.077
Stream within 30 m-0.4 km (STR2D)                      0.397
Stream within 0.4-3.2 km (STR3D)                       0.569
Sample in July or August (SUMMER)                      0.499
Sample size                                          589

Table 3. Estimated Parameters of Censored Regression Model for Atrazine

             Parameter   Standard        Chi² Test
Variable     Estimate      Error       Chi²    Pr > Chi

INTERCPT     -5.308       0.520      104.048   0.000
USED¹                                          0.000
USEHD        -0.776       0.438        3.141   0.076
USEPD         0.692       0.436        2.513   0.113
USEUD        -1.079       0.571        3.578   0.059
STR1D         1.009       0.420        5.770   0.016
TYPED         0.779       0.286        7.436   0.006
FOREST       -0.036       0.014        6.321   0.012
STR2D         0.540       0.273        3.914   0.048
URBAN        -0.021       0.011        3.802   0.051
SUMMER        0.447       0.248        3.247   0.072
SCALE         1.979       0.163

¹ The significance level of USEHD, USEPD, USEUD, and USEOD was determined using a Wald-type statistic, which is compared to a Chi² distribution with 1 degree of freedom. The significance level used in this case is 0.10.
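
The tabulated per-coefficient Chi² values appear consistent with the single-coefficient Wald form, (estimate / standard error)², referred to a Chi² distribution with 1 degree of freedom. The Python sketch below illustrates that calculation for the STR1D row of Table 3. It assumes this Wald form; the function names are hypothetical and this is not the authors' SAS code.

```python
import math


def wald_chi2(estimate, std_error):
    """Wald chi-square statistic for one coefficient: (estimate / SE)^2."""
    return (estimate / std_error) ** 2


def chi2_1df_pvalue(stat):
    """Upper-tail probability for a chi-square variate with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# STR1D row of Table 3 (atrazine): estimate 1.009, standard error 0.420
stat = wald_chi2(1.009, 0.420)
p_value = chi2_1df_pvalue(stat)
significant = p_value < 0.10  # significance level stated in the footnote

print(f"chi2 = {stat:.3f}, p = {p_value:.3f}, significant at 0.10: {significant}")
```

Because the inputs are the rounded table values, the result differs from the tabulated 5.770 only in the third decimal place, and the p-value agrees with the tabulated 0.016.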

Table 4. Estimated Parameters of Censored Regression Model for DEA

             Parameter   Standard        Chi² Test
Variable     Estimate      Error       Chi²    Pr > Chi

INTERCPT     -4.842       0.530       83.385   0.000
FOREST       -0.046       0.012       14.486   0.000
TYPED         0.831       0.244       11.598   0.000
USED¹                                          0.035
USEHD         0.138       0.399        0.119   0.730
USEPD         0.795       0.398        3.995   0.046
USEUD         0.436       0.462        0.893   0.345
STR1D         0.753       0.327        5.310   0.021
SUMMER        0.399       0.198        4.056   0.044
STR3D         0.435       0.208        4.366   0.037
CLASSD       -0.726       0.287        6.403   0.011
OPEN         -0.003       0.002        3.682   0.055
SCALE         1.623       0.131

¹ The significance level of USEHD, USEPD, USEUD, and USEOD was determined using a Wald-type statistic, which is compared to a Chi² distribution with 1 degree of freedom. The significance level used in this case is 0.10.

Table 5. Estimated Parameters of Censored Regression Model for DIA

             Parameter   Standard        Chi² Test
Variable     Estimate      Error       Chi²    Pr > Chi

INTERCPT     -5.539       1.236       20.072   0.000
USED¹                                          0.027
USEHD        -0.062       0.869        0.005   0.943
USEPD         1.707       0.848        4.054   0.044
USEUD         1.421       0.900        2.491   0.115
STR1D         2.094       0.585       12.831   0.000
OPEN         -0.027       0.009        8.549   0.004
FOREST       -0.085       0.032        7.243   0.007
CLASSD       -2.328       0.807        8.329   0.004
TYPED         1.339       0.565        5.622   0.018
STR3D         0.989       0.454        4.741   0.030
SUMMER        0.798       0.408        3.823   0.051
SCALE         1.972       0.300

¹ The significance level of USEHD, USEPD, USEUD, and USEOD was determined using a Wald-type statistic, which is compared to a Chi² distribution with 1 degree of freedom. The significance level used in this case is 0.10.

Table 6. Statistical Summary for Concentrations (in µg/L) of Pseudo-Complete Data

Compound           Sample Size   Mean¹    Std Dev¹    Minimum¹        Maximum¹
Atrazine Residue       589       0.119     0.358      1.388 × 10⁻³     4.480
Atrazine               589       0.057     0.198      7.048 × 10⁻⁵     2.100
DEA                    589       0.046     0.161      1.274 × 10⁻⁴     2.300
DIA                    589       0.016     0.072      9.576 × 10⁻⁹     1.170

¹ The values are averaged across five imputations. The variation among imputations for each statistic is limited, indicating stability in our models.
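
The footnote describes averaging summary statistics over five imputed (pseudo-complete) data sets. The Python sketch below illustrates the idea under one loudly stated assumption: censored values are drawn from a lognormal distribution truncated above at the 0.05 µg/L reporting limit. The tables do not state the imputation distribution, so this is an illustration rather than the authors' procedure. The function names, the single location and scale pair (`mu`, `sigma`), and the example concentrations are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

REPORTING_LIMIT = 0.05  # µg/L, analytical reporting limit from Table 1


def impute_censored(mu, sigma, rng):
    """Draw one value at or below the reporting limit by inverse-CDF sampling.

    Assumes log-concentration is Normal(mu, sigma), truncated above at
    log(REPORTING_LIMIT).
    """
    dist = NormalDist(mu, sigma)
    p_limit = dist.cdf(math.log(REPORTING_LIMIT))
    u = p_limit * (1.0 - rng.random())  # uniform on (0, p_limit]
    return math.exp(dist.inv_cdf(u))


def pseudo_complete_mean(observed, n_censored, mu, sigma, n_imputations=5, seed=0):
    """Average the sample mean across several pseudo-complete data sets."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_imputations):
        imputed = [impute_censored(mu, sigma, rng) for _ in range(n_censored)]
        means.append(mean(observed + imputed))
    return mean(means)


# Made-up illustration: 3 detections, 7 censored samples, hypothetical mu and sigma
example = pseudo_complete_mean([0.06, 0.12, 0.50], 7, mu=-5.0, sigma=2.0)
print(f"mean across imputations: {example:.4f} µg/L")
```

In a regression setting, `mu` would presumably vary by well according to the fitted covariates, and other statistics (standard deviation, minimum, maximum) could be averaged the same way.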

Table 7. Estimated Parameters of Final Regression Model for Atrazine Residue

        Parameter Estimate      Significant At¹
Variable      Sign                *: at 0.05
                                 **: at 0.10

OPEN            -                   * 
URBAN           -
FOREST          -                   * 
CLASSD          -                  ** 
TYPED           +                   * 
STR1D           +                   * 
STR3D           +                   * 
SUMMER          +                  ** 
IRRID           +                   * 
USEPD           +                   *  

¹ The variables selected and their associated significance levels are based on the pseudo-complete data set. Because the likelihood is very complicated, it is impractical to use the large-sample normal approximation or finite-sample simulation to obtain the exact significance levels. The significance levels reported in the SAS output and given in this table are not exact, because the censored data were imputed. The approach is nonetheless practical: it lets most readers use typical regression procedures to analyze censored data of the type addressed in this paper. Moreover, all five pseudo-complete data sets led to the same selected variables, and their significance levels were all similar. Thus, the proposed procedure provides a reasonable, practical solution to the problem addressed in this demonstration.

For questions or comments about this document,
please contact Dana W. Kolpin.

Page Last Modified: Tuesday, 04-Mar-2014 14:44:25 EST