WRAP IMPROVE Data Substitutions (June 2011)
To track progress under the EPA’s Regional Haze Rule (RHR), states and tribes use speciated aerosol measurements collected by the Interagency Monitoring of Protected Visual Environments (IMPROVE) program. RHR guidance outlines data completeness requirements designed to balance the need for data from individual days, seasons, and years to be reasonably representative of ambient aerosol concentrations at each monitoring site. For sites with incomplete data during the baseline years (fewer than 3 complete years), appropriate tracking metrics cannot be calculated. The WRAP, working with individual states, developed additional data substitution methods for sites that did not have the required baseline data. These methods were also applied at sites where completing otherwise incomplete years was desirable for modeling and planning purposes. The additional substitutions included estimating missing species from other on-site measurements and appropriately scaling data collected at nearby donor sites that showed favorable long-term comparisons.
RHR Requirements
Regional Haze Rule (RHR) guidance outlines IMPROVE aerosol data completeness requirements including the following conditions:
• Individual samples must contain all species required for the calculation of light extinction (sulfate, nitrate, organic carbon, elemental carbon, soil, coarse mass, and, for the new IMPROVE algorithm, chloride or chlorine).
• Individual seasons must contain at least 50% of all possible daily samples.
• Individual years must contain at least 75% of all possible daily samples.
• Individual years must not contain more than 10 consecutive missing daily samples.
• The baseline period (2000-2004) must contain at least 3 complete years of data.
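The conditions above can be read as a simple check on one year of samples. The following is a minimal sketch in Python, not the procedure used by EPA or the IMPROVE program. It assumes that seasons are calendar quarters, that the scheduled sample dates are supplied by the caller, and that a "valid" sample is one containing every required species. All function and variable names are hypothetical.

```python
from datetime import date, timedelta


def quarter(d):
    """Calendar quarter (1-4), used here as a stand-in for 'season'."""
    return (d.month - 1) // 3 + 1


def longest_missing_run(scheduled, valid):
    """Longest run of consecutive scheduled samples that are not valid."""
    longest = run = 0
    for d in scheduled:
        run = 0 if d in valid else run + 1
        longest = max(longest, run)
    return longest


def year_is_complete(scheduled, valid):
    """Apply the year-level conditions to one year of samples.

    scheduled: sorted list of dates on which a sample was scheduled
    valid:     set of dates whose sample has every required species
    """
    # At least 75% of all possible samples in the year.
    if len(valid & set(scheduled)) < 0.75 * len(scheduled):
        return False
    # At least 50% of all possible samples in each season.
    for q in range(1, 5):
        in_q = [d for d in scheduled if quarter(d) == q]
        got = sum(1 for d in in_q if d in valid)
        if in_q and got < 0.5 * len(in_q):
            return False
    # No more than 10 consecutive missing samples.
    return longest_missing_run(scheduled, valid) <= 10


def baseline_is_complete(complete_years):
    """The 2000-2004 baseline needs at least 3 complete years."""
    return sum(1 for y in range(2000, 2005) if complete_years.get(y)) >= 3


# Hypothetical schedule: one sample every third day in 2002.
scheduled = [date(2002, 1, 1) + timedelta(days=3 * i) for i in range(122)]
valid = set(scheduled[12:])  # the first 12 scheduled samples are missing
print(year_is_complete(scheduled, valid))  # False: the 12-sample gap exceeds 10
```

In this made-up example the year passes the 75% and 50% tests but fails on the run of 12 consecutive missing samples, which shows that the conditions are independent of one another.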
RHR guidance also provides two methods to fill in missing data under specific circumstances. These methods are routinely applied to IMPROVE data and include:
• The use of a surrogate in the data set:
  ◦ Total sulfate is generally determined as 3 times the sulfur measured on the A module filter. If sulfur is missing, the sulfur measurement from the B module filter is used to calculate sulfate.
  ◦ For the new IMPROVE algorithm, sea salt is calculated from chloride measured on the B module filter. If chloride is missing or below the detection limit, the chlorine measurement from the A module filter is used to calculate sea salt.
• The “patching” of missing data, as described by the RHR guidance:
  ◦ Missing samples not substituted using a surrogate as described above can be patched, or replaced, by a seasonal average if the patching exercise passes a series of tests outlined in the guidance document.
Once these methods have been applied to the data, the resulting complete years are eligible for use in calculation of the baseline conditions and tracking progress under the Regional Haze Rule. Further details on these requirements can be found in the RHR guidance document for tracking progress: http://www.epa.gov/ttn/oarpg/t1/memoranda/rh_tpurhr_gd.pdf.
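The routine surrogate rules described above amount to a fallback order between modules. The sketch below illustrates that order only. It assumes that the module B value is expressed as sulfur, as the text states, so that the same factor of 3 applies. The `detection_limit` parameter and all function names are hypothetical, and the conversion from chloride or chlorine to sea salt mass is left out because this document does not give it.

```python
def sulfate_from_sulfur(s_module_a, s_module_b):
    """Total sulfate as 3 x sulfur, preferring module A, else module B.

    Assumption: the module B value is expressed as sulfur, so the same
    factor of 3 is applied to either measurement.
    """
    sulfur = s_module_a if s_module_a is not None else s_module_b
    return None if sulfur is None else 3.0 * sulfur


def sea_salt_input(chloride_b, chlorine_a, detection_limit):
    """Pick the measurement that the sea salt calculation would start from.

    Returns (value, source). Chloride from module B is preferred; chlorine
    from module A is used when chloride is missing or below the
    (hypothetical) detection limit.
    """
    if chloride_b is not None and chloride_b >= detection_limit:
        return chloride_b, "chloride_B"
    if chlorine_a is not None:
        return chlorine_a, "chlorine_A"
    return None, None


print(sulfate_from_sulfur(None, 0.4))
print(sea_salt_input(0.005, 0.02, detection_limit=0.01))
```

A sample for which neither module provides a usable value returns `None` and would remain a candidate for patching.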
Additional Data Substitution Methods
After RHR-prescribed data substitutions were made, some IMPROVE monitoring sites still failed to meet the RHR data completeness requirements for the 2000-2004 baseline period. Additionally, some sites that met the RHR requirements were missing years that were desirable for planning and modeling purposes. In particular, a complete year of data for 2002 was required because that was the year selected for regional modeling and used to predict visibility metrics in 2018.
The WRAP, in consultation with individual states, developed additional data substitution methods to complete the desired years of data at ten (10) WRAP sites. The starting data set was the RHR IMPROVE data using the “Revised IMPROVE Algorithm,” updated March 2006 (http://vista.cira.colostate.edu/views/Web/IMPROVE/SummaryData.aspx). This data set includes the routine surrogate and patched data substitutions allowed by RHR guidance. Only years deemed incomplete under RHR guidance were candidates for additional data substitutions. Years deemed complete were not changed, even though there may have been missing samples during those years.
The first of the additional substitution methods used organic hydrogen as a surrogate for organic carbon, and the resulting organic carbon as a surrogate for elemental carbon. If the carbon data substitution was not sufficient to complete the required years, measured mass for individual species from nearby IMPROVE sites with favorable long-term comparisons was scaled appropriately and used as a surrogate. IMPROVE donor sites were selected in consultation with individual states. All substitutions were made using quarter-specific Kendall-Theil linear regression statistics. These statistics were chosen because they are more resistant to outliers than standard linear least squares statistics.
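To illustrate why a Kendall-Theil fit resists outliers, the sketch below computes the slope as the median of all pairwise slopes. The intercept is taken as median(y) minus slope times median(x), which is one common convention; this document does not say which intercept definition WRAP used, so treat that choice as an assumption. The function name and the sample data are hypothetical.

```python
from itertools import combinations
from statistics import median


def kendall_theil_line(x, y):
    """Return (slope, intercept) of a Kendall-Theil robust line.

    slope     = median of the slopes between every pair of points
    intercept = median(y) - slope * median(x)   (assumed convention)
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
        if x2 != x1  # skip pairs with identical x (undefined slope)
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    intercept = median(y) - slope * median(x)
    return slope, intercept


# Made-up data: four points on y = 2x plus one extreme outlier.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 100]
print(kendall_theil_line(x, y))  # (2.0, 0.0)
```

The single outlier changes four of the ten pairwise slopes but leaves the median at 2, whereas an ordinary least squares fit to the same points would be pulled strongly toward the outlier.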
Figure 1 presents a flow chart of the WRAP data substitution methods. These methods are described in detail below.
Figure 1. Flow chart of data substitution methods used.
Carbon Substitutions
The first substitution method relied on a surrogate for carbon mass measurements when C module data were not available. Hydrogen (H) is measured on the A module filter, and is assumed to be primarily associated with organic carbon and inorganic compounds such as ammonium sulfate. Therefore, organic carbon (OC) can be estimated using the historical comparison between estimated organic H and OC. Organic H is estimated by subtracting the portion of H that is assumed to be associated with the inorganic compounds from the total H (Org_H = H – 0.24×S).
Figure 2 presents a sample comparison for data collected at the
Figure 2. Comparison of OC and estimated organic H, and EC and OC at Tonto National Monument, AZ, using second quarter raw OC and organic H data, 2000-2004.
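The carbon substitution can be sketched as two chained linear estimates: OC from organic H, then EC from the resulting OC. The example below is illustrative only. The formula for organic H is taken from the text, but the regression direction (OC on organic H, EC on OC), the function names, and the slope and intercept values are assumptions, not WRAP's fitted quarter-specific coefficients.

```python
def organic_hydrogen(h, s):
    """Org_H = H - 0.24 x S, as given in the text."""
    return h - 0.24 * s


def estimate_oc(h, s, slope, intercept):
    """Estimate OC from organic H with a quarter-specific regression line."""
    return slope * organic_hydrogen(h, s) + intercept


def estimate_ec(oc, slope, intercept):
    """Estimate EC from (measured or estimated) OC with a regression line."""
    return slope * oc + intercept


# Hypothetical module A measurements and regression coefficients.
h, s = 0.30, 0.50
oc = estimate_oc(h, s, slope=10.0, intercept=0.1)
ec = estimate_ec(oc, slope=0.2, intercept=0.0)
print(round(organic_hydrogen(h, s), 3), round(oc, 3), round(ec, 3))
```

In practice a separate (slope, intercept) pair would be looked up for each site and calendar quarter before these functions are called.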
Donor Site Substitutions
In the WRAP region, the carbon data substitution methods were not sufficient to complete the required years. A second method involved identification of another nearby IMPROVE site which had favorable long-term comparisons and similar regional characteristics to be used as a donor site. Candidate sites were identified, and final donor sites for surrogate mass were selected in consultation with states.
Figure 3 presents a sample inter-site mass comparison by species for data collected during the second quarter, 2000-2004, between the
Figure 3. Comparison of aerosol species mass between
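Scaling donor-site data can be pictured as applying a per-quarter, per-species regression line to each donor measurement. The sketch below is a hedged illustration under that reading. The coefficient table, its keys, and the function names are hypothetical, and the numbers are invented for the example rather than taken from any WRAP comparison.

```python
from datetime import date


def quarter(d):
    """Calendar quarter (1-4) of a sample date."""
    return (d.month - 1) // 3 + 1


def scale_donor_sample(sample_date, donor_mass, coeffs):
    """Scale donor-site species mass to the receiving site.

    donor_mass: dict of species -> mass measured at the donor site
    coeffs:     dict of (quarter, species) -> (slope, intercept)
    Species with no donor value or no coefficients are left out.
    """
    q = quarter(sample_date)
    scaled = {}
    for species, mass in donor_mass.items():
        if mass is None or (q, species) not in coeffs:
            continue
        slope, intercept = coeffs[(q, species)]
        scaled[species] = slope * mass + intercept
    return scaled


# Invented second-quarter coefficients for a single species.
coeffs = {(2, "soil"): (0.8, 0.05)}
donor = {"soil": 1.0, "coarse_mass": 3.0}
print(scale_donor_sample(date(2002, 5, 14), donor, coeffs))
```

Only species that are actually missing at the receiving site would be filled from the returned values, so measured data are never overwritten.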
Data Completeness Following Substitutions
The years at each site requiring some degree of substitution are presented in Table 1, where a “2” in one of the year fields indicates a substituted year, a “1” indicates the year was already complete under RHR guidelines, and dashes indicate the year did not meet RHR guidelines and no additional substitutions were made. The table also lists sites that were selected as donor sites. The minimum data requirement of three complete years was met for each site, and additional substitutions beyond these requirements were made when deemed appropriate by individual states.
Table 2 presents each site with the number of days substituted per year, with a percentage breakdown by method and species. The carbon substitution method was not sufficient to complete years at any of these sites, so the donor site method was also applied. Initially complete years were not changed, even though there may have been missing samples during those years. Multiple factors contributed to missing data at these sites, including sampler installation late in the baseline period, the clogging of some modules (especially during fire events), and various equipment failures. In some cases, most individual species were available at a site, and substitution of only minor components was required to complete individual days.
Figures 4 and 5 present bar charts representing substituted data for the
Availability and Archival of Data Sets
These data have been integrated into the WRAP Web-based Technical Support System (
Table 1
Sites and Years Where Additional Data Substitutions Were Applied
State | Site   | Donor Site | 2000 | 2001 | 2002 | 2003 | 2004
AZ    | BALD1  | TONT1      | --   | 2    | 2    | 1    | 1
AZ    | TONT1* | SIAN1      | --   | 1    | 2    | 1    | 1
CA    | KAIS1  | YOSE1      | --   | --   | 2    | 1    | 1
CA    | RAFA1  | PINN1      | 2    | 2    | 2    | 1    | 1
CA    | SEQU1* | DOME1      | 1    | 1    | 2    | 2    | 1
CA    | TRIN1* | LAVO1      | --   | 1    | 2    | 1    | 1
MT    | GLAC1* | FLAT1      | 1    | 1    | 2    | 2    | 1
ND    | THRO*  | MELA1      | 2    | 1    | 1    | 1    | 1
UT    | CAPI1  | CANY1      | 2    | 2    | 2    | 1    | 1
WA    | NOCA1  | SNPA1      | --   | 1    | 1    | 2    | 2
-- indicates an incomplete year with no substitutions made
1 indicates a complete RHR year
2 indicates a year is considered complete with some substituted values
* Sufficient RHR baseline data, but additional years were substituted for planning and modeling purposes.
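The year codes in Table 1 can be tallied to confirm the three-complete-year minimum. The sketch below transcribes the codes from Table 1 (without the asterisks) and counts complete years with and without the additional substitutions. The dictionary layout and function name are illustrative choices, not part of the WRAP data set.

```python
# Year codes for 2000-2004, transcribed from Table 1.
TABLE_1 = {
    "BALD1": ["--", "2", "2", "1", "1"],
    "TONT1": ["--", "1", "2", "1", "1"],
    "KAIS1": ["--", "--", "2", "1", "1"],
    "RAFA1": ["2", "2", "2", "1", "1"],
    "SEQU1": ["1", "1", "2", "2", "1"],
    "TRIN1": ["--", "1", "2", "1", "1"],
    "GLAC1": ["1", "1", "2", "2", "1"],
    "THRO": ["2", "1", "1", "1", "1"],
    "CAPI1": ["2", "2", "2", "1", "1"],
    "NOCA1": ["--", "1", "1", "2", "2"],
}


def complete_years(codes, include_substituted=True):
    """Count complete years; code '2' counts only if substitutions are included."""
    accepted = {"1", "2"} if include_substituted else {"1"}
    return sum(1 for code in codes if code in accepted)


# Every site reaches at least 3 complete years once substitutions are counted.
all_met = all(complete_years(c) >= 3 for c in TABLE_1.values())

# Sites that fall short of 3 complete years without the additional substitutions.
needed_substitution = [
    site for site, c in TABLE_1.items()
    if complete_years(c, include_substituted=False) < 3
]
print(all_met, needed_substitution)
```

The sites returned in `needed_substitution` are exactly the ones without an asterisk in Table 1, which matches the footnote stating that the starred sites already had sufficient RHR baseline data.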
Table 2
Number of Days Substituted, and Percent Substituted Days by Method and by Species
      |       |      |             | Carbon Subs. | Donor Site Substitutions
State | Site  | Year | # Days Sub. | OC/EC        | Amm. SO4 | Amm. NO3 | OC/EC | Soil | CM   | Sea Salt
AZ    | BALD1 | 2001 | 25          | 4%           | 92%      | 92%      | 92%   | 92%  | 96%  | 92%
AZ    | BALD1 | 2002 | 21          | 57%          | 95%      | 95%      | 38%   | 100% | 100% | 100%
AZ    | TONT1 | 2002 | 14          | 93%          | --       | --       | --    | 7%   | 57%  | 7%
CA    | KAIS1 | 2002 | 33          | --           | 91%      | 91%      | 94%   | 97%  | 97%  | 97%
CA    | RAFA1 | 2000 | 28          | --           | 86%      | 86%      | 86%   | 100% | 100% | 100%
CA    | RAFA1 | 2001 | 33          | --           | 88%      | 94%      | 88%   | 85%  | 91%  | 85%
CA    | RAFA1 | 2002 | 21          | --           | 76%      | 76%      | 76%   | 86%  | 100% | 86%
CA    | SEQU1 | 2002 | 17          | --           | 100%     | 100%     | 100%  | 71%  | 71%  | 71%
CA    | SEQU1 | 2003 | 35          | 20%          | 69%      | 69%      | 69%   | 66%  | 94%  | 66%
CA    | TRIN1 | 2002 | 30          | 3%           | 67%      | 83%      | 67%   | 80%  | 83%  | 80%
MT    | GLAC1 | 2002 | 21          | --           | 38%      | 95%      | 29%   | 38%  | 43%  | 38%
MT    | GLAC1 | 2003 | 18          | --           | 61%      | 67%      | 44%   | 78%  | 94%  | 78%
ND    | THRO  | 2002 | 12          | 17%          | 83%      | 83%      | 83%   | 83%  | 83%  | 83%
UT    | CAPI1 | 2000 | 36          | --           | 100%     | 94%      | 97%   | 100% | 100% | 94%
UT    | CAPI1 | 2001 | 60          | --           | 80%      | 80%      | 80%   | 82%  | 97%  | 82%
UT    | CAPI1 | 2002 | 32          | --           | 100%     | 100%     | 100%  | 100% | 84%  | 100%
WA    | NOCA1 | 2003 | 30          | --           | 80%      | 80%      | 77%   | 100% | 100% | 100%
WA    | NOCA1 | 2004 | 33          | --           | 82%      | 82%      | 94%   | 100% | 94%  | 100%
Figure 4. 2002 annual bar chart for RAFA1 site indicating substituted data in species-specific colors, and original RHR data in blue.
Figure 5. 2002 annual bar chart for RAFA1 site indicating full speciation of RHR data combined with substituted data.