Online Documentation Source for the FED System

Importing the IMPROVE Regional Haze Rule Metrics II dataset

  1. Prepare a file system location for storing the source data
    1. Using Windows Explorer, browse to the \\VADER\SourceData\IMPROVE\RHR_And_Impairment\RHR2 folder.
    2. Create a date-stamped subfolder to contain the data files being downloaded, using the format "R<YYYYMMDD>" to name the folder. For example, an "R20170814" folder would be created for data obtained on August 14, 2017. The "R" stands for "Received" or "Receipt". A subfolder named "Current" can also be used to contain the most recently downloaded file set regardless of the specific date.
  2. Acquire the updated RHR Metrics source data
    1. Visit Scott Copeland's RHR2 Google Drive folder and select all the files listed
    2. Right-click on the selected files and choose "Download" from the Google Drive context menu. The files will be automatically zipped into a single *.zip file and downloaded to your browser's default download folder.
    3. Copy the downloaded *.zip file to the file system folder created above
    4. Right-click on the copied *.zip file and select "Extract to here..." (or similar, depending on your un-zip utility).
  3. Import the Daily Budgets dataset
    USE AIRDATA_IMPORT
    GO
    
    EXEC bspImportDatasetItem
    	@SourceName = 'INA_IMPAIRMENT_DAILY_BUDGETS_ALL_R20170830',
    	@SourceFormat = 'CSV File',
    	@SourceLastModifiedDate = '8/30/2017',
    	@SourceAcquisitionDate = '9/15/2017',
    	@SourceDataUri = 'https://drive.google.com/drive/folders/0Bxfj1vyyXeDYWVpfeUo4NEYtTU0',
    	@LocalDataUri = '\\VADER\SourceData\IMPROVE\RHR_And_Impairment\RHR3_Impairment\R20170915\sia_impairment_daily_budgets_8_17.csv',
    	@DatasetItemName ='INA_IMPAIRMENT_DAILY_BUDGETS_ALL_R20170830',
    	@DatasetID = 10006,
    	@ProtocolID = 1006,
    	@ImportedBy = 'SEM',
    	@ImportLog = '\\JARVIS\E\DevSystems\EDMF\Edmf.Database\Data Import\IMPROVE\Impairment_Copeland\_ImportLogs\bssImportMaster_IMPIMP_All_Years_Initial_20170921.sql',
    	@Comments = 'Re-import of the updated impairment dataset after Scott Copeland''s revisions; See his email of Thu 8/31/2017 10:22 AM',
    	@ArgKeys = '@SourceFile',
    	@ArgValues = 'E:\SourceData\IMPROVE\RHR_And_Impairment\RHR3_Impairment\R20170915\sia_impairment_daily_budgets_8_17.csv',
    	@ShowProgress = 1,
    	@AbortOnSiteMappingError = 1,
    	@AbortOnParameterMappingError = 1,
    	@AbortOnMethodMappingError = 1,
    	@AbortOnFlagMappingError = 1,
    	@KeepSource = 1,
    	@KeepTempTables = 1,
    	@UpdateStatistics = 1,
    	@OutputTableName ='AirFact_INA_IMPAIRMENT_DAILY_BUDGETS_ALL_R20170830'
    • The above bspImportDatasetItem() procedure serves a sort of "dispatch" function whereby it selects (based upon the ProtocolID argument) a protocol-specific subprocedure to invoke.
    • The subprocedure invoked in this case is bspImport_INA_RHR_DAILY(), which is specific to the format, schema, and content of the IMPROVE Regional Haze Rule Metrics Daily Budgets dataset.
    • The output of the protocol-specific procedure is a table named "AirFact_<Dataset_Item_Name>"that is in the FED integrated fact table format and can be transferred directly to the production fact table.
  4. Import the Group Means dataset
    USE AIRDATA_IMPORT
    GO
    
    EXEC bspImportDatasetItem
    	@SourceName = 'INA_IMPAIRMENT_GROUP_MEANS_ALL_R20170830',
    	@SourceFormat = 'CSV File',
    	@SourceLastModifiedDate = '8/30/2017',
    	@SourceAcquisitionDate = '9/15/2017',
    	@SourceDataUri = 'https://drive.google.com/drive/folders/0Bxfj1vyyXeDYWVpfeUo4NEYtTU0',
    	@LocalDataUri = '\\VADER\SourceData\IMPROVE\RHR_And_Impairment\RHR3_Impairment\R20170915\sia_impairment_group_means_8_17.csv',
    	@DatasetItemName ='INA_IMPAIRMENT_GROUP_MEANS_ALL_R20170830',
    	@DatasetID = 10006,
    	@ProtocolID = 1007,
    	@ImportedBy = 'SEM',
    	@ImportLog = '\\JARVIS\E\DevSystems\EDMF\Edmf.Database\Data Import\IMPROVE\Impairment_Copeland\_ImportLogs\bssImportMaster_IMPIMP_All_Years_Initial_20170921.sql',
    	@Comments = 'Re-import of the updated impairment dataset after revisions by Scott Copeland; See his email of Thu 8/31/2017 10:22 AM',
    	@ArgKeys = '@SourceFile',
    	@ArgValues = 'E:\SourceData\IMPROVE\RHR_And_Impairment\RHR3_Impairment\R20170915\sia_impairment_group_means_8_17.csv',
    	@ShowProgress = 1,
    	@AbortOnSiteMappingError = 1,
    	@AbortOnParameterMappingError = 1,
    	@AbortOnMethodMappingError = 1,
    	@AbortOnFlagMappingError = 1,
    	@KeepSource = 1,
    	@KeepTempTables = 1,
    	@UpdateStatistics = 1,
    	@OutputTableName ='AirFact_INA_IMPAIRMENT_GROUP_MEANS_ALL_R20170830'
    The bspImportDatasetItem() procedure performs actions that are common to all datasets and in turn executes another import procedure bspImport_INA_RHR_GROUPS() that is specific to the IMPROVE Regional Haze Rule Metrics Group Means dataset. The output of this step is a table named "AirFact_<DatasetName>" that is in the FED integrated fact table format and can be transferred directly to the production fact table.
    • The above bspImportDatasetItem() procedure serves a sort of "dispatch" function whereby it selects (based upon the ProtocolID argument) a protocol-specific subprocedure to invoke.
    • The subprocedure invoked in this case is bspImport_INA_RHR_GROUPS(), which is specific to the format, schema, and content of the IMPROVE Regional Haze Rule Metrics Daily Budgets dataset.
    • The output of the protocol-specific procedure is a table named "AirFact_<Dataset_Item_Name>"that is in the FED integrated fact table format and can be transferred directly to the production fact table.
  5. Delete from the production AirFact table the data (if any) that is being replaced.
    DELETE FROM AIRDATA_CORE.dbo.AirFact WHERE DatasetID = 10006 -- took 1:36:48 hours; (30642942 row(s) affected)
    
    -- ^^ The above command is an example of replacing the *entire* RHR2 dataset.

    In many (most) cases, new RHR2 data will simply be added to the production AirFact table, and no DELETE command will be necessary. But when updated data is intended to replace previous data, a date-range-specific DELETE command will be needed to do a more precision replacement.

    Below is an example of deleting a single year of previous data:

    DELETE FROM AIRDATA_CORE.dbo.AirFact WHERE DatasetID = 10006 AND YEAR(FactDate) = 2016

    And here is an example of deleting three months of data:

    DELETE FROM AirFact WHERE DatasetID = 10006 AND ((FactDate >= '6/1/2016') AND (FactDate < '9/1/2016')) -- Deleting 3 months of data (June, July, and August)
  6. Transfer the newly-imported data tables to the production AirFact table by following the instructions from the Exporting a table in AirFact format to the production AirFact table topic.
  7. Update the relevant metadata "link" tables in the production database.
    USE AIRDATA_CORE
    GO 
    
    EXEC bspUpdate_Dataset_Site @DatasetID=10006 -- took 9:31 minutes
    
    --Done deleting previous Dataset_Site records for DatasetID 10006 (195 records deleted); Elapsed time: 0 minutes;
    --Done inserting new Dataset_Site records for DatasetID 10006 (163 records inserted); Elapsed time: 9 minutes;
    
    EXEC bspUpdate_Dataset_Parameter @DatasetID=10006 -- took 2:32 minutes
    
    --Done deleting previous Dataset_Parameter records for DatasetID 10006 (102 records deleted); Elapsed time: 0 minutes;
    --Done inserting new Dataset_Parameter records for DatasetID 10006 (104 records inserted); Elapsed time: 2 minutes;
    
    EXEC bspUpdate_Dataset_Aggregation @DatasetID=10006 -- took 40 seconds
    
    --Done deleting previous Dataset_Aggregation records for DatasetID 10006 (7 records deleted); Elapsed time: 0 minutes;
    --Done inserting new Dataset_Aggregation records for DatasetID 10006 (7 records inserted); Elapsed time: 1 minutes;