6.3. Quick Start: Third Stage Cleaning and Converting to Ameriflux Output
Third Stage Cleaning
The third stage cleaning generally requires the least amount of work by the user, but is usually the most computationally intensive stage, as it includes running models for gap-filling fluxes and flux partitioning. The following example assumes you have already completed first and second stage cleaning for one site.
Open your site-specific `SITEID1_config.yml` for editing (figure 6.5):
Figure 6.5. Directory tree showing the location of the third stage custom YAML file that must be copied (green highlighted text) and edited (yellow highlighted text).
At the top of your site-specific configuration file (i.e., `SITEID1_config.yml`), enter the site ID, the year that measurements at the site began, and the metadata for your site (figure 6.6; yellow highlighted text):
Figure 6.6. Third stage site-specific custom YAML file showing which fields to edit in yellow highlighted text.
The main configuration file for third stage cleaning (`global_config.yml`) is located in the `TraceAnalysis_ini` directory and should generally not be edited. Use the custom `SITEID1_config.yml` file to add parameters or inputs; any site-specific setting that is also defined in `global_config.yml` overrides the global value.
Note on gap-filling FCH4: the predictors for all random forest gap-filling models are currently set to `Predictors: SW_IN_1_1_1,TA_1_1_1,VPD_1_1_1`. For FCH4, however, these inputs should be changed to prioritize soil variables such as soil temperature, soil moisture, and water table depth. You can change these settings under “Optional parameters” (figure 6.7; peach highlighting).
Figure 6.7. Third stage site-specific custom YAML file showing where to change inputs for FCH4 random forest gap-filling.
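As a rough illustration of the FCH4 note above, the changed predictor line might look like the fragment below. This is only a sketch: the surrounding keys under “Optional parameters” are left out because their nesting is shown in figure 6.7, and the variable names `TS_1_1_1`, `SWC_1_1_1`, and `WTD_1_1_1` are assumed examples that follow the same naming pattern as the defaults. Use whichever soil temperature, soil moisture, and water table depth variables actually exist in your second stage output.

```yaml
# Sketch only: parent keys are omitted; place this under the FCH4
# gap-filling entry in your SITEID1_config.yml (see figure 6.7).
# Default predictors for the random forest models:
#   Predictors: SW_IN_1_1_1,TA_1_1_1,VPD_1_1_1
# Hypothetical soil-focused predictors for FCH4 (assumed variable names):
Predictors: TS_1_1_1,SWC_1_1_1,WTD_1_1_1
```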
Next, test the third stage data cleaning in MATLAB; remember that it can take much longer to run than the first and second stages. Note that the cleaning stage argument for third stage cleaning is `7` (not 3; this is a legacy artifact), as follows:
fr_automated_cleaning(yearIn,siteID,7) % third stage
The output will appear in two directories, `ThirdStage` and `ThirdStage_Default_Ustar`, within the `Clean` directory, alongside the second stage output; again, we recommend that you inspect these data using the visualization tools.
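The third stage call can be wrapped in a short script, sketched below. The values assigned to `yearIn` and `siteID` are placeholders, and the sketch assumes `yearIn` is a numeric year and `siteID` is a character vector, as in the first and second stage calls. The timing lines are optional; they are included only because this stage can run for a long time.

```matlab
% Placeholder inputs (assumptions for illustration; substitute your own)
yearIn = 2023;        % year of data to clean
siteID = 'SITEID1';   % site ID matching your SITEID1_config.yml

% Third stage cleaning is requested with stage argument 7 (not 3)
tic;
fr_automated_cleaning(yearIn, siteID, 7);

% Report the elapsed time, since gap-filling and partitioning can be slow
fprintf('Third stage for %s (%d) finished in %.1f min\n', siteID, yearIn, toc/60);
```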
Third stage output: flux variable definitions
The standalone flux variable names (e.g., FCH4, FC, H, LE) are copied directly from the second stage output. Suffixes appended to a flux variable name indicate the algorithms that have been applied, sequentially, in the order in which the suffixes appear. For now, this description provides only definitions; more detailed information on each output variable will be provided on this webpage soon.
Suffix | Definition |
---|---|
_PI | Wind sector and precipitation filter |
_PI_SC | Plus storage flux correction |
_PI_SC_JSZ | Plus z-score filter |
_PI_SC_JSZ_MAD | Plus median absolute deviation (about the median) filter |
_PI_SC_JSZ_MAD_RP | Plus REddyProc applied (u-star filtering) |
This link provides descriptions of suffixes applied to REddyProc output, e.g., `_uStar`, `U95`, `_orig`, and `_f`.
Output your data to an Ameriflux CSV file
Finally, once you have inspected your clean data and are happy with your INI files, you can output the data to a CSV file formatted for submission to Ameriflux:
fr_automated_cleaning(yearIn,siteID,8) % Ameriflux CSV output
The output will appear in an `Ameriflux` directory within the `Clean` directory, alongside the second and third stage output.
To run all three cleaning stages and produce the Ameriflux output in a single call, pass the stage numbers as a vector:
fr_automated_cleaning(yearIn,siteID,[1 2 7 8]) % all three stages plus Ameriflux output
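If several years need processing, one cautious approach is to loop over them and call the function once per year, as sketched below. The year range and site ID are placeholders. The loop is used because this section does not state whether `fr_automated_cleaning` accepts a vector of years. The `try`/`catch` is a convenience that lets the remaining years run if one fails.

```matlab
% Placeholder inputs (assumptions for illustration; substitute your own)
yearsToRun = 2021:2023;   % hypothetical range of years
siteID     = 'SITEID1';   % site ID matching your SITEID1_config.yml

for yearIn = yearsToRun
    try
        % Stages 1, 2, and 7 (third stage), then 8 (Ameriflux CSV output)
        fr_automated_cleaning(yearIn, siteID, [1 2 7 8]);
    catch err
        % Report the failure and continue with the next year
        warning('Cleaning failed for %s %d: %s', siteID, yearIn, err.message);
    end
end
```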