4.4.   Quick Start Tutorial: Create Your Second Stage INI File for Data Cleaning

This section will show you how to create your second stage INI file, in order to carry out the second stage data cleaning. The following steps assume that you have already created a first stage INI file and run first stage cleaning for your flux site.

On this page:



Instructions to create second stage INI file

  1. Open your site-specific SITEID1_SecondStage.ini for editing.

  2. Near the top of the file, insert the full name of your site (more descriptive is better) and your site ID.

  3. The TEMPLATE_SecondStage.ini file that you downloaded and copied already includes a set of fundamental climate variables and the bare minimum flux variables needed to run the third stage. We advise testing the file with only these variables first. Note that even if you do not change the content of any traces during the second stage, you must still include any trace that you wish to be available for the third stage (and beyond, e.g., Ameriflux submission), i.e., they must be passed along the pipeline.

  4. Test the second stage data cleaning in Matlab, as follows:

     fr_automated_cleaning(yearIn,'SITEID1',2)

    Note the third and final input parameter is 2 instead of 1, denoting second stage. You should see a new Clean folder directly under the site ID directory in your Database, for whichever year(s) you ran it for. Within that Clean folder, there will be a SecondStage folder (figure 4.4A).

    DirectoryTree:DatabaseSecondStage

    Figure 4.4A. Directory tree showing location of data output (in this example, variable1, variable2, …) from second stage cleaning.

  5. You can add any other relevant variables into the second stage INI file. Again, use the visualization tools to check your data looks as expected, then you are ready to progress to third stage cleaning.



Quick summary of second stage features

  1. Combine multiple traces into one using the calc_avg_trace function. These traces must already have been defined in the first stage.
  2. Use the “Evaluate” option to operate on your input traces.
Example: the TA_1_1_1 trace you defined in the first stage has some missing data and there is a nearby reliable climate station with a long record. In the first stage you also defined (and filtered etc.) data from this climate station in a variable called TA_OTHER_SOURCE. Then in the second stage, you can use calc_avg_trace within the Evaluate parameter, as follows:
Evaluate = 'TA_1_1_1 = calc_avg_trace(clean_tv,TA_1_1_1,TA_OTHER_SOURCE,-1)';

The function regresses the two variables provided and uses the resulting linear fit to fill the gap(s) in the first variable. Note that fluxes are gap-filled in stage three.