SutraObsExtractor

SutraObsExtractor is a program for extracting simulated values from SUTRA output files at particular locations and times and printing them in a simple format. It can also create an instruction file for either PEST or UCODE. Together, these two functions can simplify the use of PEST or UCODE with SUTRA models.

The documentation for SUTRA versions 2 and 3 identifies several different types of output files. SutraObsExtractor can extract simulated values from the following file types:

‘OBC’ = “.obc” output file (observations)

‘BCOF’ = “.bcof” output file (specifications and results at fluid-source/sink nodes)

‘BCOP’ = “.bcop” output file (specifications and results at specified-pressure nodes)

‘BCOU’ = “.bcou” output file (specifications and results at specified-concentration/temperature nodes)

‘BCOPG’ = “.bcopg” output file (specifications and results at generalized-flow nodes)

‘BCOUG’ = “.bcoug” output file (specifications and results at generalized-transport nodes)

‘LKST’ = “.lkst” output file (lake stages)

SutraObsExtractor is run from the command line. The name of an input file must be supplied on the command line. There are three ways to supply the name of the input file.

SutraObsExtractor -f <filename>

SutraObsExtractor --file <filename>

SutraObsExtractor <filename>

where "<filename>" is the name of the file. Any file names that include whitespace must be enclosed in single or double quotation marks. Single or double quotation marks around other file names are optional.

The input file must contain several blocks. Each block begins with "BEGIN" followed by a keyword and ends with "END" followed by the same keyword. Keywords are case insensitive. However, in these instructions, keywords are always written in UPPER CASE letters. The keywords that identify sections are "OPTIONS", "OBSERVATION_FILES", "IDENTIFIERS", and "DERIVED_OBSERVATIONS". Any line that is empty or contains only whitespace characters is ignored. Any line whose first non-whitespace character is "#" is treated as a comment. Whitespace characters at the beginning of a line are ignored.
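The block rules above can be sketched in Python as follows. This is only an illustrative reader, not the parser used by SutraObsExtractor. The function name split_blocks is hypothetical, and the sketch assumes that each BEGIN or END line contains exactly two words.

```python
def split_blocks(text):
    """Split input-file text into a dict of {KEYWORD: [content lines]}.

    Hypothetical helper: it follows the rules stated in this topic
    (case-insensitive keywords, blank lines and '#' comments ignored,
    leading whitespace ignored) and nothing more.
    """
    blocks = {}
    current = None  # keyword of the block being read, or None between blocks
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are skipped
        words = line.split()
        head = words[0].upper()
        if current is None:
            # Outside a block, only "BEGIN <keyword>" is accepted.
            if head != "BEGIN" or len(words) != 2:
                raise ValueError(f"expected 'BEGIN <keyword>', got: {line}")
            current = words[1].upper()
            blocks[current] = []
        elif head == "END":
            # The END keyword must match the keyword of the open block.
            if len(words) != 2 or words[1].upper() != current:
                raise ValueError(f"expected 'END {current}', got: {line}")
            current = None
        else:
            blocks[current].append(line)
    if current is not None:
        raise ValueError(f"block {current} is never closed")
    return blocks
```

Keywords are stored in upper case so that "begin options" and "BEGIN OPTIONS" are treated alike, which matches the convention used in these instructions.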

OPTIONS Block:

Purpose:

The OPTIONS block is used for specifying the names of output files from SutraObsExtractor.

Structure:

BEGIN OPTIONS

[LISTING <filename>]

[INSTRUCTION <filename> [<instruction_file_type>]]

[VALUES <filename>]

END OPTIONS

Explanations:

Each non-blank and non-comment line in the OPTIONS section must begin with one of the following keywords: "LISTING", "VALUES", or "INSTRUCTION". Each of these keywords must be followed by a file name. The INSTRUCTION file name may be optionally followed by either "UCODE" or "PEST".

The LISTING file is optional. If specified, it contains a record of the steps taken during execution of SutraObsExtractor. It will end either with a line indicating that it terminated normally or with an error message. The LISTING file is useful for identifying errors in the input.

Either a VALUES or INSTRUCTION file is required. Typically, only one or the other is specified, but both may be specified in the same input file.

The VALUES file will contain the simulated values extracted from the SUTRA output file or files. Each line in the VALUES file will contain an observation name followed by the simulated value associated with that name.

The INSTRUCTION file contains instructions for either UCODE or PEST to extract simulated values from the VALUES file. The desired format may be specified with <instruction_file_type>.

<filename> is the name of a file. If the file name contains whitespace characters, the file name must be surrounded by double quotes.

<instruction_file_type> must be either "UCODE" or "PEST". It indicates whether the instruction file is designed to be used with the UCODE or PEST parameter estimation programs. If <instruction_file_type> is not specified, it defaults to PEST.
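As an illustration of the VALUES file layout described above (an observation name followed by its simulated value on each line), the following Python sketch reads such lines into a dictionary. The function name read_values is hypothetical, and the sketch assumes the two fields are separated by whitespace; the exact column layout written by SutraObsExtractor is not specified here.

```python
def read_values(lines):
    """Return {observation_name: simulated_value} from VALUES-style lines.

    Assumption: each non-blank line holds exactly two whitespace-separated
    fields, a name and a number such as 1.5E+003.
    """
    values = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # tolerate blank lines
        if len(parts) != 2:
            raise ValueError(f"expected '<name> <value>', got: {line!r}")
        name, number = parts
        values[name] = float(number)
    return values
```

With a real file, the function could be called as read_values(open(path)), where path is the name given on the VALUES line of the OPTIONS block.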

Examples:

BEGIN OPTIONS

     LISTING C:\ModelingTools\SutraObsExtractor\tests\SutraLake2.soeOut

     INSTRUCTION C:\ModelingTools\SutraObsExtractor\tests\SutraLake2.soeIns.txt

END OPTIONS

 

BEGIN OPTIONS

     LISTING SutraLake2.soeList

     VALUES SutraLake2.soeValues

END OPTIONS

OBSERVATION_FILES Block:

Purpose:

The OBSERVATION_FILES section is used to identify the files from which simulated values will be extracted.

Structure:

BEGIN OBSERVATION_FILES

FILENAME <filename> <file type>

[FILENAME <filename> <file type>]

[FILENAME <filename> <file type>]

...

[FILENAME <filename> <file type>]

END OBSERVATION_FILES

Explanations:

Each non-blank, non-comment line in the OBSERVATION_FILES group must begin with the keyword FILENAME followed by the name of the file and the file type.

<filename> is the name of a file generated by SUTRA that contains information about simulated values that can be compared with observations. If the file name contains whitespace characters, the file name must be surrounded by double quotes.

<file type> is a keyword indicating the type of SUTRA output file that will be read. <file type> must be one of the following: OBC, LKST, BCOP, BCOF, BCOU, BCOPG, or BCOUG. For more information about these file types, see the documentation for SUTRA version 2.2.
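When the input file is generated by a script, the rules above (an allowed file type on every FILENAME line, and double quotes around names that contain whitespace) can be applied automatically. The following Python sketch shows one way to do that. The helper names quote and observation_files_block are hypothetical, and the sketch does not handle file names that themselves contain a double quote.

```python
# File types listed in this topic.
FILE_TYPES = {"OBC", "LKST", "BCOP", "BCOF", "BCOU", "BCOPG", "BCOUG"}


def quote(name):
    """Surround a file name with double quotes if it contains whitespace."""
    return f'"{name}"' if any(ch.isspace() for ch in name) else name


def observation_files_block(files):
    """Build the text of an OBSERVATION_FILES block.

    files is a sequence of (filename, file_type) pairs.
    """
    lines = ["BEGIN OBSERVATION_FILES"]
    for filename, file_type in files:
        kind = file_type.upper()  # keywords are case insensitive
        if kind not in FILE_TYPES:
            raise ValueError(f"unsupported file type: {file_type}")
        lines.append(f"  FILENAME {quote(filename)} {kind}")
    lines.append("END OBSERVATION_FILES")
    return "\n".join(lines)
```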

Example:

BEGIN OBSERVATION_FILES

     FILENAME SutraLake2_Object9.obc OBC

     FILENAME SutraLake2_Object16.obc OBC

     # This is a comment.

     # Note that more than one OBC file is included in the OBSERVATION_FILES section.

     # OBC is the only file type supported by SutraObsExtractor for which SUTRA will create

     # more than one file for the same model.

     FILENAME SutraLake2.lkst LKST

     FILENAME SutraLake2.bcop BCOP

     FILENAME SutraLake2.bcof BCOF

     FILENAME SutraLake2.bcou BCOU

     FILENAME SutraLake2.bcopg BCOPG

     FILENAME SutraLake2.bcoug BCOUG

END OBSERVATION_FILES

IDENTIFIERS Block:

Purpose:

The IDENTIFIERS section is used to identify simulated values to be extracted from the SUTRA output files corresponding to user-specified times. These values may either represent values that should be directly compared to observed values or they may be combined with other extracted values using the methods available in the DERIVED_OBSERVATIONS section.

Structure:

BEGIN IDENTIFIERS

ID <identifier> <observation_type> [<secondary_identifier>]

OBSNAME <Observation_name>  <observation_time>  [PRINT]

OBSNAME <Observation_name>  <observation_time>  [PRINT]

...

OBSNAME <Observation_name>  <observation_time>  [PRINT]

ID <identifier> <observation_type> [<secondary_identifier>]

OBSNAME <Observation_name>  <observation_time>  [PRINT]

OBSNAME <Observation_name>  <observation_time>  [PRINT]

...

OBSNAME <Observation_name>  <observation_time>  [PRINT]

END IDENTIFIERS

Explanations:

ID is a keyword used to indicate that the following values on the line will be used to identify a particular time series from which values are to be extracted.

<identifier> is a value in the output file that identifies a particular time series. The nature of <identifier> varies depending on the type of file from which the simulated value is to be extracted.

For OBC files, <identifier> is the "Name" listed in the OBC file.

For LKST files, <identifier> is the node number listed in the LKST file.

For all other files, <identifier> is the sequence number in which the values appear in the file for each time step for which values are recorded.

<observation_type> is used to identify the type of data to be extracted. The allowed values of <observation_type> depend on the type of file from which values are to be extracted:

P: Pressure in an OBC file.

U: Temperature or concentration in an OBC file.

S: Saturation in an OBC file.

LKST: Lake stage in an LKST file.

PF: Resultant source/sink(+/-) of fluid in a BCOP file.

PU: Solute conc/temperature of fluid source/sink in a BCOP file.

PR: Resultant source/sink(+/-) of mass/energy in a BCOP file.

FF: Specified flow rate in a BCOF file.

FU: Solute conc/temperature of fluid source/sink in a BCOF file.

FR: Resultant source/sink(+/-) of mass/energy in a BCOF file.

UR: Resultant source/sink(+/-) of mass/energy in a BCOU file.

PGF: Resultant source/sink(+/-) of fluid in a BCOPG file.

PGU: Solute conc/temperature of fluid source/sink in a BCOPG file.

PGR: Resultant source/sink(+/-) of mass/energy in a BCOPG file.

UGR: Resultant source/sink(+/-) of mass/energy in a BCOUG file.

UGU: Computed conc/temperature in a BCOUG file.

<secondary_identifier> is a second value on a line that helps identify a particular time series. It is required for observations in BCOP, BCOF, BCOU, BCOPG, and BCOUG files. It must not be included for observations in OBC and LKST files. For the file types that require it, <secondary_identifier> must be the node number. Note that the same node number may be repeated more than once for a single time step if the same node is specified more than once for the appropriate boundary condition in the SUTRA input file. However, it is not clear that SUTRA will handle specified pressure, specified flows, or specified concentration or temperature boundary conditions that are specified more than once at the same node. It appears that generalized-flow and generalized-transport boundaries are handled appropriately if specified more than once for the same node in the input file. <secondary_identifier> serves as a check that <identifier> has been specified correctly.

After each ID line, there must be one or more OBSNAME lines. Each such line specifies an observation name and a time at which a simulated value is desired. The value will be from the time series identified in the ID line.

OBSNAME is a keyword indicating that the line will specify an observation name and time.

<Observation_name> is the name of the observation. <Observation_name> must start with a letter or the underscore character. The remaining characters in <Observation_name> must be letters, digits, or the underscore character. All observation names must be unique. SutraObsExtractor does not limit the length of observation names.
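The naming rules above can be checked before SutraObsExtractor is run. The following Python sketch is one possible check. The function name check_names is hypothetical, "letters" is assumed to mean the ASCII letters A-Z and a-z, and names are compared exactly as written; whether SutraObsExtractor treats names that differ only in case as duplicates is not stated here.

```python
import re

# A letter or underscore, followed by any number of letters, digits, or underscores.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def check_names(names):
    """Return a list of (name, problem) pairs for names that break the rules."""
    seen = set()
    problems = []
    for name in names:
        if not _NAME_PATTERN.match(name):
            problems.append((name, "invalid"))
        elif name in seen:
            problems.append((name, "duplicate"))
        seen.add(name)
    return problems
```

An empty list means that every name is well formed and unique. PEST and UCODE impose their own additional requirements on observation names, which this sketch does not check.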

<observation_time> is a real number that indicates the time at which the simulated value is desired. If the specified time is not included in the output file, SutraObsExtractor will interpolate to the time in question from the values recorded for the preceding and following times. If the <observation_time> is before the first recorded time, it will be ignored. If it is after the last recorded time, the value for the last recorded time will be used.
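The three cases described above (a time before the first record, a time within the record, and a time after the last record) can be sketched in Python as follows. The function name value_at is hypothetical, and linear interpolation between the two neighboring times is an assumption; this topic states only that the value is interpolated.

```python
from bisect import bisect_left


def value_at(times, values, t):
    """Return the simulated value at time t, or None if t is ignored.

    times must be in ascending order and values must have the same length.
    Linear interpolation is assumed here.
    """
    if not times or t < times[0]:
        return None  # before the first recorded time: the observation is ignored
    if t >= times[-1]:
        return values[-1]  # at or after the last recorded time: use the last value
    i = bisect_left(times, t)  # first index with times[i] >= t
    if times[i] == t:
        return values[i]  # the time is recorded exactly
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

For example, with recorded times 0, 10, and 20 and values 1, 3, and 7, a request for time 5 falls halfway between the first two records.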

PRINT is an optional keyword. If included, the <Observation_name> and simulated value will be printed to the extracted values file or instructions for reading the <Observation_name> and simulated value will be written to the instruction file. Printing the name and value implies that they will be used directly by PEST or UCODE. If the values are not printed, they may still be used in the DERIVED_OBSERVATIONS section. Regardless of whether PRINT is present or not, the name and simulated value will be written to the listing file. SutraObsExtractor does not limit the length of observation names, but to be used by PEST or UCODE, the observation name must conform to the requirements of those programs.

Example:

BEGIN IDENTIFIERS

     ID Object9 P

          OBSNAME Test1_P  1.000000000000E+006  PRINT

     ID Object9 P

          OBSNAME Test2_P  2.000000000000E+006  PRINT

     ID Object16 U

          OBSNAME ConcOb1_U  1.000000000000E+006  PRINT

     ID Object16 U

          OBSNAME ConcOb2_U  2.000000000000E+006  PRINT

     ID 5435 LKST

          OBSNAME lakeobs  1.600000000000E+008  PRINT

# This is an example of a comment because it starts with "#".

# Note that the simulated values for most of these observations are not printed to

# the values output file but instead are used in calculations in the

# DERIVED_OBSERVATIONS section.    

     ID 367 PF  4027

          OBSNAME PF367_4027_0  6.048000000000E+005 

          OBSNAME PF367_4027_1  5.866600000000E+007 

     ID 367 PR  4027

          OBSNAME PR367_4027_0  6.048000000000E+005 

          OBSNAME PR367_4027_1  5.866600000000E+007 

     ID 387 PF  4247

          OBSNAME PF387_4247_0  6.048000000000E+005 

          OBSNAME PF387_4247_1  5.866600000000E+007 

     ID 387 PR  4247

          OBSNAME PR387_4247_0  6.048000000000E+005 

          OBSNAME PR387_4247_1  5.866600000000E+007 

     ID 388 PF  4258

          OBSNAME PF388_4258_0  6.048000000000E+005 

          OBSNAME PF388_4258_1  5.866600000000E+007 

     ID 388 PR  4258

          OBSNAME PR388_4258_0  6.048000000000E+005 

          OBSNAME PR388_4258_1  5.866600000000E+007 

     ID 1 FF  4731

          OBSNAME FF1_4731_0  1.191500000000E+008 

     ID 1 FR  4731

          OBSNAME FR1_4731_0  1.191500000000E+008 

     ID 1 UR  4552

          OBSNAME UR1_4552_0  1.191500000000E+008 

          OBSNAME UR1_4552_1  2.395000000000E+008 

     ID 2 UR  4553

          OBSNAME UR2_4553_0  1.191500000000E+008 

          OBSNAME UR2_4553_1  2.395000000000E+008 

     ID 3 UR  4554

          OBSNAME UR3_4554_0  1.191500000000E+008 

          OBSNAME UR3_4554_1  2.395000000000E+008 

     ID 1 PGF  5688

          OBSNAME PGF1_0_5688  5.866600000000E+007 

          OBSNAME PGF1_1_5688  5.866600000000E+007 

     ID 1 PGR  5688

          OBSNAME PGR1_1_5688  5.866600000000E+007 

          OBSNAME PGR1_2_5688  5.866600000000E+007 

     ID 2 PGF  5689

          OBSNAME PGF2_0_5689  5.866600000000E+007 

          OBSNAME PGF2_1_5689  5.866600000000E+007 

     ID 2 PGR  5689

          OBSNAME PGR2_1_5689  5.866600000000E+007 

          OBSNAME PGR2_2_5689  5.866600000000E+007 

# Note that nodes 6161 and 6162 are each identified in two separate lines.

# This is because two generalized flow boundaries were defined for those nodes.

     ID 3 PGF  6161

          OBSNAME PGF3_0_6161  5.866600000000E+007 

          OBSNAME PGF3_1_6161  5.866600000000E+007 

     ID 3 PGR  6161

          OBSNAME PGR3_1_6161  5.866600000000E+007 

          OBSNAME PGR3_2_6161  5.866600000000E+007 

     ID 4 PGF  6162

          OBSNAME PGF4_0_6162  5.866600000000E+007 

          OBSNAME PGF4_1_6162  5.866600000000E+007 

     ID 4 PGR  6162

          OBSNAME PGR4_1_6162  5.866600000000E+007 

          OBSNAME PGR4_2_6162  5.866600000000E+007 

     ID 5 PGF  6161

          OBSNAME PGF5_0_6161  5.866600000000E+007 

          OBSNAME PGF5_1_6161  5.866600000000E+007 

     ID 5 PGR  6161

          OBSNAME PGR5_1_6161  5.866600000000E+007 

          OBSNAME PGR5_2_6161  5.866600000000E+007 

     ID 6 PGF  6162

          OBSNAME PGF6_0_6162  5.866600000000E+007 

          OBSNAME PGF6_1_6162  5.866600000000E+007 

     ID 6 PGR  6162

          OBSNAME PGR6_1_6162  5.866600000000E+007 

          OBSNAME PGR6_2_6162  5.866600000000E+007 

     ID 1 UGR  3125

          OBSNAME UGR1_0_3125  5.866600000000E+007 

     ID 2 UGR  3126

          OBSNAME UGR2_0_3126  5.866600000000E+007 

END IDENTIFIERS

DERIVED_OBSERVATIONS Block:

Purpose:

The DERIVED_OBSERVATIONS section is used to define how to combine multiple values extracted from the SUTRA output files to generate values that can be compared with observed values.

Structure:

BEGIN DERIVED_OBSERVATIONS

OBSNAME <Observation_name> [PRINT]

FORMULA <formula>

OBSNAME <Observation_name> [PRINT]

FORMULA <formula>

...

OBSNAME <Observation_name> [PRINT]

FORMULA <formula>

END DERIVED_OBSERVATIONS

Explanations:

OBSNAME is a keyword indicating that the line will specify an observation name.

<Observation_name> is the name of the observation. See the description under IDENTIFIERS.

PRINT is an optional keyword. If included, the <Observation_name> and simulated value will be printed to the extracted values file or instructions for reading the <Observation_name> and simulated value will be written to the instruction file. See the description under IDENTIFIERS.

FORMULA is a keyword indicating that the remainder of the line is a mathematical formula that evaluates to a real number. The result of the formula is the value assigned to <Observation_name>. Variables in the formula can be any of the observation names defined in the IDENTIFIERS section or any observation names defined previously in the DERIVED_OBSERVATIONS section.

<formula> is a mathematical formula that evaluates to a real number. The result of the formula is the value assigned to <Observation_name>. Variables in the formula can be any of the observation names defined in the IDENTIFIERS section or any <Observation_name> defined previously in the DERIVED_OBSERVATIONS section. The functions and operators available for use in formulas are the same as in EnhancedTemplateProcessor.
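The rule that a formula may refer only to names from the IDENTIFIERS section or to derived observations defined earlier can be illustrated with the following Python sketch. It is not the evaluator used by SutraObsExtractor. The function names evaluate and derive are hypothetical, and the sketch supports only numbers, names, parentheses, and the operators +, -, *, and /, parsed with Python's own expression grammar. The full set of functions and operators is the one provided by EnhancedTemplateProcessor.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(formula, known):
    """Evaluate an arithmetic formula using the values in the dict known."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return float(node.value)
        if isinstance(node, ast.Name):
            if node.id not in known:
                raise KeyError(f"undefined observation name: {node.id}")
            return known[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported element in formula")
    return walk(ast.parse(formula.strip(), mode="eval"))


def derive(extracted, derived):
    """Evaluate (name, formula) pairs in order; later ones may use earlier ones."""
    known = dict(extracted)  # values from the IDENTIFIERS section
    results = {}
    for name, formula in derived:
        if name in known:
            raise ValueError(f"observation name is not unique: {name}")
        known[name] = results[name] = evaluate(formula, known)
    return results
```

Because the pairs are processed in order, a formula that refers to a derived observation defined later in the list fails with an undefined-name error, which mirrors the "defined previously" requirement.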

Example:

In the following example, some formulas may be printed on multiple lines because there is not enough space on a page to print them on a single line. In the actual file, however, they would each be on a single line.

BEGIN DERIVED_OBSERVATIONS

# The observation named "a" is assigned the sum of the flow rates 

# at three specified pressure nodes.

     OBSNAME a PRINT

          FORMULA PF367_4027_0 + PF387_4247_0 + PF388_4258_0

# The observation named "b" is assigned the resultant source/sink(+/-) of mass in three 

# specified pressure nodes divided by the sum of the flow rates at those nodes. Assuming fluid 

# leaves the system through all these specified pressure nodes, the result is the concentration 

# of the combined flow through those nodes.   

     OBSNAME b PRINT

          FORMULA (PR367_4027_1 + PR387_4247_1 + PR388_4258_1)/(PF367_4027_1 + PF387_4247_1 + PF388_4258_1)

# The formula used for the observation named "Wel1" gives the concentration of solute leaving

# through the well (assuming the specified flow rate is negative). Because only one node is

# involved, an alternative would be to use a concentration (FU) observation in the

# IDENTIFIERS section.

     OBSNAME Wel1 PRINT

       FORMULA (FR1_4731_0)/(FF1_4731_0)

# The formula for "Wel2" simply retrieves the value from an observation defined in the 

# IDENTIFIERS section.

     OBSNAME Wel2 PRINT

          FORMULA FR1_4731_0

# The formulas for "SpecConc1" and "SpecConc2" sum the resultant solute mass flux at three 

# specified concentration nodes.

     OBSNAME SpecConc1 PRINT

          FORMULA UR1_4552_0 + UR2_4553_0 + UR3_4554_0

     OBSNAME SpecConc2 PRINT

          FORMULA UR1_4552_1 + UR2_4553_1 + UR3_4554_1

# The formula for "GenFlow1" sums the flow rates at several generalized-flow boundaries. 

# However, two of the flow rates are multiplied by 0.6. If these generalized-flow boundaries 

# represent a river but the observed value represents only 60% of the flow into or out of the river 

# at these nodes, the 0.6 factor could be used to ensure that the simulated value more closely 

# represented what was observed.

     OBSNAME GenFlow1 PRINT

          FORMULA PGF1_0_5688 + PGF2_0_5689 + PGF3_0_6161 + PGF4_0_6162 + 0.6*PGF5_0_6161 + 0.6*PGF6_0_6162

# For "GenFlow2", the formula calculates a concentration by dividing the weighted sum of the 

# resultant mass flows by the weighted sum of the fluid flows.  

     OBSNAME GenFlow2 PRINT

          FORMULA (PGR1_1_5688 + PGR2_1_5689 + PGR3_1_6161 + PGR4_1_6162 + 0.6*PGR5_1_6161 + 0.6*PGR6_1_6162)/(PGF1_1_5688 + PGF2_1_5689 + PGF3_1_6161 + PGF4_1_6162 + 0.6*PGF5_1_6161 + 0.6*PGF6_1_6162)

# "GenFlow3" represents the sum of the resultant flow through several generalized-flow 

# boundaries.    

     OBSNAME GenFlow3 PRINT

          FORMULA PGR1_2_5688 + PGR2_2_5689 + PGR3_2_6161 + PGR4_2_6162 + 0.6*PGR5_2_6161 + 0.6*PGR6_2_6162

# "GenTrans1" represents the sum of the resultant mass flows through two generalized-transport 

# boundaries.

     OBSNAME GenTrans1 PRINT

          FORMULA UGR1_0_3125 + UGR2_0_3126

# "test3" represents the difference in pressure between two pressure observations.     

     OBSNAME test3 PRINT

          FORMULA Test1_P - Test2_P

# "c" represents a difference between two previously defined derived observations. "a" is a

# sum of fluid flow rates and "b" is a concentration, so the difference doesn't make any sense

# for a real model. The modeler is responsible for ensuring that the formulas result in meaningful values.

     OBSNAME c PRINT

          FORMULA a - b

# This is another case where the values being compared have different units.     

     OBSNAME WelComp PRINT

          FORMULA Wel1 - Wel2

# "DeltaSpecConc" represents the change in resultant mass flux at the specified concentration

# nodes between two different times.

     OBSNAME DeltaSpecConc PRINT

          FORMULA SpecConc1 - SpecConc2

# "Comp" computes the difference between a stage observation and the pressure at an 

# observation location. Assuming that SUTRA is set up so that head rather than pressure is 

# calculated, this value could be compared with an observed head gradient multiplied by the 

# distance between the lake and the observation location.    

     OBSNAME Comp PRINT

          FORMULA lakeobs - Test1_P

END DERIVED_OBSERVATIONS