SutraObsExtractor

SutraObsExtractor is a program for extracting simulated values from SUTRA output files at particular locations and times and printing them in a simple format. It can also create an instruction file for either PEST or UCODE. Together, these two functions can simplify the use of PEST or UCODE with SUTRA models.

The documentation for SUTRA versions 2 and 3 identifies several different types of output files. SutraObsExtractor can extract simulated values from the following file types:

‘OBC’ = “.obc” output file (observations)

‘BCOF’ = “.bcof” output file (specifications and results at fluid-source/sink nodes)

‘BCOP’ = “.bcop” output file (specifications and results at specified-pressure nodes)

‘BCOU’ = “.bcou” output file (specifications and results at specified-concentration/temperature nodes)

‘BCOPG’ = “.bcopg” output file (specifications and results at generalized-flow nodes)

‘BCOUG’ = “.bcoug” output file (specifications and results at generalized-transport nodes)

‘LKST’ = “.lkst” output file (lake stages)

SutraObsExtractor is run from the command line. The name of an input file must be supplied on the command line. There are three ways to supply the name of the input file.

SutraObsExtractor -f <filename>

SutraObsExtractor --file <filename>

SutraObsExtractor <filename>

where "<filename>" is the name of the file. Any file names that include whitespace must be enclosed in single or double quotation marks. Single or double quotation marks around other file names are optional.
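The three invocation forms above can also be driven from a script. The following is a minimal sketch, assuming the executable is named "SutraObsExtractor" and can be found on the PATH; the helper names build_command and run_extractor are hypothetical and are not part of SutraObsExtractor.

```python
import subprocess


def build_command(input_file, exe="SutraObsExtractor", style="--file"):
    """Return the argument list for one of the three documented forms.

    style is "-f", "--file", or None (bare file name).
    The default executable name is an assumption; adjust it to your installation.
    """
    if style not in ("-f", "--file", None):
        raise ValueError("style must be '-f', '--file', or None")
    args = [exe]
    if style is not None:
        args.append(style)
    # Passing a list instead of a shell string means that file names
    # containing whitespace do not need to be quoted by the caller.
    args.append(input_file)
    return args


def run_extractor(input_file, **kwargs):
    """Run the program; raises CalledProcessError on a nonzero exit code."""
    return subprocess.run(build_command(input_file, **kwargs), check=True)
```

Because the arguments are passed as a list, the quotation marks described above are only needed when the command is typed into a shell.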

The input file must contain several blocks. Each block begins with "BEGIN" followed by a keyword and ends with "END" followed by the same keyword. Keywords are case insensitive; in these instructions, however, they are always written in upper case. The keywords that identify blocks are "OPTIONS", "OBSERVATION_FILES", "IDENTIFIERS", and "DERIVED_OBSERVATIONS". Any line that is empty or contains only whitespace characters is ignored. Any line whose first non-whitespace character is "#" is treated as a comment. Whitespace characters at the beginning of a line are ignored.
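The block rules above can be illustrated with a short reader. This is only a sketch of the rules as stated here, not the parser used by SutraObsExtractor; the function name split_blocks and its error messages are hypothetical.

```python
def split_blocks(text):
    """Group the content lines of an input file by block keyword.

    Returns a dict that maps each upper-cased block keyword to the list of
    its content lines, with surrounding whitespace removed.
    """
    blocks = {}
    current = None  # keyword of the block being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        words = line.split()
        head = words[0].upper()  # keywords are case insensitive
        if head == "BEGIN":
            if current is not None or len(words) != 2:
                raise ValueError(f"unexpected line: {line!r}")
            current = words[1].upper()
            blocks.setdefault(current, [])
        elif head == "END":
            if current is None or len(words) != 2 or words[1].upper() != current:
                raise ValueError(f"unmatched END line: {line!r}")
            current = None
        elif current is None:
            raise ValueError(f"line outside any block: {line!r}")
        else:
            blocks[current].append(line)
    if current is not None:
        raise ValueError(f"block {current} is never closed")
    return blocks
```

Each block's lines can then be interpreted according to the rules for that block.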

OPTIONS Block:

Purpose:

The OPTIONS block is used for specifying the names of output files from SutraObsExtractor.

Structure:

BEGIN OPTIONS

[LISTING <filename>]

[INSTRUCTION <filename> [<instruction_file_type>]]

[VALUES <filename>]

END OPTIONS

Explanations:

Each non-blank and non-comment line in the OPTIONS section must begin with one of the following keywords: "LISTING", "VALUES", or "INSTRUCTION". Each of these keywords must be followed by a file name. The INSTRUCTION file name may be optionally followed by either "UCODE" or "PEST".

The LISTING file is optional. If specified, it contains a record of the steps taken during execution of SutraObsExtractor. It will end either with a line indicating that it terminated normally or with an error message. The LISTING file is useful for identifying errors in the input.

At least one of the VALUES and INSTRUCTION files is required. Typically, only one or the other is specified, but both may be specified in the same input file.

The VALUES file will contain the simulated values extracted from the SUTRA output file or files. Each line in the VALUES file will contain an observation name followed by the simulated value associated with that name.

The INSTRUCTION file contains instructions for either UCODE or PEST to extract simulated values from the VALUES file. The desired format may be specified with <instruction_file_type>.

<filename> is the name of a file. If the file name contains whitespace characters, the file name must be surrounded by double quotes.

<instruction_file_type> must be either "UCODE" or "PEST". It indicates whether the instruction file is designed to be used with the UCODE or PEST parameter estimation programs. If <instruction_file_type> is not specified, it defaults to PEST.

Examples:

BEGIN OPTIONS

LISTING C:\ModelingTools\SutraObsExtractor\tests\SutraLake2.soeOut

INSTRUCTION C:\ModelingTools\SutraObsExtractor\tests\SutraLake2.soeIns.txt

END OPTIONS

BEGIN OPTIONS

LISTING SutraLake2.soeList

VALUES SutraLake2.soeValues

END OPTIONS
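Because each line of the VALUES file holds an observation name followed by its simulated value, the file is easy to read back in a script. The sketch below assumes that the name and value are separated by whitespace, which is not stated explicitly in this documentation; the function name read_values is hypothetical.

```python
def read_values(text):
    """Parse the text of a VALUES file into {observation_name: value}."""
    values = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        # Assumption: exactly two whitespace-separated fields per line.
        name, value = line.split()
        values[name] = float(value)
    return values
```

A line that does not split into exactly two fields raises a ValueError, which makes a mismatch with the assumed layout obvious.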

OBSERVATION_FILES Block:

Purpose:

The OBSERVATION_FILES section is used to identify the files from which simulated values will be extracted.

Structure:

BEGIN OBSERVATION_FILES

FILENAME <filename> <file type>

[FILENAME <filename> <file type>]

...

[FILENAME <filename> <file type>]

END OBSERVATION_FILES

Explanations:

Each non-blank, non-comment line in the OBSERVATION_FILES block must begin with the keyword FILENAME followed by the name of the file and the file type.

<filename> is the name of a file generated by SUTRA that contains information about simulated values that can be compared with observations. If the file name contains whitespace characters, the file name must be surrounded by double quotes.

<file type> is a keyword indicating the type of SUTRA output file that will be read. <file type> must be one of the following: OBC, LKST, BCOP, BCOF, BCOU, BCOPG, or BCOUG. For more information about these file types, see the documentation for SUTRA version 2.2.

Example:

BEGIN OBSERVATION_FILES

FILENAME SutraLake2_Object9.obc OBC

FILENAME SutraLake2_Object16.obc OBC

# this is a comment.

# Note that more than one OBC file is included in the OBSERVATION_FILES section.

# OBC is the only file type supported by SutraObsExtractor for which SUTRA will create

# more than one file for the same model.

FILENAME SutraLake2.lkst LKST

FILENAME SutraLake2.bcop BCOP

FILENAME SutraLake2.bcof BCOF

FILENAME SutraLake2.bcou BCOU

FILENAME SutraLake2.bcopg BCOPG

FILENAME SutraLake2.bcoug BCOUG

END OBSERVATION_FILES
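An OBSERVATION_FILES block like the one above can be generated from a list of file names. The following sketch assumes that the SUTRA output files carry the extensions given in the file-type list near the beginning of this document; the function name observation_files_block is hypothetical.

```python
import os

# Extension-to-keyword mapping taken from the list of supported file types.
FILE_TYPES = {
    ".obc": "OBC", ".bcof": "BCOF", ".bcop": "BCOP", ".bcou": "BCOU",
    ".bcopg": "BCOPG", ".bcoug": "BCOUG", ".lkst": "LKST",
}


def observation_files_block(filenames):
    """Return the text of an OBSERVATION_FILES block for the given files."""
    lines = ["BEGIN OBSERVATION_FILES"]
    for name in filenames:
        ext = os.path.splitext(name)[1].lower()
        if ext not in FILE_TYPES:
            raise ValueError(f"unrecognized SUTRA output file: {name!r}")
        # File names containing whitespace must be enclosed in double quotes.
        quoted = f'"{name}"' if any(ch.isspace() for ch in name) else name
        lines.append(f"FILENAME {quoted} {FILE_TYPES[ext]}")
    lines.append("END OBSERVATION_FILES")
    return "\n".join(lines) + "\n"
```

If the output files of a model use other extensions, the file type must be supplied explicitly instead of being inferred.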

IDENTIFIERS Block:

Purpose:

The IDENTIFIERS section is used to identify simulated values to be extracted from the SUTRA output files corresponding to user-specified times. These values may either represent values that should be directly compared to observed values or they may be combined with other extracted values using the methods available in the DERIVED_OBSERVATIONS section.

Structure:

BEGIN IDENTIFIERS

ID <identifier> <observation_type> [<secondary_identifier>]

OBSNAME <Observation_name> <observation_time> [PRINT]

...

OBSNAME <Observation_name> <observation_time> [PRINT]

ID <identifier> <observation_type> [<secondary_identifier>]

OBSNAME <Observation_name> <observation_time> [PRINT]

...

OBSNAME <Observation_name> <observation_time> [PRINT]

END IDENTIFIERS

Explanations:

ID is a keyword used to indicate that the following values on the line will be used to identify a particular time series from which values are to be extracted.

<identifier> is a value in the output file that identifies a particular time series. The nature of <identifier> varies depending on the type of file from which the simulated value is to be extracted.

For OBC files, <identifier> is the "Name" listed in the OBC file.

For LKST files, <identifier> is the node number listed in the LKST file.

For all other files, <identifier> is the sequence number of the entry, that is, its position in the order in which values are listed for each recorded time step.

<observation_type> is used to identify the type of data to be extracted. The allowed values of <observation_type> depend on the type of file from which values are to be extracted:

P: Pressure in an OBC file.

U: Temperature or concentration in an OBC file.

S: Saturation in an OBC file.

LKST: Lake stage in an LKST file.

PF: Resultant source/sink(+/-) of fluid in a BCOP file.

PU: Solute conc/temperature of fluid source/sink in a BCOP file.

PR: Resultant source/sink(+/-) of mass/energy in a BCOP file.

FF: Specified flow rate in a BCOF file.

FU: Solute conc/temperature of fluid source/sink in a BCOF file.

FR: Resultant source/sink(+/-) of mass/energy in a BCOF file.

UR: Resultant source/sink(+/-) of mass/energy in a BCOU file.

PGF: Resultant source/sink(+/-) of fluid in a BCOPG file.

PGU: Solute conc/temperature of fluid source/sink in a BCOPG file.

PGR: Resultant source/sink(+/-) of mass/energy in a BCOPG file.

UGR: Resultant source/sink(+/-) of mass/energy in a BCOUG file.

UGU: Computed conc/temperature in a BCOUG file.

<secondary_identifier> is a second value on a line that helps identify a particular time series. It is required for observations in BCOP, BCOF, BCOU, BCOPG, and BCOUG files and must not be included for observations in OBC and LKST files. For the file types that require it, <secondary_identifier> must be the node number; it serves as a check that <identifier> has been specified correctly. Note that the same node number may appear more than once for a single time step if the same node is specified more than once for the appropriate boundary condition in the SUTRA input file. However, it is not clear whether SUTRA correctly handles specified-pressure, specified-flow, or specified-concentration or temperature boundary conditions that are specified more than once at the same node. Generalized-flow and generalized-transport boundaries appear to be handled appropriately if specified more than once for the same node in the input file.
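The relationship between <observation_type>, the file type, and the need for <secondary_identifier> can be summarized in a lookup table. The sketch below checks a single ID line against that table. It assumes that <identifier> contains no whitespace and that <observation_type> is case insensitive like the keywords; neither is stated in this documentation. The name check_id_line is hypothetical.

```python
# Observation type -> type of SUTRA output file, from the list above.
OBS_TYPE_FILE = {
    "P": "OBC", "U": "OBC", "S": "OBC",
    "LKST": "LKST",
    "PF": "BCOP", "PU": "BCOP", "PR": "BCOP",
    "FF": "BCOF", "FU": "BCOF", "FR": "BCOF",
    "UR": "BCOU",
    "PGF": "BCOPG", "PGU": "BCOPG", "PGR": "BCOPG",
    "UGR": "BCOUG", "UGU": "BCOUG",
}

# File types for which <secondary_identifier> must not be given.
NO_SECONDARY = {"OBC", "LKST"}


def check_id_line(line):
    """Validate one ID line and return the file type it refers to."""
    words = line.split()
    if len(words) not in (3, 4) or words[0].upper() != "ID":
        raise ValueError(f"not a valid ID line: {line!r}")
    obs_type = words[2].upper()  # assumption: case insensitive
    if obs_type not in OBS_TYPE_FILE:
        raise ValueError(f"unknown observation type: {words[2]!r}")
    file_type = OBS_TYPE_FILE[obs_type]
    has_secondary = len(words) == 4
    if file_type in NO_SECONDARY and has_secondary:
        raise ValueError(f"{obs_type} does not take a secondary identifier")
    if file_type not in NO_SECONDARY and not has_secondary:
        raise ValueError(f"{obs_type} requires a node number")
    return file_type
```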

After each ID line, there must be one or more OBSNAME lines. Each such line specifies an observation name and a time at which a simulated value is desired. The value will be from the time series identified in the ID line.

OBSNAME is a keyword indicating that the line will specify an observation name and time.

<Observation_name> is the name of the observation. <Observation_name> must start with a letter or the underscore character. The remaining characters in <Observation_name> must be letters, digits, or the underscore character. All observation names must be unique. SutraObsExtractor does not limit the length of observation names.
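The naming rule can be expressed as a regular expression. This sketch assumes that "letters" means the ASCII letters A to Z in either case, and it compares names exactly when looking for duplicates; whether SutraObsExtractor treats names that differ only in case as distinct is not stated here. The function names are hypothetical.

```python
import re

# A letter or underscore, followed by letters, digits, or underscores.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_obsname(name):
    """Return True if name satisfies the rule for observation names."""
    return _NAME.fullmatch(name) is not None


def find_duplicates(names):
    """Return the names that occur more than once, in order of first repeat."""
    seen = set()
    duplicates = []
    for name in names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates
```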

<observation_time> is a real number that indicates the time at which the simulated value is desired. If the specified time is not included in the output file, SutraObsExtractor will interpolate to the time in question from the values recorded for the preceding and following times. If the <observation_time> is before the first recorded time, it will be ignored. If it is after the last recorded time, the value for the last recorded time will be used.
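The treatment of <observation_time> can be sketched as follows. The documentation does not say which interpolation method is used; linear interpolation between the two bracketing times is assumed here. The function name value_at is hypothetical, and returning None stands in for "the observation is ignored".

```python
from bisect import bisect_left


def value_at(times, values, t):
    """Return the simulated value at time t from one time series.

    times must be sorted in ascending order and match values in length.
    """
    if not times or len(times) != len(values):
        raise ValueError("times and values must be non-empty and equal in length")
    if t < times[0]:
        return None  # before the first recorded time: ignored
    if t >= times[-1]:
        return values[-1]  # at or after the last recorded time
    i = bisect_left(times, t)  # first index with times[i] >= t
    if times[i] == t:
        return values[i]  # the time is recorded in the file
    # Assumption: linear interpolation between the neighboring times.
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```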

PRINT is an optional keyword. If included, the <Observation_name> and simulated value will be printed to the extracted values file or instructions for reading the <Observation_name> and simulated value will be written to the instruction file. Printing the name and value implies that they will be used directly by PEST or UCODE. If the values are not printed, they may still be used in the DERIVED_OBSERVATIONS section. Regardless of whether PRINT is present or not, the name and simulated value will be written to the listing file. SutraObsExtractor does not limit the length of observation names, but to be used by PEST or UCODE, the observation name must conform to the requirements of those programs.

Example:

BEGIN IDENTIFIERS

ID Object9 P

OBSNAME Test1_P 1.000000000000E+006 PRINT

ID Object9 P

OBSNAME Test2_P 2.000000000000E+006 PRINT

ID Object16 U

OBSNAME ConcOb1_U 1.000000000000E+006 PRINT

ID Object16 U

OBSNAME ConcOb2_U 2.000000000000E+006 PRINT

ID 5435 LKST

OBSNAME lakeobs 1.600000000000E+008 PRINT

# This is an example of a comment because it starts with "#".

# Note that the simulated values for most of these observations are not printed to

# the values output file but instead are used in calculations in the

# DERIVED_OBSERVATIONS section.

ID 367 PF 4027

OBSNAME PF367_4027_0 6.048000000000E+005

OBSNAME PF367_4027_1 5.866600000000E+007

ID 367 PR 4027

OBSNAME PR367_4027_0 6.048000000000E+005

OBSNAME PR367_4027_1 5.866600000000E+007

ID 387 PF 4247

OBSNAME PF387_4247_0 6.048000000000E+005

OBSNAME PF387_4247_1 5.866600000000E+007

ID 387 PR 4247

OBSNAME PR387_4247_0 6.048000000000E+005

OBSNAME PR387_4247_1 5.866600000000E+007

ID 388 PF 4258

OBSNAME PF388_4258_0 6.048000000000E+005

OBSNAME PF388_4258_1 5.866600000000E+007

ID 388 PR 4258

OBSNAME PR388_4258_0 6.048000000000E+005

OBSNAME PR388_4258_1 5.866600000000E+007

ID 1 FF 4731

OBSNAME FF1_4731_0 1.191500000000E+008

ID 1 FR 4731

OBSNAME FR1_4731_0 1.191500000000E+008

ID 1 UR 4552

OBSNAME UR1_4552_0 1.191500000000E+008

OBSNAME UR1_4552_1 2.395000000000E+008

ID 2 UR 4553

OBSNAME UR2_4553_0 1.191500000000E+008

OBSNAME UR2_4553_1 2.395000000000E+008

ID 3 UR 4554

OBSNAME UR3_4554_0 1.191500000000E+008

OBSNAME UR3_4554_1 2.395000000000E+008

ID 1 PGF 5688

OBSNAME PGF1_0_5688 5.866600000000E+007

OBSNAME PGF1_1_5688 5.866600000000E+007

ID 1 PGR 5688

OBSNAME PGR1_1_5688 5.866600000000E+007

OBSNAME PGR1_2_5688 5.866600000000E+007

ID 2 PGF 5689

OBSNAME PGF2_0_5689 5.866600000000E+007

OBSNAME PGF2_1_5689 5.866600000000E+007

ID 2 PGR 5689

OBSNAME PGR2_1_5689 5.866600000000E+007

OBSNAME PGR2_2_5689 5.866600000000E+007

# Note that nodes 6161 and 6162 are each identified in two separate lines.

# This is because two generalized flow boundaries were defined for those nodes.

ID 3 PGF 6161

OBSNAME PGF3_0_6161 5.866600000000E+007

OBSNAME PGF3_1_6161 5.866600000000E+007

ID 3 PGR 6161

OBSNAME PGR3_1_6161 5.866600000000E+007

OBSNAME PGR3_2_6161 5.866600000000E+007

ID 4 PGF 6162

OBSNAME PGF4_0_6162 5.866600000000E+007

OBSNAME PGF4_1_6162 5.866600000000E+007

ID 4 PGR 6162

OBSNAME PGR4_1_6162 5.866600000000E+007

OBSNAME PGR4_2_6162 5.866600000000E+007

ID 5 PGF 6161

OBSNAME PGF5_0_6161 5.866600000000E+007

OBSNAME PGF5_1_6161 5.866600000000E+007

ID 5 PGR 6161

OBSNAME PGR5_1_6161 5.866600000000E+007

OBSNAME PGR5_2_6161 5.866600000000E+007

ID 6 PGF 6162

OBSNAME PGF6_0_6162 5.866600000000E+007

OBSNAME PGF6_1_6162 5.866600000000E+007

ID 6 PGR 6162

OBSNAME PGR6_1_6162 5.866600000000E+007

OBSNAME PGR6_2_6162 5.866600000000E+007

ID 1 UGR 3125

OBSNAME UGR1_0_3125 5.866600000000E+007

ID 2 UGR 3126

OBSNAME UGR2_0_3126 5.866600000000E+007

END IDENTIFIERS

DERIVED_OBSERVATIONS Block:

Purpose:

The DERIVED_OBSERVATIONS section is used to define how to combine multiple values extracted from the SUTRA output files to generate values that can be compared with observed values.

Structure:

BEGIN DERIVED_OBSERVATIONS

OBSNAME <Observation_name> [PRINT]

FORMULA <formula>

OBSNAME <Observation_name> [PRINT]

FORMULA <formula>

...

OBSNAME <Observation_name> [PRINT]

FORMULA <formula>

END DERIVED_OBSERVATIONS

Explanations:

OBSNAME is a keyword indicating that the line will specify an observation name.

<Observation_name> is the name of the observation. See the description under IDENTIFIERS.

PRINT is an optional keyword. If included, the <Observation_name> and simulated value will be printed to the extracted values file or instructions for reading the <Observation_name> and simulated value will be written to the instruction file. See the description under IDENTIFIERS.

FORMULA is a keyword indicating that the remainder of the line is a formula.

<formula> is a mathematical formula that evaluates to a real number. The result of the formula is the value assigned to <Observation_name>. Variables in the formula can be any of the observation names defined in the IDENTIFIERS section or any <Observation_name> defined previously in the DERIVED_OBSERVATIONS section. The functions and operators available for use in formulas are the same as in EnhancedTemplateProcessor.
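The rule that a formula may refer only to names that are already defined can be illustrated with a small evaluator. This sketch handles only numbers, names, parentheses, and the operators +, -, *, and /; it does not reproduce the functions and operators of EnhancedTemplateProcessor. The function names evaluate and derive are hypothetical.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(formula, known):
    """Evaluate an arithmetic formula using the values in known."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return float(node.value)
        if isinstance(node, ast.Name):
            if node.id not in known:
                raise KeyError(f"undefined observation name: {node.id}")
            return known[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported syntax in formula: {formula!r}")
    return walk(ast.parse(formula.strip(), mode="eval"))


def derive(extracted, derived):
    """Evaluate (name, formula) pairs in file order.

    extracted holds the values from the IDENTIFIERS section. Each derived
    value becomes available to the formulas that follow it.
    """
    known = dict(extracted)
    for name, formula in derived:
        if name in known:
            raise ValueError(f"observation name is not unique: {name}")
        known[name] = evaluate(formula, known)
    return known
```

Because the pairs are processed in order, a formula that refers to a derived observation defined later in the block fails with a KeyError.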

Example:

In the following example, some formulas may be printed on multiple lines because there is not enough space on a page to print them on a single line. In the actual file, however, they would each be on a single line.

BEGIN DERIVED_OBSERVATIONS

# The observation named "a" is assigned the sum of the flow rates

# at three specified pressure nodes.

OBSNAME a PRINT

FORMULA PF367_4027_0 + PF387_4247_0 + PF388_4258_0

# The observation named "b" is assigned the summed resultant source/sink(+/-) of mass at three

# specified pressure nodes divided by the sum of the flow rates at those nodes. Assuming fluid

# leaves the system through all these specified pressure nodes, the result is the concentration

# of the combined flow through those nodes.

OBSNAME b PRINT

FORMULA (PR367_4027_1 + PR387_4247_1 + PR388_4258_1)/(PF367_4027_1 + PF387_4247_1 + PF388_4258_1)

# The formula used for the observation named "Wel1" gives the concentration of solute leaving

# through the well (assuming the specified flow rate is negative). Because only one node is

# involved, an alternative would be to use a concentration (FU) observation in the

# IDENTIFIERS section.

OBSNAME Wel1 PRINT

FORMULA (FR1_4731_0)/(FF1_4731_0)

# The formula for "Wel2" simply retrieves the value from an observation defined in the

# IDENTIFIERS section.

OBSNAME Wel2 PRINT

FORMULA FR1_4731_0

# The formulas for "SpecConc1" and "SpecConc2" sum the resultant solute mass flux at three

# specified concentration nodes.

OBSNAME SpecConc1 PRINT

FORMULA UR1_4552_0 + UR2_4553_0 + UR3_4554_0

OBSNAME SpecConc2 PRINT

FORMULA UR1_4552_1 + UR2_4553_1 + UR3_4554_1

# The formula for "GenFlow1" sums the flow rates at several generalized-flow boundaries.

# However, two of the flow rates are multiplied by 0.6. If these generalized-flow boundaries

# represent a river but the observed value represents only 60% of the flow into or out of the river

# at these nodes, the 0.6 factor could be used to ensure that the simulated value more closely

# represented what was observed.

OBSNAME GenFlow1 PRINT

FORMULA PGF1_0_5688 + PGF2_0_5689 + PGF3_0_6161 + PGF4_0_6162 + 0.6*PGF5_0_6161 + 0.6*PGF6_0_6162

# For "GenFlow2", the formula calculates a concentration by dividing the weighted sum of the

# resultant mass flows by the weighted sum of the fluid flows.

OBSNAME GenFlow2 PRINT

FORMULA (PGR1_1_5688 + PGR2_1_5689 + PGR3_1_6161 + PGR4_1_6162 + 0.6*PGR5_1_6161 + 0.6*PGR6_1_6162)/(PGF1_1_5688 + PGF2_1_5689 + PGF3_1_6161 + PGF4_1_6162 + 0.6*PGF5_1_6161 + 0.6*PGF6_1_6162)

# "GenFlow3" represents the weighted sum of the resultant mass flows through several

# generalized-flow boundaries.

OBSNAME GenFlow3 PRINT

FORMULA PGR1_2_5688 + PGR2_2_5689 + PGR3_2_6161 + PGR4_2_6162 + 0.6*PGR5_2_6161 + 0.6*PGR6_2_6162

# "GenTrans1" represents the sum of the resultant mass flows through two generalized-transport

# boundaries.

OBSNAME GenTrans1 PRINT

FORMULA UGR1_0_3125 + UGR2_0_3126

# "test3" represents the difference in pressure between two pressure observations.

OBSNAME test3 PRINT

FORMULA Test1_P - Test2_P

# "c" represents a difference between two previously defined derived observations. "a" is a

# sum of flow rates and "b" is a concentration, so the difference doesn't make any sense for a real

# model. The modeler is responsible for ensuring that the formulas result in meaningful values.

OBSNAME c PRINT

FORMULA a - b

# This is another case where the values being compared have different units.

OBSNAME WelComp PRINT

FORMULA Wel1 - Wel2

# "DeltaSpecConc" represents the change in resultant mass flux at a specified concentration

# boundary at two different times.

OBSNAME DeltaSpecConc PRINT

FORMULA SpecConc1 - SpecConc2

# "Comp" computes the difference between a stage observation and the pressure at an

# observation location. Assuming that SUTRA is set up so that head rather than pressure is

# calculated, this value could be compared with an observed head gradient multiplied by the

# distance between the lake and the observation location.

OBSNAME Comp PRINT

FORMULA lakeobs - Test1_P

END DERIVED_OBSERVATIONS