The input for PHREEQC is
arranged by keyword data blocks. Each data block begins with a line that
contains the keyword (and possibly additional data) followed by additional
lines containing data related to the keyword. The keywords that define the
input data for running the program are listed in table 1. Keywords and their associated
data are read from a database file at the beginning of a run to define the
elements, exchange reactions, surface complexation reactions, mineral phases,
gas components, and rate expressions. Any data items read from the database
file can be redefined by keyword data blocks in the input file. After the
database file is read, data are read from the input file until the first END keyword is encountered, after
which the specified calculations are performed. The process of reading data
from the input file until an END,
followed by doing the calculations, is repeated until the end of the input file
is encountered. The set of calculations, defined by keyword data blocks
terminated by an END, is termed a
“simulation”. A “run” is a series of one or more simulations that are contained
in the same input data file and calculated during the same invocation of the
program PHREEQC.
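The read-then-calculate cycle described above can be sketched in a few lines of Python. This is only an illustration of how END delimits simulations, not the program's actual reader; the function name split_simulations is hypothetical, and the sketch assumes that END appears as the first word on its own line.

```python
def split_simulations(input_text):
    """Group non-blank input lines into simulations terminated by END.

    Illustrative only: assumes END is the first word on its line and
    that keyword matching is case insensitive.
    """
    simulations, current = [], []
    for line in input_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines are ignored
        if stripped.split()[0].upper() == "END":
            if current:
                simulations.append(current)
            current = []
        else:
            current.append(stripped)
    if current:
        # Data after the last END are still a simulation (end of file).
        simulations.append(current)
    return simulations
```

Each inner list holds the lines of one simulation; the whole returned list corresponds to one run.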
Each simulation may contain
one or more of seven types of speciation, batch-reaction, and transport calculations:
(1) initial solution speciation, (2) determination of the composition of an
exchange assemblage in equilibrium with a fixed solution composition, (3)
determination of the composition of a surface assemblage in equilibrium with a
fixed solution composition, (4) determination of the composition of a
fixed-volume gas phase in equilibrium with a fixed solution composition, (5)
calculation of chemical composition as a result of batch reactions, which
include mixing; kinetically controlled reactions; net addition or removal of
elements from solution, termed “net stoichiometric reaction”; variation in
temperature and pressure; equilibration with assemblages of pure phases,
exchangers, surfaces, and (or) solid solutions; and equilibration with a gas phase
at a fixed total pressure or fixed volume, (6) advective-reactive transport, or
(7) advective-dispersive-reactive transport. The combination of capabilities
allows the modeling of complex geochemical reactions and transport processes
during one or more simulations.
In addition to speciation,
batch-reaction, and transport calculations, the code may be used for inverse
modeling, by which net chemical reactions are deduced that account for
composition differences between an initial water or a mixture of initial waters
and a final water.
PHREEQC was designed to
eliminate some of the input errors due to complicated data formatting in Fortran-type
input files. Data for the program are free format; spaces or tabs may be used
to delimit input fields (except SOLUTION_SPREAD,
which is delimited only with tabs); blank lines are ignored. Keyword data
blocks within a simulation may be entered in any order. However, data elements
entered on a single line are order specific. As much as possible, the program
is case insensitive. However, chemical formulas are case sensitive.
The following conventions are
used for data input to PHREEQC:
Keywords --Input data blocks are identified with an initial keyword. This
word must be spelled exactly, although case is not important. Several of the
keywords have synonyms. For example, PURE_PHASES is a synonym
for EQUILIBRIUM_PHASES.
Identifiers --Identifiers are options that may be used within a keyword data
block. Identifiers may have two forms: (1) they may be spelled completely and
exactly (case insensitive) or (2) they may be preceded by a hyphen and then
only enough characters to uniquely define the identifier are needed. The form
with the hyphen is always acceptable and is recommended. Usually, the form
without the hyphen is acceptable, but in some cases the hyphen is needed to
indicate the word is an identifier rather than an identically spelled keyword;
these cases are noted in the definitions of the identifiers in the following
sections. In this report, the form with the hyphen is used except for
identifiers of the SOLUTION keyword and the identifiers log_k and delta_h.
The hyphen in the identifier never implies that the negative of a quantity is
entered.
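The two identifier forms can be illustrated with a small Python sketch. It applies only the rule as stated here (full spelling, or a hyphen plus a unique leading substring); the function name and the identifier list in the usage note are illustrative assumptions, and the program's own matching may differ in detail.

```python
def resolve_identifier(token, identifiers):
    """Return the identifier that a token refers to, or raise ValueError.

    Form 1: the full word, case insensitive.
    Form 2: a hyphen followed by enough characters to be unique.
    """
    lookup = {ident.lower(): ident for ident in identifiers}
    name = token.lower()
    if not name.startswith("-"):
        if name in lookup:
            return lookup[name]
        raise ValueError(f"unknown identifier: {token!r}")
    prefix = name[1:]
    if not prefix:
        raise ValueError("a hyphen must be followed by at least one character")
    if prefix in lookup:
        return lookup[prefix]  # exact spelling after the hyphen
    matches = [ident for key, ident in lookup.items() if key.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    raise ValueError(f"ambiguous or unknown identifier: {token!r}")
```

With the illustrative list ["temperature", "pH", "pe", "units", "density"], the token -t resolves to temperature, whereas -p is rejected because it could mean either pH or pe.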
Chemical
equations --For aqueous, exchange, and
surface species, chemical reactions must be association reactions,
with the defined species occurring in the first position after the equal sign.
For phases, chemical reactions must be dissolution reactions with the
formula for the defined phase occurring in the first position on the left-hand
side of the equation. Additional terms on the left-hand side are allowed. All
chemical equations must contain an equal sign, “=”. In addition, left- and right-hand
sides of all chemical equations must balance in numbers of atoms of each
element and total charge. All equations are checked for these criteria at
runtime, unless they are specifically excepted. Nested parentheses in chemical
formulas are acceptable. Spaces and tabs within chemical equations are ignored.
Waters of hydration and other chemical formulas (that are normally represented
by a “·”, as in the formula for gypsum, CaSO4·2H2O)
are designated with a colon (“:”) in PHREEQC (thus, CaSO4:2H2O),
but only one colon per formula is permitted.
Element names --Two forms of element names
are available: (1) those beginning with an alphabetic character and (2) those
beginning with a square bracket. For form 1, an element formula, wherever it is
used, must begin with a capital letter and may be followed by one or more
lowercase letters or underscores, “_”. Numbers are not permitted, except in
parentheses for defining the redox state. In general, element names are simply
the chemical symbols for elements, which have a capital letter and zero or one
lower case letter. It is sometimes useful to define other entities as elements,
which allows mole balance and mass-action equations to be applied. Thus,
“Fulvate” is an acceptable element name, and it would be possible to define
metal binding constants in terms of metal-Fulvate complexes.
Form 2 of element names is
less restrictive than form 1. Within the square brackets, any combination of
alphanumeric characters and the characters plus, minus, equal, colon, decimal
point, and underscore can be used. The form-2 element name is case dependent,
but upper and lower case characters can be used in any position. The iso.dat
database makes extensive use of the square-bracket form for element names by
using the mass number and chemical symbol for minor-isotope definitions, such
as [13C], [15N], and [34S].
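The two naming forms can be expressed as regular expressions. The following Python sketch encodes the rules exactly as written above and nothing more; the pattern and function names are hypothetical, and the optional valence in parentheses is deliberately left out.

```python
import re

# Form 1: a capital letter followed by lowercase letters or underscores.
FORM1 = re.compile(r"[A-Z][a-z_]*")
# Form 2: square brackets around alphanumerics and + - = : . _
FORM2 = re.compile(r"\[[A-Za-z0-9+\-=:._]+\]")


def is_element_name(name):
    """Return True if name fits form 1 or form 2 as described in the text."""
    return bool(FORM1.fullmatch(name) or FORM2.fullmatch(name))
```

For example, Fe, Fulvate, and [13C] are accepted, whereas fe (no leading capital) and Fe2 (a digit outside brackets) are not.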
Charge on a
chemical species --The charge on a species
may be defined by the proper number of pluses or minuses following the chemical
formula or by a single plus or minus followed by an integer designating the
charge. Either of the following is acceptable: Al+3 or Al+++. However, Al3+
would be interpreted as a molecule with three aluminum atoms and a charge of
plus one.
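The charge convention can be made concrete with a short Python sketch that separates a species name into its formula and charge. It is an illustration of the rule only; the function name is hypothetical, and the sketch assumes integer charges and does not handle plus or minus signs inside square-bracket element names.

```python
import re

# Trailing run of identical signs, optionally followed by an integer.
_CHARGE = re.compile(r"^(?P<formula>.*?)(?P<signs>\++|-+)(?P<digits>\d*)$")


def split_charge(species):
    """Return (formula, charge) for a species such as 'Al+3' or 'Al+++'."""
    match = _CHARGE.match(species)
    if match is None:
        return species, 0  # no trailing sign: neutral species
    signs, digits = match.group("signs"), match.group("digits")
    sign = 1 if signs[0] == "+" else -1
    if digits:
        if len(signs) != 1:
            raise ValueError(f"use one sign before an integer: {species!r}")
        return match.group("formula"), sign * int(digits)
    return match.group("formula"), sign * len(signs)
```

Both Al+3 and Al+++ give the formula Al with a charge of 3, whereas Al3+ gives the formula Al3 with a charge of 1, which is the misreading the text warns about.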
Valence
states --Redox elements that exist
in more than one valence state in solution are identified for definition of
solution composition by the element name followed by a valence in parentheses.
Thus, sulfur that exists as sulfate is defined as S(6) and total sulfide
(H2S, HS-, and others) is identified by S(-2). The valence
may include a decimal point. The valence number is for identification purposes
only and does not otherwise affect the calculations.
log K and
temperature dependence --The
identifier log_k is used to define the log K at 25 °C
for a reaction. The temperature dependence for log K may be defined by
the Van’t Hoff expression or by an analytical expression. The identifier delta_h
is used to give the standard enthalpy of reaction at 25 °C for a chemical
reaction, which is used in the Van’t Hoff equation. By default the units of the
standard enthalpy are kilojoule per mole (kJ/mol). Optionally, for each
reaction the units may be defined to be kilocalorie per mole (kcal/mol). An
analytical expression for the temperature dependence of log K for a
reaction may be defined with the -analytical_expression
identifier. Up to six numbers may be given, which are the coefficients for the
equation: $\log_{10} K = A_1 + A_2 T + A_3/T + A_4 \log_{10} T + A_5/T^2 + A_6 T^2$, where T is in kelvin. A log K is defined either with log_k
or -analytical_expression (default log_k is
zero); the enthalpy is optional (default is zero). If present, an analytical
expression is used in preference to the log_k and enthalpy
values for calculation of the log K at the specified temperature.
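The precedence between the two temperature-dependence options can be sketched in Python. The sketch assumes the integrated Van’t Hoff equation with a constant enthalpy of reaction and the six-term analytical form log10 K = A1 + A2·T + A3/T + A4·log10(T) + A5/T^2 + A6·T^2; the function and parameter names are hypothetical and do not correspond to program internals.

```python
import math

R_KJ = 8.314462618e-3   # gas constant, kJ/(mol K)
T_REF = 298.15          # 25 degrees C in kelvin
KCAL_TO_KJ = 4.184


def log_k_at(temp_k, log_k=0.0, delta_h=0.0, units="kJ/mol", analytic=None):
    """Return log K at temp_k (kelvin).

    An analytical expression, if given, takes precedence over log_k and
    delta_h; otherwise the Van't Hoff equation is used.
    """
    if analytic is not None:
        if len(analytic) > 6:
            raise ValueError("at most six coefficients are allowed")
        a = list(analytic) + [0.0] * (6 - len(analytic))
        return (a[0] + a[1] * temp_k + a[2] / temp_k
                + a[3] * math.log10(temp_k)
                + a[4] / temp_k**2 + a[5] * temp_k**2)
    dh_kj = delta_h * KCAL_TO_KJ if units == "kcal/mol" else delta_h
    # Integrated Van't Hoff equation, assuming constant enthalpy.
    return log_k - dh_kj / (math.log(10) * R_KJ) * (1.0 / temp_k - 1.0 / T_REF)
```

With the made-up values log_k = -8.0 and delta_h = -10 kJ/mol, the function returns -8.0 at 298.15 K and a slightly more negative value at 323.15 K, as expected for an exothermic reaction.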
Pressure
dependence of log K --Pressure dependency of
reaction constants for species, and the pressure-dependent solubilities of
minerals and gases, are calculated from the volume change of the reaction. The
molar volume of solids and parameters for calculating the molal volume of aqueous
species are defined in Amm.dat, phreeqc.dat, and pitzer.dat.
Comments --The “#” character delimits the beginning of a comment in the input
file. All characters in the line that follow this character are ignored. If the
entire line is a comment, the line is not echoed to the output file. If the
comment follows input data on a line, the entire line, including the comment,
is echoed to the output file. The “#” is useful for adding comments explaining
the source of various data or describing the problem setup. In addition, it is
useful for temporarily removing lines from an input file.
Logical
line separator --A semicolon (“;”) is
interpreted as a logical end-of-line character. This allows multiple logical
lines to be entered on the same physical line. For example, solution data could
be entered as:
pH 7.0; pe 4.0; temp 25.0
on one line. The semicolon should not be used in character fields,
such as the title or other comment or description fields.
Logical line continuation --A
backslash (“\”) at the end of a line may be used to merge two physical lines into
one logical line. For example, a long chemical equation could be entered as:
Ca0.165Al2.33Si3.67O10(OH)2 + 12 H2O = \
0.165Ca+2 + 2.33 Al(OH)4- + 3.67 H4SiO4 + 2 H+
on two lines. The program would interpret this sequence as a
balanced equation entered on a single logical line. For a line to be logically
continued, the backslash must be the last character in the line except for
white space.
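The comment, separator, and continuation characters can be combined in one Python sketch that turns physical lines into logical lines. The order of operations used here (strip comments, then join continued lines, then split at semicolons) is an assumption made for illustration, as is the function name; the sketch also ignores the exception for title and description fields.

```python
def logical_lines(text):
    """Convert physical input lines to a list of logical lines."""
    result = []
    pending = ""
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comment, trailing space
        if line.endswith("\\"):
            pending += line[:-1] + " "        # continued on the next line
            continue
        line = pending + line
        pending = ""
        for part in line.split(";"):          # logical end-of-line
            part = " ".join(part.split())     # normalize white space
            if part:                          # blank lines are ignored
                result.append(part)
    if pending.strip():
        result.append(" ".join(pending.split()))
    return result
```

Applied to the line "pH 7.0; pe 4.0; temp 25.0", the function returns three logical lines; applied to the two-line equation above, it returns one.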
Repeat count --An
asterisk (“*”) can be used to indicate a repeat count for the data item that
follows the asterisk. The format is an integer followed directly by the
asterisk, which is followed directly by a numeric value. For example “4*1.0” is
the same as entering four values of 1.0 (“1.0 1.0 1.0 1.0”). Repeat counts can
be used for specifying data for the identifiers -length and -dispersivity
in the TRANSPORT data block and
for specifying reaction steps in the REACTION
and KINETICS data blocks.
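A repeat count expands as in the following Python sketch, which assumes that the values are already separated by white space; the function name is hypothetical.

```python
import re

_REPEAT = re.compile(r"(\d+)\*(\S+)")


def expand_repeats(field):
    """Expand tokens such as '4*1.0' into repeated numeric values."""
    values = []
    for token in field.split():
        match = _REPEAT.fullmatch(token)
        if match:
            count, value = int(match.group(1)), float(match.group(2))
            values.extend([value] * count)
        else:
            values.append(float(token))
    return values
```

For example, "4*1.0 2.5" expands to the five values 1.0, 1.0, 1.0, 1.0, and 2.5.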
Range of integers --A hyphen
(“-”) can be used to indicate a range of integers for the keywords with an
identification number (for example, SOLUTION
2-5). It is also possible to define a range of cell numbers for the identifiers
-print_cells and -punch_cells in the ADVECTION and TRANSPORT data blocks and in the
options for the COPY, DELETE, DUMP, and RUN_CELLS data blocks. A range of
integers is given in the form m-n, where m and n are positive integers, m
is less than n, and the two numbers are separated by a hyphen without
intervening spaces.
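The range rule can be checked with a short Python sketch. The function name is hypothetical, and a single integer without a hyphen is accepted here as a one-element range for convenience.

```python
import re

_RANGE = re.compile(r"(\d+)(?:-(\d+))?")


def parse_range(token):
    """Return the list of integers named by 'm' or 'm-n' (no spaces)."""
    match = _RANGE.fullmatch(token)
    if match is None:
        raise ValueError(f"not an integer or range: {token!r}")
    first = int(match.group(1))
    if match.group(2) is None:
        return [first]
    last = int(match.group(2))
    if first < 1 or first >= last:
        raise ValueError(f"need positive m < n in {token!r}")
    return list(range(first, last + 1))
```

Thus "2-5" names the integers 2, 3, 4, and 5, whereas "5-2" and "2 - 5" are rejected.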
Special characters --A
summary of all of the special characters used in PHREEQC formatting is given in
table 2.
The numerical algorithm of PHREEQC requires that chemical
equations be written in a particular form. Internally, every equation must be
written in terms of a minimum set of chemical species; essentially, one species
for each element or valence state of an element. For the program PHREEQE, these
species were called “master species” and the reactions for all aqueous
complexes had to be written using only these species. PHREEQC also needs
reactions in terms of master species; however, the program contains the logic
to rewrite the input equations into this form. Thus, it is possible to enter an
association reaction and log K for an aqueous species in terms of any aqueous
species in the database (not just master species), and PHREEQC will rewrite the
equation to the proper internal form.
PHREEQC also will rewrite reactions for phases, exchange
complexes, and surface complexes. Reactions are required to be dissolution
reactions for phases and association reactions for aqueous, exchange, or
surface complexes. Dissolution reactions for phases allow inclusion of names of
solids and gases in the equations, provided they are appended with the strings
“(s)” and “(g)”; for example,
CaCO2[18O](s) + H2O(l) = H2[18O](aq) + Calcite(s).
The string “(l)” can be appended to the water formula and “(aq)”
to aqueous species for clarity, but they are not required. The “(s)” and “(g)”
suffixes cause the program to look in the list of phases to find equations that
can be used to reduce the original equation to an equation that contains
exclusively aqueous species. This capability to use solids and gases in
chemical reactions for phases was implemented primarily to simplify the
definition of equations for isotopic solid and gas components. The log Ks for
these isotopic species often depend on the log K for the predominant isotopic
species (solid or gas) offset by a fractionation factor and (or) a
symmetry-derived log K. The inclusion of gases and solids in the equations for
isotopic solids and gases is a straightforward method to define these
dependencies of the isotopic species equilibrium constant on the equilibrium
constant for the predominant isotopic species. In the example given here, the
equilibrium constant for the single oxygen-18 form of calcium carbonate solid
depends on the equilibrium constant of the pure carbon-12, oxygen-16 form of
calcite, which is specified by “Calcite(s)” in the example equation and refers
to the equation and log K defined for the calcite phase.
There is one major restriction on the rewriting capabilities for
aqueous species. PHREEQC calculates mole balances on individual valence states
or combinations of valence states of an element for initial solution
calculations. It is necessary for PHREEQC to be able to determine the valence
state of an element in a species from the chemical equation that defines the
species. To do this, the program requires that only one aqueous species of an
element valence state is defined by the electron half-reaction that relates it
to another valence state. The aqueous species defined by this half-reaction is
termed a “secondary master species”; there must be a one-to-one correspondence
between valence states and secondary master species and the coefficient of the
newly defined species must be one. In addition, there must be one “primary
master species” for each element, such that reactions for all aqueous species
for an element can be rewritten in terms of the primary master species. The
equation for the primary master species is simply an identity reaction. If the
element is a redox element, the primary master species must also be a secondary
master species. For example, to be able to calculate mole balances on total
iron, total ferric iron, or total ferrous iron, a primary master species must
be defined for Fe (iron) and secondary master species must be defined for
Fe(+3) (ferric iron) and Fe(+2) (ferrous iron). In the default databases, the
primary master species for Fe is Fe+2, the secondary master
species for Fe(+2) is Fe+2, and the secondary master species for
Fe(+3) is Fe+3. The correspondence between master species and
elements and element valence states is defined by the SOLUTION_MASTER_SPECIES
data block, which for iron in phreeqc.dat is as follows:
SOLUTION_MASTER_SPECIES
Fe Fe+2 0.0 Fe 55.847
Fe(+2) Fe+2 0.0 Fe
Fe(+3) Fe+3 -2.0 Fe
The line with “Fe” (without parentheses) defines the primary
master species, and the last two lines, which have parentheses following “Fe”,
define the secondary master species. The chemical equations for the master
species and all other aqueous species are defined by the SOLUTION_SPECIES
data block.
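The distinction between primary and secondary master species in the iron example can be sketched in Python. Only the first two fields of each line are used, because they are the only ones explained in this section; the function names are hypothetical, and the check reflects the stated requirement that the primary master species of a redox element also be a secondary master species.

```python
def classify_master_species(lines):
    """Split SOLUTION_MASTER_SPECIES lines into primary and secondary maps."""
    primary, secondary = {}, {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # keyword line or blank line
        name, species = fields[0], fields[1]
        if "(" in name:
            secondary[name] = species   # element valence state, e.g. Fe(+3)
        else:
            primary[name] = species     # element, e.g. Fe
    return primary, secondary


def inconsistent_redox_elements(primary, secondary):
    """List redox elements whose primary species is not also secondary."""
    problems = []
    for element, species in primary.items():
        states = [s for n, s in secondary.items()
                  if n.split("(")[0] == element]
        if states and species not in states:
            problems.append(element)
    return problems
```

For the three iron lines above, Fe maps to Fe+2 as the primary master species, Fe(+2) and Fe(+3) map to Fe+2 and Fe+3 as secondary master species, and no inconsistency is reported.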
Keywords and their associated input are
described in alphabetical order as listed in table 1. Several formatting
conventions are used to help the user interpret the input requirements. In this
report, keywords are always capitalized and bold. Words in bold must be
included literally when creating input files (although upper and lower case are
interchangeable and optional spellings may be permitted). “Identifiers” are
additional keywords that apply only within a given keyword data block; they can
be considered to be sub-keywords or options. Although identifiers are case
independent, lowercase bold is used in this report for all identifiers except pH,
-Donnan, -multi_D, and -interlayer_D, for which mixed case is used. For example,
“temperature” is an identifier for SOLUTION input. Each identifier may have two
forms: (1) the identifier word spelled exactly (“temperature”, in this case), or
(2) a hyphen followed by a sufficient number of characters to define the
identifier uniquely (for example, -t for temperature in the SOLUTION data block).
The form with the hyphen is recommended. Words in italics are input values that are
variable and depend on user selection of appropriate values. Items in brackets
([ ]) are optional input fields. Mutually exclusive input fields are enclosed
in parentheses and separated by the word “or”. In general, the optional fields
in a line must be entered in the specified order, but it is sometimes possible
to omit intervening fields. For clarity, commas sometimes are used to delimit
input fields in the explanations of data input; however, commas are not allowed
in the input data file except in Basic programs; in all other cases, only white
space (spaces and tabs) may be used to delimit fields in input files. Where
applicable, default values for input fields are stated.
When the program PHREEQC is invoked, two files are used to define
the thermodynamic model and the types of calculations that will be done: the
database file and the input file. The database file is read once (to the end of
the file or until an END keyword is encountered) at the
beginning of the program. The input file is then read and processed simulation
by simulation (as defined by END keywords) until the end of
the file. The formats for the keyword data blocks are the same for either the
input file or the database file.
The database file is used to define static data for the
thermodynamic model. Although any keyword data block can occur in the database
file, normally, the file contains the keyword data blocks EXCHANGE_MASTER_SPECIES,
EXCHANGE_SPECIES, PHASES, RATES, SOLUTION_MASTER_SPECIES, SOLUTION_SPECIES,
SURFACE_MASTER_SPECIES, and SURFACE_SPECIES. These keyword data blocks define rate
expressions, master species, and the stoichiometric and thermodynamic
properties of all of the aqueous phase species, exchange species, surface
species, and pure phases.
Nine database files are provided with the program: (1)
phreeqc.dat, a database file derived from PHREEQE (Parkhurst and others, 1980),
which is consistent with wateq4f.dat, but has a smaller set of elements and
aqueous species (table 3); (2)
Amm.dat, which is the same as phreeqc.dat, except that ammonia redox state has been
decoupled from the rest of the nitrogen system; that is, ammonia has been
defined as a separate component; (3) wateq4f.dat, a database file derived from
WATEQ4F (Ball and Nordstrom, 1991); (4) llnl.dat, a database file derived from
databases for EQ3/6 and Geochemist’s Workbench that uses thermodynamic data
compiled by the Lawrence Livermore National Laboratory; (5) minteq.dat, a
database derived from the databases for the program MINTEQA2 (Allison and
others, 1990); (6) minteq.v4.dat, a database derived from MINTEQA2 version 4
(U.S. Environmental Protection Agency, 1998); (7) pitzer.dat, a database for
the specific-ion-interaction model of Pitzer (Pitzer, 1973) as implemented in
PHRQPITZ (Plummer and others, 1988); (8) sit.dat, a database implementing the
Specific ion Interaction Theory (SIT) as described by Grenthe and others
(1997); and (9) iso.dat, a partial implementation of the individual component
approach to isotope calculations as described by Thorstenson and Parkhurst
(2002, 2004). The elements and element valence states, corresponding notation,
and default formula used to convert mass concentration to mole concentration
units in the database phreeqc.dat are listed in table 3. Other databases may use
different sets of elements, different notation for the element names, or
different default conversion formulas.
The input data file is used (1) to define the types of
calculations that are to be done, and (2) if necessary, to modify the data read
from the database file. If new elements and aqueous species, exchange species,
surface species, or phases need to be included in addition to those defined in
the database file, or if the stoichiometry, log K , or activity
coefficient information from the database file needs to be modified for a given
run, then the keywords mentioned in the previous paragraph can be included in
the input file. The data read for these keyword data blocks in the input file
will augment or supersede the data read from the database file. In many cases,
the thermodynamic model defined in the database will not be modified, and the
above keywords will not be used in the input data file.
The place to start is with the simplest input file, which contains
only a SOLUTION data block containing the dissolved
concentrations of elements. With this input file, PHREEQC will perform a
speciation calculation and calculate saturation indices for the solution. More
complex calculations will calculate new solution compositions as a function of
reactions. Reactions can be understood as occurring in a beaker, where a
solution (as defined by a SOLUTION data block) is placed in
the beaker, and then additional reactants are added. The reactants are defined
with the keywords EQUILIBRIUM_PHASES, EXCHANGE, GAS_PHASE,
KINETICS, REACTION, SOLID_SOLUTIONS, and SURFACE. One or more of these reactants may be
added to the beaker, and then system equilibrium is calculated, which results
in mole transfers into and out of solution, and new pH and element
concentrations. The pressure and temperature of the reaction may be defined
with REACTION_PRESSURE and REACTION_TEMPERATURE. The design of PHREEQC is thus
fairly intuitive: you choose the composition of a starting solution and
then decide which types of reactants you need to add to the beaker to model
your system. Transport reactions are simply defined by a series of beakers,
each containing a set of reactants, and water flows and mixes from one beaker
to the next and equilibrates with the reactants in each beaker in sequence.
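The series-of-beakers picture of transport can be illustrated with a toy Python sketch of one advective shift. It is a conceptual illustration only: mixing and dispersion are omitted, the function name is hypothetical, and equilibrate stands in for whatever reaction calculation is applied to the water in each cell.

```python
def advect_one_shift(cell_solutions, inflow, equilibrate):
    """Move each solution one cell downstream, then react it in place.

    cell_solutions: solutions currently in cells 1..n (upstream first)
    inflow:         solution entering cell 1
    equilibrate:    callable (cell_number, solution) -> reacted solution
    Returns (new_cell_solutions, outflow_solution).
    """
    if not cell_solutions:
        raise ValueError("at least one cell is required")
    outflow = cell_solutions[-1]                 # water leaving the column
    shifted = [inflow] + cell_solutions[:-1]     # everything moves one cell
    reacted = [equilibrate(number, solution)
               for number, solution in enumerate(shifted, start=1)]
    return reacted, outflow
```

Calling the function repeatedly with the same inflow pushes that water through the column one cell per shift, with the reactants of each numbered cell acting on whatever water currently occupies it.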
The concentrations of elements in solution and the mass of water
in the solution are defined through the SOLUTION or SOLUTION_SPREAD data block.
Internally, all concentrations are converted to molality and the number of
moles of each element in solution (including hydrogen and oxygen) is calculated
from the molalities and the mass of water. Thus, internally, a solution is
simply a list of elements and the number of moles of each element.
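The conversion to moles can be sketched in Python. The sketch assumes concentrations entered as milligrams per kilogram of water, gram formula weights supplied by the caller, and hydrogen and oxygen contributed only by the water itself (18.015 g/mol); the function name is hypothetical, and the program's handling of other units and of hydrogen and oxygen in solutes is not represented.

```python
WATER_G_PER_MOL = 18.015


def moles_in_solution(conc_mg_per_kgw, gfw, water_kg=1.0):
    """Return moles of each element for a solution of given water mass.

    conc_mg_per_kgw: {element: concentration in mg per kg of water}
    gfw:             {element: gram formula weight in g/mol}
    """
    moles = {}
    for element, mg in conc_mg_per_kgw.items():
        molality = mg / 1000.0 / gfw[element]   # mol per kg of water
        moles[element] = molality * water_kg
    # Hydrogen and oxygen supplied by the water (solutes not counted here).
    water_mol = water_kg * 1000.0 / WATER_G_PER_MOL
    moles["H"] = moles.get("H", 0.0) + 2.0 * water_mol
    moles["O"] = moles.get("O", 0.0) + water_mol
    return moles
```

For example, 40.08 mg/kgw of Ca (gram formula weight 40.08) in 2 kg of water corresponds to 0.002 mol of Ca.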
PHREEQC allows each reactant to be defined independently. In
particular, reactants (EQUILIBRIUM_PHASES, EXCHANGE,
GAS_PHASE, KINETICS, REACTION,
SOLID_SOLUTIONS, and SURFACE) are defined in terms of moles, without
reference to a volume or mass
of water. Systems are defined by combining a solution with a set of reactants
that react either reversibly (EQUILIBRIUM_PHASES, EXCHANGE,
GAS_PHASE, SOLID_SOLUTIONS, and SURFACE) or irreversibly (KINETICS or
REACTION). Essentially, all of the moles of elements in the solution and the
reversible reactants are combined, the moles of irreversible reactants are
added (or removed), and a new system equilibrium is calculated. Only after
system equilibrium is calculated is the mass of water in the system known, and
only then can the molalities of all entities be calculated.
For transport calculations, each cell is a system that is defined
by the solution and all the reactants contained in keywords that bear the same
number as the cell number. The system for the cell initially is defined by the
moles of elements that are present in the solution and the moles of each
reactant. The compositions of all these entities evolve as the transport
calculations proceed.
The following sections describe the data input requirements for
the program. Each type of data is input through a specific keyword data block.
Most keywords are listed in alphabetical order within this section of the
report; however, a set of keywords most pertinent to model developers is
described in Appendix A, Keyword
Data Blocks for Programmers. Each keyword data block may have a number of
identifiers, many of which are optional. Identifiers may be entered in any
order; the line numbers given in examples for the keyword data blocks are for
identification purposes only. Default values for identifiers are used if the
identifier is omitted.