SPARROW VERSION 2.10
SUMMARY OF REVISIONS

Updates since version 6 include:

Version 10

Under version 10, all GIS capability has been removed. The residuals map is now supported as a graphics feature rather than a GIS feature. The SAS file named Conus, to be placed in the SPARROW GIS subdirectory (which could also be the SPARROW master directory), facilitates the production of the residuals map and must be identified in the control file by the control statement:

    %let gis_file = conus ;

The DLL used to accelerate model calibration has been updated and renamed to SparrowAccumulate.dll. The revised DLL should work with 64-bit SAS. To make the DLL accessible to SPARROW, add the SPARROW master directory, where the DLL resides, to the Windows search path as follows:

a) Press the Windows key.
b) Type ‘environment’ in the search box.
c) Click on ‘Edit environment variables for your account’.
d) Add or edit the ‘PATH’ variable for your account to include the SPARROW master directory containing the software (and the SparrowAccumulate.dll file).

The SPARROW system has been revised to work with SAS University Edition (SAS UE). The principal complication is that SAS UE delimits subdirectories in path names using the forward slash (/) rather than the standard Windows backslash (\). As long as the pathname settings in the control file use the appropriate convention (backslash (\) for Windows and forward slash (/) for SAS UE), SPARROW should operate correctly. However, SAS UE does not support DLLs, so when running SPARROW under SAS UE the DLL feature should be disabled by setting the control variable if_accumulate_with_dll to no in the control file. This will slow model calibration by about 60 percent, but all functionality of SPARROW is retained.

sparrow_header.sas

01/25/2017 - Control file now admits a convert_specification that allows the user to specify a conversion factor for converting between different forms of the dependent variable.
Used for TSS/SSC conversion. The conversion factor specification uses a similar approach to the other functional specifications.

04/03/2017 - Added the control variable gis_file, used to identify the SAS file containing the vertices of the basemap coverage.

04/03/2017 - Modified the setting of the main file reference at the bottom of the program to be independent of the pathname delimiter, which facilitates operation of SPARROW in the University Edition.

07/24/2018 - Added the control variable if_pred_intervals, a yes/no control that, if set to no, suppresses the prediction intervals in the bootstrap process, making the bootstrap faster and reducing the disk space needed for the analysis. If the control variable is not included in the control file, the default is that intervals are computed.

sparrow_main.sas

3/24/09 - Modified main macro: added a macro condition that if_mean_adjust_delivery_vars = Yes before trying to back up the mean_delivery_vars file.

7/14/09 - Added the data set upmonload to the list of SAS files that get backed up and deleted prior to running SPARROW in prediction mode.

03/07/12 - Modified main macro: made the operations %check_station_vars and %make_nested_area conditional on &if_estimate = yes. The user may request a special prediction run that excludes all reaches with stations, causing no station_data file to be created and thereby making it impossible to check station variables or to create the nested area.

03/07/12 - Modified main macro: modified the code that tests whether previously created files exist when the if_make_input_data switch is set to yes. The revised code only checks for the station_data file if if_estimate is set to yes.

03/12/12 - Modified main/set_libraries: macro variable if_make_input_data was erroneously named make_input_data. This affected the error reporting if the directory containing the data1 file could not be located.
Macro variable has been correctly renamed to if_make_input_data.

04/10/12 - Modified set_libraries: added a conditional macro statement that invokes the filelockwait = 5 option on the libname statement if the SAS version is 9.2 or greater.

05/26/15 - Modified main: added a command calling the macro created in setdata_macros that checks the network for topological errors in diversion fraction and hydro sequencing/from-to nodes.

03/17/2017 - Added the macro variable pthdel representing the delimiter character (either \ or /) in path names. The value is determined from the presence of \ or / in the home_results specification. All external file references throughout SPARROW are now independent of pathname delimiters, except for the rename function used in the backup macro to back up the graphics files.

03/17/2017 - Added a macro global statement to initialize some new macro variables that were not part of the original SPARROW model but are required for the revised code.

03/17/2017 - Modified the backup macro to include the capability to back up external files.

03/17/2017 - Replaced the backup of graphics catalogs with the backup of png external files created by the new SPARROW graphics component. Note that because this method uses the rename function, which requires a full pathname, it is not possible to avoid correcting for the path delimiter character when renaming external files.

04/03/2017 - Modified the method used to perform %include to avoid specifying the path delimiter.

04/04/2017 - Removed NOXWAIT from the set of options. This option is no longer required because SPARROW no longer issues system commands.

04/04/2017 - Moved ODS commands from graph_resids in the graphics component to main. This ensures that all ODS output appears in the results window and in the results directory.

04/07/2017 - Modified the backup macro to operate without knowing a path delimiter.
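The delimiter handling described in the 03/17/2017 entries above can be pictured with a short sketch. It is written in Python purely for illustration and is not the SPARROW macro code: the function names, the order of the checks (backslash first), and the example paths are all assumptions.

```python
def detect_path_delimiter(home_results: str) -> str:
    """Infer the path delimiter from a home_results-style setting."""
    # Assumption: a backslash anywhere in the setting marks a Windows path.
    if "\\" in home_results:
        return "\\"
    if "/" in home_results:
        return "/"
    raise ValueError("no path delimiter found in: " + home_results)


def build_file_reference(directory: str, filename: str, pthdel: str) -> str:
    """Join a directory and a file name using the detected delimiter."""
    # Strip a trailing delimiter so it is never doubled.
    return directory.rstrip(pthdel) + pthdel + filename


if __name__ == "__main__":
    # Hypothetical settings: one Windows style, one SAS UE style.
    for setting in (r"C:\sparrow\results", "/folders/myfolders/sparrow/results"):
        pthdel = detect_path_delimiter(setting)
        print(build_file_reference(setting, "predict.txt", pthdel))
```

Deriving the delimiter once from a setting the user already supplies means no other code has to know which platform it runs on, which appears to be the intent of the pthdel variable.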
sparrow_makemacros.sas

01/13/11 - Modified macro makelst to automatically set if_mean_adjust_delivery_vars to no if there are no delivery variables (dlvvar) specified. Notification of this change is printed to the comments and the SAS log.

03/09/12 - Modified makelst macro code so that if if_estimate is set to no and the appropriate estimation files aren't found, the analysis terminates. Previously, a WARNING was given and the analysis was placed in estimation mode, but this may not be appropriate if the user has limited the indata file for a custom prediction.

03/20/2017 - Modified macro model_summary to include the convert specification and storage variable specifications, if specified in the control file.

03/20/2017 - Added code to detect whether the DLL is requested while the user is operating with SAS University Edition. If both are true, a message is generated stating that the DLL is unavailable, and if_accumulate_with_dll is set to no.

03/20/2017 - Added macro set_unique, used in checking the specification of the storage control variables for the dynamic model.

03/20/2017 - Improved the efficiency of various looping structures in the macro makelst.

03/20/2017 - Added code in makelst to check for logical errors in the storage control variables. Catchment_storage_exclude is checked to be sure it includes the storage sources specified for the model.

03/20/2017 - Added the macro variable jCharResSrc to the global statement of makelst and appended code in that macro to determine its value.

03/30/2017 - Added GIS_file to the list of macro variable names in the %global statement. Added in case users do not specify this new variable in the control file.

sparrow_calibrate.sas

6/30/10 - Modified the calibrate macro to accommodate additive constraints. Constraints are specified by creating a SAS file consisting of all coefficient names in the SPARROW model, with the value stored under each name being the corresponding coefficient in a linear constraint.
Multiple linear constraints are represented by multiple rows in the data set. If a coefficient doesn't figure in a constraint, the value of the coefficient for that row is set to zero. The constraint file also contains the variables constraint_op and constraint_val. Constraint_op equals 0 if the constraint is a linear equality constraint, with constraint_val being the resultant value of the linear combination of the coefficients. Currently, only constraints with constraint_op = 0 are allowed. The linear constraints are applied in addition to the individual coefficient bounds identified in the betailst. The covariance matrix for the estimated coefficients is consistent with the matrix described by Gallant in Nonlinear Statistical Models. The degrees of freedom of the model are computed as the difference between the number of coefficients that are not subject to an individual coefficient bound and the number of rows of the constraint file. The constraint file is identified in the SPARROW program by the new control variable constraint_file, which is only operable in SAS versions 9.0 and later. Note that the variance inflation factor can no longer be interpreted as the factor by which the t statistic would inflate if sampling could follow an orthogonal design.

6/30/10 - Modified code so that it is now robust to model constraints taking up all coefficients, which makes jncnstrn a null matrix (this can happen under the regional stepwise procedure).

11/29/2010 - Modified code to accommodate the absence of a specified dlvdsgn matrix.
The changes were to the following lines in the code:

    dlvdsgn = {&dlvdsgn} ;

became

    %if %length(&dlvdsgn) > 0 %then %do ; dlvdsgn = {&dlvdsgn} ; %end ;

and

    loc_incddsrc = loc(abs((beta0[,jbdlvvar] # data[,jdlvvar]) * {&dlvdsgn}`) > 709) ;

became

    loc_incddsrc = loc(abs((beta0[,jbdlvvar] # data[,jdlvvar]) %if %length(&dlvdsgn) > 0 %then * {&dlvdsgn}` ; ) > 709) ;

05/11/2012 - Modified the calibrate macro: changed the name of the defined function corr to corrv to avoid a conflict with a new function introduced to SAS/IML in version 9.3.

01/25/2017 - Modified the calibrate macro: added code in the feval module that permits the specification and empirical estimation of a conversion factor that accounts for biased values of the dependent variable. The control file can now include the control variable convert_specification, which describes the conversion factor function. The function takes as input the conversion factor coefficient (beta[,jbconvert]) and the data that identify which monitored values are biased (data[,jifbias]). The feature also requires the specification of the ifbias variable in the othvar list and the specification of the bconvert variable in the bothvar list. The program continues to operate normally if the convert_specification control variable is excluded from the control file.

02/15/2017 - Modified calibrate macro: the sascbtbl filename statement now references the SparrowAccumulate_at.txt file, to be stored in the master directory. The new AT file references the new 64-bit DLL SparrowAccumulate.dll, which contains two subroutines: tnode_a, for standard accumulation without a convert specification, and tnode_a_conv, which accumulates and implements a conversion factor for estimating with different types of dependent variables intended to measure the same load. The feval module includes code that estimates the conversion factor "convert" when the convert_specification is not null.
A non-null convert_specification also triggers the use of the tnode_d_conv subroutine in the DLL (if if_estimate_w_dll is yes), and causes the application of the conversion factor in computing the residual and in converting the monitored load for substitution in the load accumulation process.

03/31/2017 - Modified code in the feval module of the calibrate macro that implements the convert feature. The new code modifies the dependent variable rather than creating the convert vector. This means the revised code is supported by the original DLL subroutine tnode_a and does not require a special subroutine.

03/31/2017 - Modified the feval code of the calibrate macro to accommodate channel/reservoir/catchment storage. The catchment storage modification includes a feature to recognize when the model has a feedback in the case of a seasonal model, where the input storage source for the first period is taken from the last period value. Channel/reservoir storage is not supported with this feature. The channel/reservoir storage feature requires two loss factors, to be applied to endogenous loads determined during evaluation of the load accumulation step. The accumulation procedure under channel/reservoir storage includes the new structure ChanResSource, which is entirely determined during the accumulation process. The call to the DLL in the case of the dynamic model with channel/reservoir storage implements a new subroutine that recognizes the additional input represented by the number of reaches for a single period and the loss factors, and that internally computes the ChanResSource. Other modifications were made to the code that stages the model, prior to model estimation, to add the cutoff i at which it is no longer necessary to carry channel/reservoir storage into the future (maxistore) to the list of dimensional variables included in the ndef vector passed to the DLL.
Also computed are the JChanresLoss and jChanResSrcFuture vectors, which identify which i to access to obtain the pertinent loss factor input and the relevant source to access (either channel or reservoir), depending on whether flowline i is a stream or reservoir.

06/07/2017 - Modified feval code of the calibrate macro by adding a semi-colon after a macro conditional statement defining the appropriate SAS commands to execute when if_test_calibrate is in effect. This corrects an error in which the program was looking for an end to close a do segment. Corrected code reads:

    %else e[i_obs,] = log(data[i,jdepvar] / rchld) ;;

sparrow_predict.sas

12/22/08 - Modified the feval module in the predict macro: removed the line

    delfrac[i,] = delfrac[i,] # inc_decay[i,] ;

This line caused the reported delfrac variable to include incremental decay for sources within the reach. However, that decay is already reported as part of the incremental load. Thus, multiplying incremental load by delfrac caused the incremental decay to be applied twice. Removing this command corrects the estimation of delivered load by simple multiplication of incremental flux by delfrac.

3/16/09 - Modified the code that removes the retransformation factor for monitored sites with if_adjust set to yes. The modified code now removes the retransformation factor only for the pload_total and pload_[source] values, not for the pload_nd_[] and pload_inc_[] values. This makes these predictions consistent across all reaches. Adding-up properties are not affected because, with if_adjust set to yes, the pload_nd and pload_inc predictions did not add up to pload_total for monitored reaches anyway.

4/3/09 - Corrected the "matrices do not conform" error that was caused by the modification on 3/16/09.

5/21/09 - Modified the feval_predict module to reduce the cumulative loading (both total and by source) passed to the downstream node if if_adjust = yes and the reach is a monitored reach.
The load is reduced by the factor 1/mean_exp_weighted_error so that when the retransformation factor is applied at the downstream reach it effectively negates the application of the retransformation factor on monitored load. The implication of this change is that the retransformation factor is only applied to the incremental additions to load below a monitored reach. This corrects a previous error in which the retransformation factor and model error were inappropriately applied to monitored flux estimates, causing the estimated flux downstream of a monitored reach to be overstated (by the amount of the retransformation factor times the delivered upstream monitored load) and to contain excessive uncertainty. The revised code now outputs a file called upmonload, which contains the estimates of delivered upstream monitored flux (adjusted for instream attenuation but unadjusted for model error). The estimate of delivered upstream monitored flux is used to make appropriate adjustments to the bootstrap estimates of model error. The delivered upstream monitored flux for the nonlinear least squares parameter estimates is now retained as standard output and is included in the results directory as the SAS file upmonload.

5/21/09 - Modified the feval_predict module in the predict macro: corrected an error in which the estimates of the incremental flux contained in the output file predict were not adjusted for instream decay within the reach where the incremental flux is introduced. The modified code now reports incremental flux that correctly accounts for the partial instream decay within the reach where the flux is introduced. However, it is possible to continue to express incremental flux without the partial instream decay by adding the following command to the control file:

    %let if_exclude_inc_decay = yes ;

This command will cause incremental flux to exclude partial decay and cause the del_frac value to include the partial decay.
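The effect of the 5/21/09 change to the load passed below a monitored reach can be checked with simple arithmetic. The Python sketch below is a simplified, single-reach illustration under stated assumptions, not the feval_predict code: the function and argument names are hypothetical, retrans_factor stands in for mean_exp_weighted_error, and instream delivery is reduced to one fraction.

```python
import math


def downstream_load(monitored_load, delivery_fraction, incremental_load, retrans_factor):
    """Predicted load at a reach just below a monitored reach (simplified)."""
    # The monitored load is reduced by 1/retrans_factor before it is
    # handed to the downstream node.
    passed_down = monitored_load / retrans_factor
    # Accumulate at the downstream reach: delivered upstream load plus
    # the incremental load generated within the reach.
    accumulated = passed_down * delivery_fraction + incremental_load
    # Applying the retransformation factor to the accumulated load cancels
    # the earlier reduction on the monitored component.
    return accumulated * retrans_factor


if __name__ == "__main__":
    total = downstream_load(100.0, 0.9, 10.0, 1.2)
    # The factor ends up applied to the incremental load only.
    expected = 100.0 * 0.9 + 10.0 * 1.2
    print(total, math.isclose(total, expected))
```

In this simplified setting the downstream prediction equals the delivered monitored load plus the retransformed incremental load, which matches the statement that the factor is only applied to incremental additions below a monitored reach.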
07/23/2018 - Added the feature that applies the convert_specification to predictions when if_adjust is set to yes. This means that the load applied under if_adjust is the same as the converted load used for estimation, with the converted value given by the convert_specification and the estimated values of the convert coefficients. Note that the dependent variable included in the prediction output is given at the unconverted value, that is, the value in the data1 file.

sparrow_compile.sas

4/3/09 - Modified variable name in label declaration for r_square_yld in macro compile_calibrate. Revised variable name is now correct for the r_square_yld variable that is stored in the summary_betaest file.

4/17/09 - Modified macros make_stats, set_model_parm_predict, compile_ci_set_storage, and compile_ci_add_data: negated the exclusion of model error for monitored reaches with if_adjust set to yes for prediction variables other than pload_[total or source]. This is now consistent with the modification made to sparrow_predict.sas dated 3/16/09. Note that, to correct this problem, the boot_detail output file now includes the additional variable if_mon, which identifies a monitored reach if the if_adjust control variable is set to yes. This variable is used to effect special processing of certain variables in the sparrow_custom_predict.sas file if the reach is a monitored reach. The variables to receive special processing are now specified in the control header of the sparrow_custom_predict.sas program.

5/20/09 - Modified macros make_stats, set_model_parm_predict, compile_ci_set_storage, and compile_ci_add_data: changes were made to the estimation of model error in the case of prediction output with the macro variable if_adjust set to yes. The revised code applies model error only to the component of delivered flux in a reach that is not monitored at an upstream monitoring station.
5/20/09 - Modified the macros compile_ci_set_storage and compile_ci_add_data: revised the method for producing the boot_detail output. The revised method now includes estimates of upstream monitored flux in the case that if_adjust is set to yes and the requested boot_detail output includes cumulative flux estimates. The revised code also simplifies and speeds up the creation of the boot_detail file. The output in the boot_detail file now contains adjustments for mean exponentiated error as opposed to randomly assigned exponentiated error. This necessitates a change in the intermediate processing of the boot_detail file prior to importing the data into the sparrow_custom_predict.sas program. Additionally, the sparrow_custom_predict.sas program is modified to account for these changes. See the discussion in the document Detailed_bootstrap_output for additional details.

5/22/09 - Modified the specification of the boot_detail option. The revision allows the boot_detail control variables to be specified in the control file, rather than in the sparrow_compile.sas program. The boot_detail_reaches and boot_detail_predvars control variables can be specified anywhere in the control file, and doing so will override any specification in the sparrow_compile.sas program, although it is still possible to specify these control variables in sparrow_compile.sas and not include a specification in the control file. The specification of these control variables in the control file only works if you are running SAS version 9 or higher.

7/14/09 - Corrected various errors in the estimation of the standard error and prediction intervals.

5/14/10 - Fixed bug in compile_ci_set_storage. Modified the line:

    call valset(varnames[ivar],pred_values_unadj[,sel_pred[ivar]]) ;

to read:

    call valset(varnames[ivar],pred_values_unadj[&select,sel_pred[ivar]]) ;

This fixes an error when compiling boot_detail output for a few selected reaches.
Previously all reaches were being reported in boot_detail during the "set storage" phase of compilation. Now, only the selected reaches are reported.

04/03/2017 - Revised the out_tab module that creates tab-delimited output files. The method no longer depends on path delimiters in setting the file references in proc export. All out_tab function calls were modified to conform with the specification.

04/05/2017 - Revised put_comment code to disable the display of comments in a separate window within SAS. This functionality was not supported in UE and has marginal value in regular SAS.

07/24/2018 - Added a feature to the compile_predict macro that sets the new if_pred_intervals control variable to yes if the variable is not set in the control file. Also added a conditional statement that will implement the %compile_ci macro only if if_pred_intervals is yes. The labels for the prediction interval variables in the prediction output, set in the summarize_predict macro, are only defined if if_pred_intervals is yes.

sparrow_setdata.sas

1/4/10 - Added the macro ds_var_list, which returns the values of a specified variable in a specified SAS data set. This macro is called in init_beta to set the initial values of the coefficients to a temp_beta value if the numerical optimization algorithm returns an rc = -8, meaning that the estimates did not converge due to an insufficient number of iterations. The modified code allows SPARROW to continue with estimation if iter = 0, jter > 0 and rc = -8, using the most updated estimates of the coefficients. The extended estimation can continue up to the value of n_extra_jter.
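The continuation behavior in the 1/4/10 entry follows a generic warm-restart pattern, sketched below in Python. This is not the init_beta code. The toy optimizer, the function names, and the way the counter jter is used are assumptions made for illustration; only the meaning of rc = -8 (iteration limit reached before convergence) and the cap n_extra_jter come from the entry above.

```python
def toy_optimizer(beta, max_iter=5, tol=1e-6):
    """Hypothetical optimizer: moves beta halfway toward 3.0 per iteration."""
    for _ in range(max_iter):
        step = 0.5 * (3.0 - beta)
        beta += step
        if abs(step) < tol:
            return beta, 0   # converged
    return beta, -8          # iteration limit reached without convergence


def estimate_with_restarts(optimize, beta_init, n_extra_jter):
    """Rerun the optimizer from its latest estimates while it returns rc = -8."""
    beta, rc = optimize(beta_init)
    jter = 0
    while rc == -8 and jter < n_extra_jter:
        jter += 1
        # Warm restart: the latest estimates become the new initial values.
        beta, rc = optimize(beta)
    return beta, rc, jter


if __name__ == "__main__":
    beta, rc, jter = estimate_with_restarts(toy_optimizer, 0.0, n_extra_jter=10)
    print(beta, rc, jter)
```

Restarting from the latest estimates rather than the original starting values means the iterations already spent are not wasted, while the cap keeps a non-converging model from running indefinitely.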
05/26/15 - Added macro check_network and code in the setdata macro that creates the files fnodes and tnodes, used to evaluate the topological consistency of the reach network. The check verifies that the sum of the diversion fractions across all reaches having the same from node is 1, and that the maximum of the hydro sequence variable among all reaches with the same from node is less than the minimum of the hydro sequence variable among all reaches with to node equal to the same node value. Failure to pass these checks causes the program to terminate with error messages.

04/24/16 - Corrected check_network code to work with general file.

sparrow_graphs.sas

04/26/16 - Modified graph_resids by inserting a quit after every proc gplot. This corrects an error when running SPARROW in SAS 9.4 in which the same resids vs. predicted yield plot is produced for all three resids plots.

04/03/2017 - Graph_resids macro has been extensively revised to use ODS rather than SAS/Graph for generating output. Also, the plotting of a residual map has been added. A warning is shown if the residuals map cannot be shown because either the lat/lon variables are absent or the basemap file cannot be found or wasn't specified.

04/04/2017 - Added path= and body= specifications to the ods html statement.

04/04/2017 - Moved ODS initialization commands to main to ensure all output appears in the results window and in the results directory.

04/07/2017 - Updated the mappoints macro so that it no longer requires annotation to plot the point locations on the map. This change makes the residuals map compatible with SAS 9.3.

06/07/2017 - Modified the graph_resids macro: ls_weight in the means procedure operating on the plotdat data set was incorrectly specified as a variable and not a macro variable. Changed ls_weight to &ls_weight.

Version 9

sparrow_main.sas

10/25/07 - Modified macro clear_data to delete new data set model_parm_predict.
This supports the corrections to the bootstrapping of confidence intervals (see modifications to sparrow_compile).

10/25/07 - Modified main macro: added the data set mean_delivery_vars to the list of data sets to be backed up. This facilitates the inclusion of mean_delivery_vars as a permanent SPARROW output data set.

sparrow_predict.sas

09/12/07 - Modified the predict macro: placed macro variable conditioning statements so that the matrix exclude list is set to 0 if &retrans_exclude_list is null. The checks placed in the function locin were not sufficient to handle the case where &retrans_exclude_list is null.

sparrow_compile.sas

10/25/07 - Made a number of corrections to the confidence interval bootstrapping algorithm. The corrections make the program fully compatible with the documentation. Created the macro set_model_parm_predict, which creates the data set model_parm_predict used to evaluate the numerator of the CI. According to the documentation, the numerator does not include mean_exp_weighted_error. The inclusion of this new data set allows the evaluation of the model component of the parametric prediction (that is, without the mean_exp_weighted_error adjustment) to be done once for all subsequent evaluations of the bootstrap confidence intervals (which are done separately by prediction variable). Modified the macro compile_ci_get_bounds to use the model_parm_predict data set rather than dir_rslt.predict as the source for the parametric predictions. Modified the macros compile_ci_set_storage and compile_ci_add_data by placing a negative sign before the randomly selected bootstrap model error term. This makes the bootstrap simulation of the necessary distribution consistent with the documentation. The change has little effect if the distribution of errors is approximately symmetric.
sparrow_setdata.sas

10/25/07 - Modified the macro mean_adjust_delivery_vars to save the mean_delivery_vars data set to the permanent results directory and to use this data set except in cases where a new estimation is requested. This modification was required because if if_estimate is no and model prediction includes only a subset of the reaches used in model estimation, then computing mean delivery variables from the subset of reaches makes the means inconsistent with the means that were used to estimate the model coefficients, resulting in biased predictions.

Preprocesses

Updated the image processing programs: aggregate_data_from_images.sas and allocate_data_to_ws_by_image.sas.

Version 8

sparrow_main.sas

2/1/07 - Modified the main macro by adding the file boot_detail to the list of SAS files to be backed up at the beginning of a SPARROW prediction run. See notes for this date in sparrow_compile.sas for a description of the contents of this file. The creation of this file is presently undocumented, and will only be activated if code is modified in the sparrow_compile.sas file.

6/6/07 - Modified the main macro to include the file error_report in the backup function.

sparrow_calibrate.sas

6/6/07 - Expanded the error message given when the file error_report is created.

sparrow_compile.sas

2/1/07 - Modified macros compile_ci_set_storage and compile_ci_add_data to optionally output bootstrap iteration estimates for selected prediction variables and a subset of reaches, based on the undocumented control variables boot_detail_reaches and boot_detail_predvars specified at the top of the sparrow_compile.sas file. If detailed bootstrap output is requested (see comments above the boot_detail_reaches and boot_detail_predvars control variable specifications), SPARROW creates the file boot_detail in the results directory containing detailed bootstrap iteration output for selected reaches.
These files can be used to create other output prediction variables, including large basin incremental delivered flux. Also added code at the end of the summarize_predict macro to reconfigure the boot_detail file so it has columns for individual prediction variables.

Version 7

sparrow_main.sas

12/13/06 - Added statement in main macro to execute the macro make_nested_area (see setdata). This adds the variable nested station area to the station_data file.

12/13/06 - Added statement in main macro to execute the macro check_station_vars (see setdata). This macro checks whether the station variables file has any missing values for critical station variables, such as the staid or ls_weight variables. The macro also notes, but does not error flag, if a monitored reach has no lat/lon, which causes the flux to be dropped from the map of station residuals.

sparrow_calibrate.sas

11/20/06 - Modified

    call symput('flag_zero_load',trim(left(1))) ;

to read:

    call symput('flag_zero_load',trim(left(char(1)))) ;

This fixes an error that occurred due to 1 being numeric.

sparrow_setdata.sas

12/13/06 - Added the new macro make_nested_area, which computes the nested area for each monitoring station and merges this variable in with the station_data file.

12/13/06 - Added the new macro check_station_vars. This macro checks whether the station variables file has any missing values for critical station variables, such as the staid or ls_weight variables. The macro also notes, but does not error flag, if a monitored reach has no lat/lon, which causes the flux to be dropped from the map of station residuals.
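The two levels of reporting in check_station_vars (errors for missing critical variables, notes for missing coordinates) can be sketched as follows. This Python sketch is illustrative only and is not the SAS macro: the record layout, the function signature, and the field names lat and lon are assumptions, and the list of critical variables is limited to the two examples named above.

```python
def check_station_vars(stations, critical_vars=("staid", "ls_weight")):
    """Return (errors, notes) for a list of station records (dicts)."""
    errors, notes = [], []
    for index, station in enumerate(stations):
        # A missing critical variable is an error.
        for var in critical_vars:
            if station.get(var) is None:
                errors.append(f"record {index}: missing {var}")
        # Missing coordinates are noted only; the station is simply
        # left off the map of station residuals.
        if station.get("lat") is None or station.get("lon") is None:
            notes.append(f"record {index}: no lat/lon, omitted from residuals map")
    return errors, notes


if __name__ == "__main__":
    # Hypothetical station records.
    sample = [
        {"staid": "A1", "ls_weight": 1.0, "lat": 40.1, "lon": -75.2},
        {"staid": "B2", "ls_weight": None, "lat": 41.0, "lon": -76.0},
        {"staid": "C3", "ls_weight": 1.0, "lat": None, "lon": None},
    ]
    errors, notes = check_station_vars(sample)
    print(errors)
    print(notes)
```

Separating errors from notes mirrors the distinction in the entry: a missing staid or ls_weight is flagged, whereas a missing lat/lon only removes the station from the residuals map.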