Water Resources Research Act Program

Details for Project ID 2008OH64B

Competitive Learning to Develop a Biomarker Forecasting Tool for Classifying Recreational Water Quality

Institute: Ohio
Year Established: 2008
Start Date: 2008-07-01
End Date: 2009-06-30
Total Federal Funds: $28,248
Total Non-Federal Funds: $56,496

Principal Investigators: Dominic Boccelli

Abstract: Recreational users of urban-influenced surface waters, such as those in the Cincinnati region, can be exposed to unhealthy levels of microbial contamination caused by increased runoff from greater impervious surface area and by direct discharge of combined sewer and/or storm water into surface waters. Unfortunately, laboratory testing requires at least 24 hours for analysis and reporting, which eliminates the possibility of alerting the public to potentially unsafe conditions in a timely fashion. The objective of this study is to develop a classification tool that accurately identifies microbial outbreaks using readily available data, giving engineers, managers, regulators, and public health officials an opportunity to inform the populace about the public health status of recreational waters in near real time. Learning Vector Quantization (LVQ) is proposed as the basis of a water quality classification tool for use in a regional Recreation Management Program. Unlike most data-driven tools, which first predict a concentration and then use it to classify the water quality, LVQ is a statistical classification approach intended to classify the water quality directly from readily available hydrologic and meteorologic data. Eliminating the need to predict microbial concentrations also eliminates the uncertainties associated with those predictions. The LVQ algorithm will be developed using water quality, meteorologic, and hydrologic data associated with the Ohio River and three of its tributaries located in the Cincinnati region. Water quality measurements were performed at eighteen different locations in the system over one and a half recreational seasons (May through October) and will be paired with available hydrologic and meteorologic data.
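As an illustration of how direct classification with LVQ can work, the following is a minimal sketch of the basic variant commonly called LVQ1. The abstract does not state which LVQ variant the project uses, so this is an assumption rather than the project's implementation. The two scaled features (recent rainfall and preceding dry days), the class labels "acceptable" and "impaired", the starting prototypes, and all data values are hypothetical and chosen only to make the example run.

```python
import random

def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(prototypes[i][0], x)),
    )

def train_lvq1(samples, labels, prototypes, lr=0.1, epochs=30, seed=0):
    """LVQ1: move the winning prototype toward a sample of its own class
    and away from a sample of a different class."""
    rng = random.Random(seed)
    protos = [(list(w), c) for w, c in prototypes]  # copy so inputs stay unchanged
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            w, c = protos[nearest(protos, x)]
            sign = 1.0 if c == y else -1.0
            for d in range(len(w)):
                w[d] += sign * rate * (x[d] - w[d])
    return protos

def classify(protos, x):
    """Assign x the class of its nearest prototype; no concentration is predicted."""
    return protos[nearest(protos, x)][1]

# Hypothetical, made-up training data: (scaled rainfall, scaled dry days).
samples = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.7), (0.8, 0.1), (0.9, 0.2), (0.7, 0.15)]
labels = ["acceptable"] * 3 + ["impaired"] * 3

# Hypothetical starting prototypes, one per class.
initial = [((0.3, 0.6), "acceptable"), ((0.6, 0.3), "impaired")]

protos = train_lvq1(samples, labels, initial)
wet_day = classify(protos, (0.85, 0.1))
dry_day = classify(protos, (0.1, 0.85))
print(wet_day, dry_day)
```

The classifier returns a water quality class directly from the explanatory variables, which is the property the abstract contrasts with tools that predict a concentration first.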
The explanatory variables to be considered include precipitation event characteristics (duration, intensity, and total volume), number of preceding dry weather days, and total previous rainfall. Initial data exploration will be performed to identify the pertinent explanatory variables. The classification performance of the LVQ algorithm will be compared to that of more typical approaches (multivariate linear regression and neural networks) using the same data set. Since water quality classification is the most important aspect of these algorithms, the evaluation will focus on comparing correct classifications, with particular emphasis on the true-positive and false-negative rates. The best-performing algorithm will be developed into a tool capable of near-real-time use in predicting water quality associated with the local riverine system.
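The comparison criteria named above can be computed from paired observed and predicted classes. The sketch below is one way to do so, assuming a two-class problem in which "impaired" is the positive class; the function name, the labels, and the sample values are hypothetical and are not results from the study.

```python
def classification_rates(actual, predicted, positive="impaired"):
    """Return accuracy, true-positive rate, and false-negative rate."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1  # unsafe water correctly flagged
            else:
                fn += 1  # unsafe water missed
        elif p == positive:
            fp += 1      # safe water flagged as unsafe
        else:
            tn += 1      # safe water correctly passed
    total = tp + fn + fp + tn
    positives = tp + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        "true_positive_rate": tp / positives if positives else float("nan"),
        "false_negative_rate": fn / positives if positives else float("nan"),
    }

# Made-up observed and predicted classes for ten sampling days.
actual = ["impaired"] * 4 + ["acceptable"] * 6
predicted = ["impaired", "impaired", "impaired", "acceptable",
             "acceptable", "acceptable", "acceptable", "acceptable",
             "impaired", "acceptable"]

rates = classification_rates(actual, predicted)
print(rates)
```

A false negative here is a day on which the water was unsafe but the tool reported it as acceptable, which is why that rate is reported alongside overall accuracy when candidate algorithms are compared.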