Using Machine Learning Models to Predict Wet Weather Flow

Katya Bilyk - Hazen and Sawyer

Last Modified Jul 01, 2022

Raleigh Water and Hazen are working together to develop a machine learning tool to predict influent flow to the 75 mgd Neuse River Resource Recovery Facility (NRRRF) in Raleigh, North Carolina. This tool will help the City decide when to divert flow to equalization to minimize peak flows through the plant. Such foresight can help nearly any facility plan for weather events and optimize treatment, because minimizing the peak flow through the facility results in better treatment.

This data-driven decision-making tool aims to predict: peak hour flow, storm volume, hydrograph shape, and the optimal time to use equalization. A web-based Power BI dashboard serves as the operator interface.
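To make the predicted quantities concrete, the sketch below derives peak hour flow and storm volume from an hourly hydrograph. It is an illustration only, not the tool's actual code: the function name `summarize_hydrograph`, the `baseline_mgd` parameter, the choice to count only flow above a baseline as storm volume, and the sample values are all assumptions.

```python
def summarize_hydrograph(hourly_flow_mgd, baseline_mgd=0.0):
    """Return (peak_flow_mgd, peak_hour_index, volume_mg) for hourly flows.

    Hypothetical helper: storm volume is taken here as the flow above
    `baseline_mgd`, which is an assumption, not the project's definition.
    """
    if not hourly_flow_mgd:
        raise ValueError("need at least one hourly value")
    peak_flow = max(hourly_flow_mgd)
    peak_hour = hourly_flow_mgd.index(peak_flow)  # first hour at the peak
    # Each value is a rate in million gallons per day, so one hour at
    # q mgd conveys q / 24 million gallons.
    volume_mg = sum(max(q - baseline_mgd, 0.0) for q in hourly_flow_mgd) / 24.0
    return peak_flow, peak_hour, volume_mg


# Made-up six-hour forecast in mgd, with an assumed dry-weather baseline.
forecast = [48.0, 60.0, 96.0, 72.0, 48.0, 48.0]
peak, hour, volume = summarize_hydrograph(forecast, baseline_mgd=48.0)
print(f"peak {peak} mgd at hour {hour}, storm volume {volume} MG")
```

The division by 24 is the only unit conversion involved: hourly values are rates in mgd, while volume is reported in MG.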

The model is a Python-based neural network trained on five years of hourly flow data to predict influent flow as a function of the following explanatory variables: the past 12 hours of influent flow, streamflow data, and rainfall data. The model will use hourly rainfall forecasts in its predictions. Models were developed to predict flow at each time step from 1 to 72 hours into the future.
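One way to arrange such inputs for training is sketched below. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `build_samples`, the use of a single current streamflow reading, and the summing of forecast rainfall over the lead time are all hypothetical choices, since the abstract does not say how streamflow and rainfall enter the model.

```python
def build_samples(flow, stream, rain, n_lags=12, horizon=1):
    """Pair each feature row with the influent flow `horizon` hours later.

    All three inputs are equal-length hourly series. Feature layout
    (an assumption): n_lags past flows, the current streamflow reading,
    and total rainfall over the next `horizon` hours.
    """
    if not (len(flow) == len(stream) == len(rain)):
        raise ValueError("series must have the same length")
    X, y = [], []
    # t indexes the most recent observation available at forecast time.
    for t in range(n_lags - 1, len(flow) - horizon):
        past_flow = list(flow[t - n_lags + 1 : t + 1])
        rain_ahead = rain[t + 1 : t + horizon + 1]  # stands in for the forecast
        X.append(past_flow + [stream[t], sum(rain_ahead)])
        y.append(flow[t + horizon])
    return X, y


# Synthetic 20-hour series, used only to show the shapes involved.
flow = [float(i) for i in range(20)]
stream = [100.0 + i for i in range(20)]
rain = [1.0] * 20
X, y = build_samples(flow, stream, rain, n_lags=12, horizon=3)
print(len(X), len(X[0]), y[0])
```

Calling the function once per lead time, from 1 to 72 hours, would yield one training set per forecast horizon, which any regression model could then be fit to.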

This presentation will share important lessons learned and comment on the feasibility, which appears promising, of developing and implementing machine learning models in the water industry. For example, the dashboard was deployed in test mode on December 8, 2019. The first significant storm event was predicted very accurately: the actual peak flow was 102 mgd, and the peak flows predicted by the model 6, 12, 24, 48, and 72 hours in advance were 104, 102, 94, 90, and 91 mgd, respectively. Predicted storm volume was within +/- 3 MG of the actual volume.

A second significant storm event occurred in February 2020 and conveyed the largest peak flow the NRRRF has ever recorded (184 mgd). This peak was significantly underpredicted, prompting model refinement and further exploration. The increased flow is thought to result from a major collection system improvement that was not represented in the data used to train the model, which raises a potential issue that deployed models need to address. As a point of comparison, the mechanistic collection system model was consulted, and it also underpredicted the magnitude of the event. Model refinements to match the current behavior of the collection system are promising; they include emphasizing the most recent storm flow events and developing an approach to keep the model adaptable to ongoing collection system changes. The model will be deployed in its final form in June 2020.
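The accuracy reported for the first storm can be restated as an error at each lead time. The short sketch below uses only the peak flows quoted above (102 mgd actual; 104, 102, 94, 90, and 91 mgd predicted); the function name `peak_errors` is a hypothetical helper introduced for illustration.

```python
actual_peak_mgd = 102
# Lead time in hours -> predicted peak flow in mgd, as quoted in the text.
predicted_peak_mgd = {6: 104, 12: 102, 24: 94, 48: 90, 72: 91}


def peak_errors(actual, predicted_by_lead):
    """Return {lead_hours: (error_mgd, error_percent)}; negative = underprediction."""
    return {
        lead: (pred - actual, 100.0 * (pred - actual) / actual)
        for lead, pred in predicted_by_lead.items()
    }


errors = peak_errors(actual_peak_mgd, predicted_peak_mgd)
for lead, (err_mgd, err_pct) in sorted(errors.items()):
    print(f"{lead:>2} h ahead: {err_mgd:+d} mgd ({err_pct:+.1f}%)")
```

On these figures the forecasts 6 and 12 hours ahead fall within 2 mgd of the observed peak, while the forecasts 24 to 72 hours ahead underpredict it by 8 to 12 mgd.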