Towards Accurate Loss Prediction In Kahramaa Water Stations
Abstract
Qatar General Electricity and Water Corporation (KAHRAMAA, KM) deployed
a water Supervisory Control and Data Acquisition (SCADA) system and began operating it
in 2006. The aims of this SCADA system are to improve the control and pumping of water
to customers and to reduce water loss in the network. SCADA collects sensor data such
as reservoirs’ inlet and outlet flows, reservoirs’ levels, and reservoirs’ inlet water stock, as well
as the status of valves and of the pumps in use. These time-stamped data are periodically transmitted
to several back-end servers for logging, storing, and processing. The KM water network is
composed of 35 connected stations, each of which includes 3 to 12 reservoirs. Currently, KM
lacks the ability to accurately forecast water loss in the network, except by assuming
that historical losses will recur unchanged in the future, which leads to inaccurate predictions.
Over the years, there has been increasing interest in water loss prediction.
Different techniques are used to analyze and forecast water loss. These techniques fall
into three categories: statistical, machine learning, and hybrid
modeling approaches. The statistical approach depends on fitting mathematical models to the
observed data. However, such models suffer from high noise error, which prevents water
leaks from being accurately detected and forecast. In machine learning, water loss is
predicted by training various models such as Support Vector Machines, Artificial Neural
Networks, and Random Forests. The hybrid approach combines two or more techniques from
the previously mentioned approaches.
This thesis studies methods to accurately predict water loss in KM water stations.
We adopt a knowledge discovery and data mining process whose activities include data
collection, data preprocessing, feature engineering, model training, and validation. This
is the first automated attempt at KM to predict the volumes of water that will be lost.
Moreover, several contributions are made to advance prediction accuracy including those
related to data preprocessing (data aggregation, cleaning, and transformation), feature
engineering (feature generation, data windowing), and model training where several
models are optimized for high accuracy using statistically reliable evaluation (cross-validation).
Experimental results show that the highest water loss prediction accuracies for
the next hour, the 12th hour, and the 24th hour are 84.78%, 73.01%, and 71.66%, respectively.
These results are obtained with different settings and parameter tuning, optimized for
each case. Moreover, all of the above results surpass the baseline models in accuracy by
14.78%, 45.32%, and 11.50%, respectively.
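To make the data windowing and cross-validation steps named above more concrete, the following is a minimal sketch and not the thesis implementation. It assumes an hourly series of loss values, a fixed look-back window, and a forecast horizon in hours (1, 12, or 24 in the abstract). The abstract does not state which cross-validation scheme was used, so the forward-chaining split shown here, in which each test block follows its training data in time, is an assumption. The names `make_windows` and `forward_chaining_splits`, and the sample values, are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], window: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a time-ordered series into (features, target) pairs.

    Each feature row holds `window` consecutive readings; the target is the
    reading `horizon` steps after the last reading in that row.
    """
    features: List[List[float]] = []
    targets: List[float] = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        end = start + window  # index one past the last feature reading
        features.append(list(series[start:end]))
        targets.append(series[end + horizon - 1])
    return features, targets


def forward_chaining_splits(
    n_samples: int, n_folds: int
) -> List[Tuple[range, range]]:
    """Build (train, test) index ranges; every test block follows its training data."""
    fold_size = n_samples // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested number of folds")
    splits = []
    for k in range(1, n_folds + 1):
        train = range(0, k * fold_size)
        test = range(k * fold_size, (k + 1) * fold_size)
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Made-up hourly loss values, for illustration only.
    hourly_loss = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    X, y = make_windows(hourly_loss, window=3, horizon=1)
    for train_idx, test_idx in forward_chaining_splits(len(X), n_folds=2):
        print(list(train_idx), list(test_idx))
```

Changing `horizon` to 12 or 24 would produce targets for the 12th or 24th hour ahead; any model could then be fitted on the training indices of each split and scored on the matching test indices.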
DOI/handle
http://hdl.handle.net/10576/11363