Flood Frequency Analysis of the Mersey River, Nova Scotia using Bivariate Extreme-Value Distribution
Author: Zak Zakeria, P.Eng., and Perry Mitchelmore, P.Eng., PMP, FEC, Published: Canadian Dam Association
Floods are random multivariate events that are adequately described by their peak flows, volumes, durations and shapes. Flood variables are generally mutually correlated: flood peak with volume, and flood volume with duration, whereas the correlation between flood peak and flood duration is insignificant. The severity of a flood event is attributable not only to its peak flow but also to its other characteristics, such as volume and time to peak. For some reservoirs, a flood event with a late peak may be more severe than a flood with an early peak. Extensive research has represented floods as univariate events by addressing only peak flows. Examples of such studies are Kite, 1978; Cunnane, 1985, 1987 & 1989; Bobee and Rasmussen, 1994; and Rao and Hamed, 2000.
Fewer studies have attempted to represent all the characteristics of flood events, including peak flow, volume, duration and shape. Bivariate and trivariate frequency distributions have been used by researchers to describe a complete flood event. Yue et al. (1999) applied the Gumbel mixed model to represent the joint and conditional probabilities of flood peaks and volumes, and of flood volumes and durations, in Quebec, Canada. Yue (2000) employed the bivariate lognormal distribution to describe the joint distributions of correlated flood peaks and volumes, and of correlated flood volumes and durations. Yue (2001) also demonstrated the applicability of the Gumbel logistic model to describe the joint probability distributions of flood peak and volume, and of flood volume and duration. Karmakar et al. (2008) considered both parametric and non-parametric distribution functions to determine the marginal distribution functions for peak flow, volume and duration. However, the above studies provide insufficient information to construct a flood hydrograph. Mediero et al. (2010) developed a methodology for generating a set of annual maxima synthetic hydrographs that preserve the marginal distributions of the peaks, volumes and durations.
The shape elements of a flood hydrograph include the rising limb, peak flow, inflection point and recession curve (Shaw, 1988). Several authors have addressed these shape elements. Chow (1968) considered the construction of a triangular hydrograph from a curvilinear unit hydrograph and expressed time to peak (Tp) as a function of flood duration (D) using Tp = D/2.67. Yue et al. (2002) described the shape of a flood hydrograph by a two-parameter beta probability density function using two variables, namely shape mean and shape variance. Aldama (2002) developed a parameterization technique to construct a flood hydrograph based on a family of odd-degree Hermitian polynomials in peak flow, volume and time to peak. Tardif et al. (2009) found that flood hydrograph shape statistics, including shape mean, shape variance, and rising and falling slopes, varied significantly among sites in the same category.
This paper applies the Gumbel mixed model to the Mersey River watershed in the province of Nova Scotia. The mixed model, a bivariate extreme value distribution with Gumbel marginals (Gumbel, 1960), is used to estimate the joint probability distributions and conditional probability functions of flood peaks and volumes, and of flood volumes and durations, together with the related return periods. The study also develops a linear model to represent the shape of a flood event based on its flood peak, volume and duration.
2 BIVARIATE DISTRIBUTION
The multivariate concept of extreme value distributions was introduced by Finkelstein (1953), Gumbel (1958) and Tiago de Oliveira (1958). Raynal-Villasenor et al. (1987) classified the multivariate extreme value distributions and described applications of the bivariate distributions. The Gumbel mixed model (Gumbel, 1960) is the bivariate extreme value distribution with Gumbel marginals that can be applied to estimate joint and conditional probabilities of correlated flood peaks and volumes, and of correlated flood volumes and durations. The Gumbel mixed model is generally expressed by the following joint cumulative distribution.

$$F(x,y) = F_X(x)\,F_Y(y)\,\exp\left\{-\theta\left[\frac{1}{\ln F_X(x)} + \frac{1}{\ln F_Y(y)}\right]^{-1}\right\}, \qquad 0 \le \theta \le 1 \tag{1}$$
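Where $\theta$ is the association parameter describing the relationship between the two random variables X and Y, and $F_X(x)$ and $F_Y(y)$ are the marginal distribution functions of X and Y respectively, described by the following equations (extreme value type I, EVI, distribution forms) with location parameters $u_X$, $u_Y$ and scale parameters $\alpha_X$, $\alpha_Y$:

$$F_X(x) = \exp\left[-\exp\left(-\frac{x-u_X}{\alpha_X}\right)\right], \qquad F_Y(y) = \exp\left[-\exp\left(-\frac{y-u_Y}{\alpha_Y}\right)\right] \tag{2}$$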
For the independent case, $\theta = 0$, the bivariate distribution reduces to the product of its marginal distribution functions and is given by

$$F(x,y) = F_X(x)\,F_Y(y) \tag{3}$$
Oliveira (1975, 1982) described the following analytical expression for estimating the association parameter ($\theta$).

$$\theta = 2\left[1 - \cos\left(\pi\sqrt{\frac{\rho}{6}}\right)\right] \tag{4}$$
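Where $\rho$ is the population product-moment correlation coefficient, which restricts the validity of the Gumbel mixed model to the range $0 \le \rho \le 2/3$ and is described by

$$\rho = \frac{E\left[(X-\mu_X)(Y-\mu_Y)\right]}{\sigma_X\,\sigma_Y} \tag{5}$$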
Where $\mu_X$, $\mu_Y$ and $\sigma_X$, $\sigma_Y$ are the means and standard deviations of the random variables X and Y respectively. When $\rho = 0$, the association parameter ($\theta$) reaches its lower limit of 0, which is the independent case, and when $\rho = 2/3$, $\theta$ becomes equal to 1, indicating a dependent case. The model is therefore invalid when $\rho > 2/3$. The expression for the joint probability density function is given in Yue et al. (1999).
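The conditional probability distribution function of X given $Y \le y$ (the form used by Yue et al., 1999) is given by

$$F_{X|Y}(x \mid y) = \frac{F(x,y)}{F_Y(y)} = F_X(x)\,\exp\left\{-\theta\left[\frac{1}{\ln F_X(x)} + \frac{1}{\ln F_Y(y)}\right]^{-1}\right\} \tag{6}$$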
The expression for the conditional probability distribution function of Y given X is analogous to the one above. The joint and conditional return periods for various quantiles of the variables X and Y are the reciprocals of the corresponding exceedance probabilities, that is, $T = 1/(1 - F)$ for a non-exceedance probability $F$.
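The relations in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the Gumbel mixed model form of Yue et al. (1999), conditions on $Y \le y$, and takes the association parameter as an input. All function names are hypothetical.

```python
import math


def gumbel_cdf(x, u, alpha):
    """EVI (Gumbel) marginal CDF with location u and scale alpha (alpha > 0)."""
    return math.exp(-math.exp(-(x - u) / alpha))


def joint_cdf(fx, fy, theta):
    """Gumbel mixed model joint CDF from marginal probabilities fx, fy.

    fx and fy must lie strictly between 0 and 1; theta must lie in [0, 1].
    """
    if not (0.0 < fx < 1.0 and 0.0 < fy < 1.0):
        raise ValueError("marginal probabilities must lie in (0, 1)")
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    # exp{-theta * [1/ln(fx) + 1/ln(fy)]^(-1)}; the bracket is negative,
    # so this factor is >= 1 and raises the joint probability above fx * fy.
    dependence = math.exp(-theta / (1.0 / math.log(fx) + 1.0 / math.log(fy)))
    return fx * fy * dependence


def conditional_cdf(fx, fy, theta):
    """CDF of X given Y <= y (assumed conditioning), i.e. F(x, y) / F_Y(y)."""
    return joint_cdf(fx, fy, theta) / fy


def return_period(non_exceedance):
    """Return period in years for an annual non-exceedance probability."""
    return 1.0 / (1.0 - non_exceedance)


# Usage with made-up marginal parameters (not the Mersey River values).
fq = gumbel_cdf(120.0, u=80.0, alpha=25.0)      # hypothetical peak marginal
fv = gumbel_cdf(900.0, u=600.0, alpha=200.0)    # hypothetical volume marginal
t_joint = return_period(joint_cdf(fq, fv, theta=0.9))
```

Passing marginal probabilities rather than raw quantiles keeps the dependence structure separate from the choice of marginal parameters, so the same functions serve both the peak-volume and the volume-duration pairs.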
3 HYDROGRAPH TIME TO PEAK
Like the other variables of a flood hydrograph, the shape is also a random variable. A simplified representation of the flood hydrograph shape as a triangular hydrograph is considered in this study using the time to peak variable. Hydrograph time to peak is a critical variable of a flood hydrograph, particularly in reservoir routing studies. In unit hydrograph theory, the time to peak is equivalent to the sum of the basin lag and one-half the duration of excess rainfall (Chow, 1968). In general, the time to peak of a flood hydrograph has a positive correlation with both flood volume and duration, and a negative correlation with flood peak. Based on the observed data, a regression equation is developed here that relates hydrograph time to peak to flood peak, volume and duration with an acceptable accuracy.
This study explores the utility of the Gumbel mixed model at the Mersey River watershed in Nova Scotia for designing flood events in terms of their flood peaks, volumes and durations, and assesses the flood hydrograph shape as a triangular hydrograph using the time to peak parameter.
4.1 Study Watershed
The Mersey River watershed is located in the Queens region of the province of Nova Scotia, Canada. The drainage area of the river at the gauging station below Mill Falls (WSC station ID: 01ED007) is 295 km². A record of daily discharges is available at the station for 39 years (1969 to 2007). The modified z-score method indicates one potential outlier in the observed data, reducing the available record length to 38 years. In mainland Nova Scotia, floods usually occur in the winter/spring period, with the heaviest floods in both peak and volume occurring when rainstorms combine with snowmelt. During the 39 years of flow monitoring, approximately 97% of the annual maximum flood events were observed during the winter/spring period. Annual maximum instantaneous flows and annual maximum daily flows were observed on almost the same dates, with instantaneous flows on average 2% higher than the annual maximum daily flows, a characteristic of snowmelt regions.
4.2 Characteristics of Flood Events
Flood events are adequately characterized by their peaks, volumes, durations and shapes. The duration of a flood depends on the rising limb and the falling (recession) limb of the surface runoff portion of the flood hydrograph and constitutes the total time marked by a rise in stage and discharge from base flow and a return to base flow. The rising limb of a flood hydrograph is generally steeper than the recession limb, which is also the case for the study watershed. Flood volume (V) and duration (D) series are constructed from the start date (SD) and end date (ED) of each event. The series are described as below for the ith year and jth day (Yue et al., 1999).

$$V_i = \sum_{j=SD_i}^{ED_i} Q_{ij} - \frac{(1 + D_i)\,(Q_{is} + Q_{ie})}{2}$$

$$D_i = ED_i - SD_i$$

Where $Q_{ij}$ is the recorded daily flow on the jth day of the ith year, and $Q_{is}$ and $Q_{ie}$ are the recorded daily flows on the start and end dates of the flood event for the ith year respectively.
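The construction of the volume and duration series can be sketched as follows. The sketch assumes the definitions of Yue et al. (1999), in which the event volume is the sum of the daily flows between the start and end dates minus a base flow taken as the average of the start and end flows over the same days. The function name and the flow values are hypothetical.

```python
def flood_volume_and_duration(daily_flows, start, end):
    """Return (volume, duration) for one flood event.

    daily_flows : sequence of daily mean flows (e.g. m3/s), one value per day
    start, end  : indices of the start date (SD) and end date (ED), inclusive

    The volume is returned in flow-days (e.g. (m3/s)*day); multiply by
    86400 to convert to m3 when the flows are in m3/s.
    """
    if not 0 <= start < end < len(daily_flows):
        raise ValueError("need 0 <= start < end < len(daily_flows)")
    duration = end - start                          # D = ED - SD, in days
    total = sum(daily_flows[start:end + 1])         # sum of Q over the event
    # Base flow: mean of start and end flows applied to all (1 + D) days.
    base = (1 + duration) * (daily_flows[start] + daily_flows[end]) / 2.0
    return total - base, duration


# Made-up five-day event, for illustration only.
flows = [10.0, 30.0, 50.0, 20.0, 10.0]
volume, duration = flood_volume_and_duration(flows, start=0, end=4)
```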
4.3 Gumbel Marginals of Flood Peak, Volume and Duration
The empirical probability, P(x), for each flood variable is estimated using the Gringorten formula, an unbiased plotting position formula for estimating EVI probabilities (Gringorten, 1963; Cunnane, 1978; Guo, 1990). For the observed flood data sorted in ascending order, the formula for the non-exceedance probability is given as below.

$$P(x) = \frac{r - 0.44}{N + 0.12}$$
Where r is the rank of variable x and N is the total number of observed values. To test the goodness of fit of the EVI distribution, the Anderson-Darling (AD) test (Stephens, 1974) is performed to determine whether each of the three variable series came from a population with the Gumbel (EVI) frequency distribution. The AD test statistics for the flood peak, volume and duration are 0.215, 0.512 and 0.490 respectively. The critical AD test statistic is 0.757 for a sample size of 38 at the 95% confidence level. Since all three statistics fall below this value, the three characteristics of flood events can be represented by the EVI distribution. Figure 1 illustrates the observed distribution from the Gringorten formula and the EVI distribution fitted to each of the flood variables. The location parameter $u$ and scale parameter $\alpha$ of the EVI distribution are defined by

$$\alpha = \frac{\sqrt{6}}{\pi}\,\sigma, \qquad u = \mu - 0.5772\,\alpha$$
Where $\mu$ and $\sigma$ are the mean and standard deviation of the observed data sample. Table 1 presents the mean, standard deviation and distribution parameters for each series of the flood variables.
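The plotting positions and the moment-based EVI parameters can be sketched as below. The sketch assumes the standard Gringorten formula $(r - 0.44)/(N + 0.12)$ and method-of-moments estimates $\alpha = \sqrt{6}\,\sigma/\pi$ and $u = \mu - 0.5772\,\alpha$. Whether the paper used the sample (N - 1) or population (N) standard deviation is not stated, so the sample form used here is an assumption. Function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gringorten_positions(n):
    """Non-exceedance probabilities for ranks 1..n of data sorted ascending."""
    return [(rank - 0.44) / (n + 0.12) for rank in range(1, n + 1)]


def gumbel_moment_parameters(sample):
    """Method-of-moments location u and scale alpha of the EVI distribution."""
    mean = statistics.mean(sample)
    std = statistics.stdev(sample)      # sample standard deviation (assumed)
    alpha = math.sqrt(6.0) * std / math.pi
    u = mean - EULER_GAMMA * alpha
    return u, alpha


# For the 38-year record, the plotting positions depend only on N.
positions = gringorten_positions(38)
```

With N = 38 the smallest and largest observations plot at roughly 0.015 and 0.985, so the largest observed event is assigned an empirical return period of about 68 years.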
Figure 1: Distribution of (a) flood peaks (b) flood volumes and (c) flood durations
The Gumbel mixed model requires the correlation between any two variables of a flood event to lie within $0 \le \rho \le 0.667$. For the Mersey River station, the computed product-moment correlation coefficients between flood peak and volume, flood volume and duration, and flood peak and duration are 0.653, 0.578 and -0.001 respectively, indicating correlation between flood peak and volume, and between flood volume and duration, but no correlation between flood peak and duration. High association parameter values between flood peak and volume ($\theta$ = 0.98), and between flood volume and duration ($\theta$ = 0.88), suggest that the bivariate model can be applied to analyze the joint probability distributions of flood peaks and volumes, and of flood volumes and durations.
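As a quick arithmetic check of these values, the sketch below assumes that the association parameter is obtained from the correlation coefficient through $\theta = 2[1 - \cos(\pi\sqrt{\rho/6})]$, the Oliveira expression used with the Gumbel mixed model in Yue et al. (1999). The function name is hypothetical.

```python
import math


def theta_from_rho(rho):
    """Association parameter of the Gumbel mixed model for 0 <= rho <= 2/3."""
    if not 0.0 <= rho <= 2.0 / 3.0:
        raise ValueError("Gumbel mixed model requires 0 <= rho <= 2/3")
    return 2.0 * (1.0 - math.cos(math.pi * math.sqrt(rho / 6.0)))


# Correlation coefficients reported for the Mersey River station.
theta_peak_volume = theta_from_rho(0.653)       # rounds to 0.98
theta_volume_duration = theta_from_rho(0.578)   # rounds to 0.88
```

The peak-duration correlation of -0.001 lies outside the valid range, so the function raises an error for that pair, which is consistent with the model not being applied to it.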
Table 1: Statistical and EVI parameters of the observed flood events
4.4 Joint and Conditional Probabilities of Flood Peaks and Volumes
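Validity of the Gumbel mixed model is assessed by calculating and comparing observed and theoretical joint probabilities. Theoretical joint probabilities are computed using equation (1) for the flood peak and volume series, with the data sorted by peak in ascending order, while observed joint probabilities are calculated using the Gringorten plotting position formula expressed by the following equation (Yue et al., 1999) for the bivariate distribution.

$$F(x_i, y_j) = P(X \le x_i,\, Y \le y_j) = \frac{\sum_{m=1}^{i}\sum_{l=1}^{j} n_{ml} - 0.44}{N + 0.12}$$

Where $n_{ml}$ is the number of occurrences of the combination of $x_m$ and $y_l$, and N is the total number of observations.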
The observed and theoretical joint probabilities for flood peaks and volumes presented in Figure 2 are comparable, which suggests that the bivariate model is appropriate for estimating joint probabilities of flood peaks and volumes. The non-exceedance joint probabilities (joint cumulative distribution functions) and the corresponding joint return periods of flood peaks and volumes are illustrated in Figure 3.
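The observed joint probabilities can be sketched as follows, assuming the bivariate form of the Gringorten formula in Yue et al. (1999): for each event, count the events whose peak and volume are both less than or equal to its own, subtract 0.44 and divide by N + 0.12. The function name and the sample values are hypothetical.

```python
def empirical_joint_probabilities(x, y):
    """Empirical P(X <= x_i, Y <= y_i) for each paired observation."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    probabilities = []
    for xi, yi in zip(x, y):
        # Number of events not exceeding (xi, yi) in both variables;
        # the event itself is always counted.
        count = sum(1 for xm, ym in zip(x, y) if xm <= xi and ym <= yi)
        probabilities.append((count - 0.44) / (n + 0.12))
    return probabilities


# Made-up peaks and volumes, for illustration only.
peaks = [50.0, 80.0, 65.0, 120.0]
volumes = [300.0, 700.0, 250.0, 900.0]
observed = empirical_joint_probabilities(peaks, volumes)
```

These values would be compared point by point with the theoretical joint probabilities from the fitted model, as in Figure 2.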
Figure 2: Observed and theoretical joint probabilities of flood peaks and volumes
The conditional probabilities of flood peaks given volumes and flood volumes given peaks are estimated by employing equation (6) and the corresponding conditional return periods will be equal to the inverse of the computed probabilities. Figure 4 presents the conditional return periods of flood peaks given volumes as well as conditional return periods of flood volumes given peaks.
4.5 Joint and Conditional Probabilities of Flood Volumes and Durations
Validity of the mixed model for analyzing flood volumes and durations is assessed by calculating and comparing the observed and theoretical joint probabilities. Theoretical and observed joint probabilities for flood volumes and durations are computed in the same manner as for the flood peaks and volumes. The observed and theoretical joint probabilities for flood volumes and durations presented in Figure 5 are in close agreement, suggesting that the bivariate model is appropriate for estimating joint probabilities of flood volumes and durations. The non-exceedance joint probabilities and the corresponding joint return periods of flood volumes and durations are illustrated in Figure 6.
The conditional probabilities and corresponding return periods of flood volume given duration and of flood duration given volume are estimated in the same manner as described earlier. Figure 7 presents the conditional return periods of flood volumes given durations as well as the conditional return periods of flood durations given volumes.
The Gumbel mixed model application to the Mersey River indicates that the bivariate distribution model is valid and can be useful to predict different combinations of flood peak, volume and duration for a design return period.
Figure 3: (a) Joint cumulative distribution functions and (b) Joint return periods of flood peaks and volumes
4.6 Hydrograph Time to Peak
The bivariate frequency analysis provides estimates of flood peak, volume and duration for a design quantile but does not allow a flood hydrograph to be constructed or its peak to be located in time within the storm duration. Therefore, a simplified representation of the flood hydrograph shape as a triangular hydrograph is considered using the time to peak variable. Based on the observed data, a regression equation is developed that relates hydrograph time to peak (Tp) to flood peak (Q), volume (V) and duration (D) and is given by the following expression.
The relation as illustrated in Figure 8 has an acceptable accuracy measure with a coefficient of determination of 0.7 and can be useful to estimate the time to peak based on a given set of flood peak, volume and duration for a design event.
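A regression of this kind can be sketched with ordinary least squares. The additive linear form Tp = b0 + b1·Q + b2·V + b3·D used here is an assumption made for illustration, as are the function names and the synthetic data. The sketch does not reproduce the paper's fitted coefficients.

```python
def fit_linear_model(rows, targets):
    """Least-squares coefficients [b0, b1, ...] for targets ~ b0 + sum(b_k * x_k).

    Solves the normal equations (X^T X) b = X^T y by Gauss-Jordan
    elimination with partial pivoting.
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    aug = []
    for i in range(k):
        normal_row = [sum(r[i] * r[j] for r in design) for j in range(k)]
        normal_row.append(sum(r[i] * t for r, t in zip(design, targets)))
        aug.append(normal_row)
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


def r_squared(rows, targets, coeffs):
    """Coefficient of determination of the fitted model."""
    preds = [coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))
             for row in rows]
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, preds))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Synthetic (Q, V, D) rows and noise-free Tp values, for illustration only.
events = [(100, 500, 10), (150, 900, 14), (80, 300, 8),
          (200, 1000, 12), (120, 700, 15), (90, 650, 11)]
true_coeffs = (2.0, -0.01, 0.002, 0.3)
tp = [true_coeffs[0] + sum(c * v for c, v in zip(true_coeffs[1:], e))
      for e in events]
fitted = fit_linear_model(events, tp)
```

On real data the same two functions would return the fitted coefficients and a coefficient of determination comparable to the 0.7 reported for the Mersey River record, provided the paper's regression has this additive form.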
Figure 4: (a) Conditional return period of flood volume given peak (TV|Q) and (b) conditional return period of flood peak given volume (TQ|V).
Figure 5: Observed and theoretical joint probabilities of flood volumes and durations
Figure 6: (a) Joint cumulative probability distribution functions P(V
Figure 7: (a) Conditional return period of flood volume given duration (TV|D) and (b) conditional return period of flood duration given volume (TD|V)
This study presents an application of the Gumbel mixed model, a bivariate extreme value distribution, to the Mersey River watershed in the province of Nova Scotia, Canada, and adopts a simple approach for estimating the flood hydrograph shape. Results indicate that the EVI distribution adequately represents the marginal distributions of the observed flood variables. Theoretical and empirical joint probabilities for flood peaks and volumes, and also for flood volumes and durations, are in close agreement, so the bivariate model is valid for estimating joint probabilities and corresponding return periods of flood peaks and volumes, and of flood volumes and durations. The model can also be used to obtain conditional probabilities and corresponding return periods.
In single-variable flood frequency analysis, flood volumes and shapes are generally estimated by arbitrary means. The bivariate extreme value analysis also provides insufficient information to construct a flood hydrograph. A simplified representation of the flood hydrograph shape as a triangular hydrograph is proposed here using the time to peak parameter. For a given set of flood peak, volume and duration of a design quantile, the time to peak is computed using a least-squares regression equation based on the observed flood data.
It is concluded that the model allows estimation of various possible combinations of flood peaks, volumes and durations for given annual exceedance probabilities or return periods, and may prove useful in the analysis of risks associated with water reservoirs in the region.
Mr. Phil Helwig, Senior Hydrotechnical Engineer and Dr. Onita Basu of Carleton University are acknowledged for their invaluable comments and suggestions in improvement of the paper.
Bobee, B. and Rasmussen, P.F., 1994. "Statistical analysis of annual flood series", In: Trends in Hydrology, ed. J. Menon, Council of Scientific Research Integration, India, 117-135.
Chow, V.T., 1968. "Handbook of applied hydrology - a compendium of water resources technology".
Cunnane, C., 1978. "Unbiased plotting positions: A review", Journal of Hydrology, 37 (3/4), 205-222.
Cunnane, C., 1985. "Factors affecting choice of distribution for flood series", Hydrological Sciences Journal, 30 (1,3), 25-36.
Cunnane, C., 1987. "Review of statistical models for flood frequency estimation", In: V.P. Singh, ed. Hydrologic frequency modeling. Dordrecht, the Netherlands: 49-95.
Cunnane, C., 1989. "Statistical distributions for flood frequency analysis", World Meteorological Organization Operational Hydrology, Report #33, WMO #718, Geneva, Switzerland.
Finkelstein, B.V., 1953. "On the limiting distributions of the extreme terms of a variational series of a two dimensional random quantity", Doklady Akademii SSSR 91, 209-211.
Guo, S.L., 1990. "A discussion on unbiased plotting positions for the general EV distribution", Journal of Hydrology, 212, 33-44.
Gringorten, I.I., 1963. "A plotting rule for extreme probability paper", Journal of Geophysical Research 68 (3), 813-814.
Gumbel, E.J., 1958. "Statistics of Extremes", Columbia University Press, New York.
Gumbel, E.J., 1960. "Multivariate extremal distributions", Bulletin of the International Statistical Institute 39 (2), 471-475.
Kite, G.W., 1978. "Frequency and risk analysis in hydrology", Fort Collins, CO: Water Resources Publications.
Rao, A.R. and Hamed, K.H., 2000. "Flood frequency analysis", Boca Raton, FL: CRC.
Raynal-Villasenor, J.A. and Salas, J.D., 1987. "Multivariate extreme value distributions in hydrological analyses", Water for the Future: Hydrology in Perspective (Proceedings of the Rome Symposium, 1987). IAHS Publ. No. 164.
Shaw, E.M., 1991. "Hydrology in Practice", Second edition, Chapman & Hall.
Stephens, M.A., 1974. "EDF statistics for goodness of fit and some comparisons", Journal of the American Statistical Association, Vol. 69, pp. 730-737.
Oliveira, J.T.D., 1958. "Extremal distributions", Rev. Fac. Cienc. Lisboa Ser. 2 A, Mat., VII, 215-227.
Oliveira, J.T.D., 1982. "Bivariate extremes: models and statistical decision", Technical Report #14, Center for Stochastic Processes, Department of Statistics, University of North Carolina, Chapel Hill, NC, USA.
Tardif, S., St-Hilaire, A., Roy, R., Bernier, M. and Payette, S., 2009. "Statistical properties of hydrographs in minerotrophic fens and small lakes in mid-latitude Quebec, Canada", Canadian Water Resources Journal, Vol. 34 (4), 365-380.
Yue, S., Ouarda, T.B.M.J., Bobee, B., Legendre, P. and Bruneau, P., 1999. "The Gumbel mixed model for flood frequency analysis", Journal of Hydrology 226, 88-100.
Yue, S., Ouarda, T.B.M.J., Bobee, B., Legendre, P. and Bruneau, P., 2002. "Approach for describing statistical properties of flood hydrograph", Journal of Hydrologic Engineering, Vol. 7, Issue 2, pp. 147-153.
Yue, S., 2000. "The bivariate lognormal distribution to model a multivariate flood episode", Hydrological Processes, Vol. 14, pp. 2575-2588.
Yue, S., 2001. "A bivariate extreme value distribution applied to frequency analysis", Nordic Hydrology, 32 (1), pp. 49-64.