Evaluation of the Causal Relationship between Variables Using a Probabilistic Approach for Water Quality Management

Jianxun He


In aquatic environments, a complex interplay exists among physical, chemical, and biological water quality parameters, which are further influenced by exogenous factors such as hydrological, meteorological, and geological conditions. Understanding the spatial and temporal variations of water quality, and the relationships among the variables of interest, is therefore a challenging task. Given the large data matrices involved, one category of methods frequently employed in the literature is multivariate analysis, such as cluster analysis, principal component analysis, and factor analysis. These techniques offer a straightforward and intuitive way to identify qualitative associations among variables. However, a quantitative evaluation from a probabilistic perspective is preferable because it defines a measurable causality among variables, so that more efficient water management strategies can be formulated. This paper illustrates a new way to discover the relationship between two variables by estimating their joint distribution, which fully describes their statistical dependence. A multivariate Gaussian mixture model was employed to describe the data. The model parameters were estimated using an expectation-maximization algorithm developed to handle multiple variables and censored data. The joint distribution of the two variables of interest and the corresponding conditional distributions were used to describe the complete statistical distribution of water quality parameters, which are subject to the effects of hydro-meteorological conditions. The method was demonstrated using data from the Bow River in Alberta. The results shed light on how one variable affects the distribution of the other in a complex environment, expressed in a probabilistic context.
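As a rough illustration of how a conditional distribution follows from a fitted joint model, the sketch below conditions a two-variable Gaussian mixture on an observed value of the first variable. It is a minimal sketch only: the function name `conditional_mixture` and all parameter values are hypothetical, the mixture parameters are assumed to be already estimated, and the censored-data expectation-maximization step described in the abstract is not reproduced here.

```python
import math

def conditional_mixture(x, weights, means, covs):
    """Condition a bivariate Gaussian mixture p(x, y) on an observed x.

    weights: list of K mixing proportions
    means:   list of K (mean_x, mean_y) pairs
    covs:    list of K 2x2 covariance matrices as nested tuples/lists
    Returns the weights, means, and variances of the 1-D mixture p(y | x).
    """
    new_w, cond_means, cond_vars = [], [], []
    for w, (mx, my), ((sxx, sxy), (_, syy)) in zip(weights, means, covs):
        # Marginal density of x under this component
        px = math.exp(-0.5 * (x - mx) ** 2 / sxx) / math.sqrt(2 * math.pi * sxx)
        new_w.append(w * px)
        # Standard Gaussian conditioning formulas for y given x
        cond_means.append(my + sxy / sxx * (x - mx))
        cond_vars.append(syy - sxy ** 2 / sxx)
    total = sum(new_w)
    new_w = [v / total for v in new_w]  # renormalize the component weights
    return new_w, cond_means, cond_vars

# Made-up parameters for two components (not taken from the Bow River data)
weights = [0.6, 0.4]
means = [(0.0, 1.0), (3.0, 4.0)]
covs = [((1.0, 0.5), (0.5, 1.0)), ((1.0, -0.3), (-0.3, 2.0))]

w, m, v = conditional_mixture(1.0, weights, means, covs)
# Expected value of y given x = 1.0 under the conditional mixture
expected_y = sum(wi * mi for wi, mi in zip(w, m))
```

The conditional weights shift toward whichever component better explains the observed value, which is how a mixture model lets the distribution of one variable change shape, not just location, as the other variable varies.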
