The statistical analyses provided by this tool were developed by the Database Team to provide screening-level summaries of BMP performance based on water quality concentrations. Analyses based on loads and/or volumes are not included in this tool, but they are important considerations when evaluating the performance of BMP types that provide significant volume reduction.

To minimize errors and maximize computational efficiency, the tool uses a set of statistical methods that can be implemented efficiently for a broad range of influent and effluent data sets with varying size and complexity. In addition, the data sets have been initially screened by the Database Team for appropriateness for this type of analysis. The statistical summaries provided here may, in some cases, differ from BMP performance summaries resulting from more detailed analyses conducted by the Database Team.

The statistical summaries provided in this tool are intended to provide a general summary of BMP performance for the set of BMPs selected by the user. While reasonable effort has been made to generate representative summaries of BMP performance, use of these analysis results is solely at the risk and option of the user.

Summary of Statistical Methods Used in Analysis

The statistical methods used in developing these summaries are described below. A subset of the results of these methods has been verified with other statistical packages such as SciPy[1] and R[2].

Basic Statistics

The median and interquartile range are presented to provide a non-parametric description of the central tendency and spread of the data set. An advantage of non-parametric statistics is that they do not require assumptions about the distribution of the underlying data.

The mean and standard deviation are also presented. Simple substitution of one-half of the detection limit values has been used for non‐detects[3]. The percent of non‐detects in a given data set provides some insight into the potential bias introduced by this substitution. The percentages of influent and effluent non-detect results should be reviewed before drawing conclusions regarding the validity of the statistics. This is particularly the case for parameter groups such as dissolved metals where non-detect results are most prevalent.
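
As a rough illustration of the substitution described above, the following Python sketch computes the summary statistics for a single data set. The function name `summarize_concentrations`, the example values, the use of the sample (n - 1) standard deviation, and the standard library's "inclusive" quartile method are all assumptions made for illustration. The tool's own quartile convention is not specified here.

```python
import statistics


def summarize_concentrations(values, is_nondetect):
    """Summarize one data set (hypothetical helper, not the tool's code).

    values       -- reported concentrations; for a non-detect, the reported
                    value is assumed to be the detection limit
    is_nondetect -- booleans, True where the result is a non-detect
    """
    # Simple substitution: replace each non-detect with half its detection limit.
    substituted = [v / 2 if nd else v for v, nd in zip(values, is_nondetect)]

    # Quartiles via the standard library (the "inclusive" method is an assumption).
    q1, median, q3 = statistics.quantiles(substituted, n=4, method="inclusive")

    return {
        "n": len(substituted),
        "pct_nondetect": 100 * sum(is_nondetect) / len(substituted),
        "median": median,
        "iqr": q3 - q1,
        "mean": statistics.mean(substituted),
        "std": statistics.stdev(substituted),  # sample standard deviation
    }


# Made-up example: five results, one of which is a non-detect at a limit of 2.0.
example = summarize_concentrations(
    [10.0, 4.0, 2.0, 8.0, 6.0],
    [False, False, True, False, False],
)
```

A high `pct_nondetect` value is the signal, noted above, that the mean and standard deviation may be biased by the substitution.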

Hypothesis Testing

Results from the Mann-Whitney and Wilcoxon tests provide information about the statistical significance of the difference between the influent and effluent distributions. The Mann-Whitney test applies to independent data sets, whereas the Wilcoxon test applies to paired values. These tests are evaluated at the 0.05 and 0.10 significance levels, with the null hypothesis that "the influent and effluent data are sampled from the same distribution." Welch's t-test (unequal variance) provides comparable information on the statistical significance of the difference between the influent and effluent mean concentrations. The null hypothesis may be rejected for p-values less than the indicated significance level.
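
The three tests can be sketched with SciPy as shown below. This is a minimal example under stated assumptions, not the tool's implementation: the function name `run_hypothesis_tests`, the two-sided alternatives, and the example concentrations are all hypothetical.

```python
from scipy import stats


def run_hypothesis_tests(influent, effluent, paired_influent, paired_effluent):
    """Return p-values and rejection flags at the 0.05 and 0.10 levels."""
    p_values = {
        # Independent samples: every influent and effluent result is used.
        "mann_whitney": stats.mannwhitneyu(
            influent, effluent, alternative="two-sided"
        ).pvalue,
        # Paired samples: only events with both an influent and effluent result.
        "wilcoxon": stats.wilcoxon(paired_influent, paired_effluent).pvalue,
        # Welch's t-test: compares means without assuming equal variances.
        "welch_t": stats.ttest_ind(influent, effluent, equal_var=False).pvalue,
    }
    return {
        name: {
            "p_value": float(p),
            "reject_at_0.05": bool(p < 0.05),
            "reject_at_0.10": bool(p < 0.10),
        }
        for name, p in p_values.items()
    }


# Made-up data: eight paired events plus two influent-only results.
paired_in = [120.0, 95.0, 140.0, 80.0, 110.0, 130.0, 100.0, 150.0]
paired_out = [60.0, 50.0, 70.0, 45.0, 52.0, 66.0, 58.0, 72.0]
all_in = paired_in + [105.0, 125.0]
all_out = paired_out

results = run_hypothesis_tests(all_in, all_out, paired_in, paired_out)
```

In this made-up data set every influent value exceeds every effluent value, so all three tests reject the null hypothesis at both significance levels.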

In some cases, the Mann-Whitney and Wilcoxon hypothesis test results produce conflicting conclusions regarding statistically significant differences. Such cases are more likely to occur where there are imbalances in the number of influent and effluent samples for a particular data set because the Mann-Whitney test operates on the entire data set whereas the Wilcoxon test only operates on data pairs. For BMPs with long residence times and/or permanent pools (e.g., wet ponds), the paired storm event hypothesis test results relying on the Wilcoxon test may be less representative than the Mann-Whitney test because of variations in sampling program designs for collection of influent and effluent samples that may not enable event-based pairing of monitoring data. For example, influent for a storm event on a particular date may mix with water from a previous event that has been stored since the previous storm. Thus, in cases where the Mann-Whitney and Wilcoxon test results conflict for BMPs with permanent pools, the Mann-Whitney results may provide a better indicator of pollutant removal performance.
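
The difference in the data each test sees can be made concrete with a short sketch. The helper below is hypothetical (the name `split_for_tests` and the event keys are assumptions) and only shows how unpaired results feed the Mann-Whitney test but drop out of the Wilcoxon test.

```python
def split_for_tests(influent_by_event, effluent_by_event):
    """Split event-keyed results into the inputs used by each test.

    Both arguments map an event identifier (here, a date string) to a
    concentration. Events sampled at only one station remain in the
    'all' lists but are excluded from the paired lists.
    """
    paired_events = sorted(set(influent_by_event) & set(effluent_by_event))
    return {
        # Mann-Whitney input: the entire data set.
        "all_influent": list(influent_by_event.values()),
        "all_effluent": list(effluent_by_event.values()),
        # Wilcoxon input: only events sampled at both stations.
        "paired_influent": [influent_by_event[e] for e in paired_events],
        "paired_effluent": [effluent_by_event[e] for e in paired_events],
    }


# Made-up monitoring record with an imbalance in sample counts.
influent_record = {"2019-03-02": 120.0, "2019-04-11": 95.0, "2019-05-20": 140.0}
effluent_record = {"2019-04-11": 50.0}

inputs = split_for_tests(influent_record, effluent_record)
```

Here the Mann-Whitney test would compare three influent results against one effluent result, while the Wilcoxon test would see only the single paired event.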

Box‐and‐Whisker Plot

Box plots (or box and whisker plots) provide a schematic representation of the central tendency and spread of the influent and effluent data sets. For each set of analysis results, the influent box plots are provided on the left and the effluent box plots are provided on the right. A key to the box plots is provided below.

Lognormal Quantile Plots

Quantile plots illustrate the empirical distribution of the data. A comparison of the influent and effluent quantile plots shows differences among all quantiles (not just the median) and whether the influent and effluent data sets are similarly distributed. Although the influent and effluent concentrations in a quantile plot are not paired values, the relative position and slope of the two populations can indicate the effectiveness of the BMP. The linearity of each series on these plots also provides an indication of whether or not it is well fit by a lognormal distribution.

The plots presented in these analysis results are developed in a manner modeled after the open‐source R statistical package. The quantiles are computed using functions based on Wichura (1988)[4].
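
The coordinates of a lognormal quantile plot can be sketched as follows. This is an illustrative example only: the function name `lognormal_quantile_points` is hypothetical, the plotting positions follow the convention of R's `ppoints` function, and whether the tool uses the same positions is an assumption. The standard normal quantiles come from Python's standard library rather than from the tool's own routines.

```python
import math
from statistics import NormalDist


def lognormal_quantile_points(concentrations):
    """Return (normal quantile, log concentration) pairs for plotting.

    Concentrations must be positive. A series that is close to lognormal
    plots as an approximately straight line.
    """
    ordered = sorted(concentrations)
    n = len(ordered)
    # Plotting-position offset as in R's ppoints (assumed, not confirmed):
    # a = 3/8 for n <= 10, otherwise a = 1/2.
    a = 3 / 8 if n <= 10 else 1 / 2
    standard_normal = NormalDist()

    points = []
    for rank, value in enumerate(ordered, start=1):
        probability = (rank - a) / (n + 1 - 2 * a)
        z = standard_normal.inv_cdf(probability)  # standard normal quantile
        points.append((z, math.log(value)))
    return points


# Made-up example with three concentrations.
points = lognormal_quantile_points([100.0, 1.0, 10.0])
```

Plotting the influent and effluent point sets on the same axes gives the comparison described above.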

Paired Scatter Plots

Influent vs. effluent scatterplots depict paired data to provide an indication of how effluent concentrations may be related to the influent concentrations. Data points below the 45-degree line indicate removals, whereas data points above the 45-degree line indicate increases. A diamond symbol is used if both the influent and effluent are non-detect. If only the influent or effluent is non-detect, then a triangle symbol pointing downward or to the left, respectively, is used. Because these plots require sample concentrations for both influent and effluent, performance may be under-represented for facilities that discharge infrequently, such as bioretention facilities or other infiltrating BMPs (i.e., facilities that lack effluent samples to pair with influent samples).
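
Reading the scatter plot against the 45-degree line amounts to comparing each pair directly. The short sketch below tallies that comparison, assuming influent is plotted on the horizontal axis. The function name `classify_pairs` and the example pairs are hypothetical.

```python
def classify_pairs(pairs):
    """Count paired results relative to the 45-degree (1:1) line.

    pairs -- iterable of (influent, effluent) concentrations
    """
    counts = {"removal": 0, "increase": 0, "no_change": 0}
    for influent, effluent in pairs:
        if effluent < influent:
            counts["removal"] += 1    # point falls below the 1:1 line
        elif effluent > influent:
            counts["increase"] += 1   # point falls above the 1:1 line
        else:
            counts["no_change"] += 1  # point falls on the 1:1 line
    return counts


# Made-up paired results for four storm events.
pair_counts = classify_pairs([(120.0, 60.0), (95.0, 50.0), (40.0, 55.0), (30.0, 30.0)])
```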

Time Series Plot

The time series plot presented in the statistical summaries is a simple scatter plot showing the influent and effluent concentrations collected on a given date. In most cases, paired data are available that have been collected from the same BMP for the same storm event; however, this is not the case for all studies.

Screening Criteria for Statistical Analysis Data Sets

Not all water quality records included in the BMP database are included in the data sets used by this statistical tool. An initial screening has been conducted by the Database Team to identify water quality data that are reasonably appropriate for analysis. Records that pass this initial screening are identified by:

These records are identified in the starter query 'bWQ BMP FlatFile BMP Indiv Anal' included in the BMP Database available for download.

Some additional screening criteria have been applied to the resulting data set so that the data used in the statistical analysis tool are internally consistent and reasonably appropriate for comparison of influent and effluent concentrations. These screening criteria are described below.


[1] SciPy is a Python-based ecosystem of open-source software for mathematics, science, and engineering. More information can be found at

[2] R is a free software environment for statistical computing and graphics. More information can be found at

[3] Note that other, more specific analyses conducted by the Database Team have used more advanced approaches for dealing with non-detects, which may lead to different results. A simpler method was selected for this analysis to provide a more general tool for use with a variety of data sets.

[4] Wichura, M. J. (1988). Algorithm AS 241: The Percentage Points of the Normal Distribution. Applied Statistics, 37, 477–484.