In the current study, a technique is presented that offers a way to evaluate ensemble forecast uncertainties arising from initial conditions, from different model versions, or from both. The technique consists of first diagnosing the performance of the forecast ensemble and then optimizing the ensemble forecast using the results of that diagnosis. It is based on the explicit evaluation of the probabilities associated with a Gaussian stochastic representation of the weather analysis and forecast, and it combines an ensemble technique for evaluating the analysis error covariance with the standard Monte Carlo approach for drawing samples from a known Gaussian distribution. The technique was demonstrated in a tutorial manner on two relatively simple examples to illustrate the impact of ensemble characteristics, including ensemble size, various observation strategies, and configurations with different model versions and varying initial conditions. In addition, the authors assessed the improvement in consensus forecasts gained by optimally weighting the ensemble members according to time-varying, prior probabilistic skill measures. The results with different observation configurations indicate that, as observations become denser, larger ensembles and/or more accurate individual members are needed for the ensemble forecast to retain prediction skill. The main conclusions for ensembles built from different physics configurations were, first, that almost all members typically exhibited some skill at some point in the model run, suggesting that all should be retained to obtain the best consensus forecast; and, second, that the normalized probability metric can identify which sets of weights or physics configurations are performing best.
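The Monte Carlo step of the technique, drawing ensemble initial conditions from a Gaussian distribution defined by the analysis state and its error covariance, can be sketched as follows. This is a minimal illustration, not the study's implementation; the state vector, covariance values, and function names are assumptions made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-variable analysis state and analysis error covariance
# (illustrative numbers only, not values from the study).
x_analysis = np.array([288.0, 5.0, -2.0])       # e.g. T, u, v at one point
P_analysis = np.array([[1.0, 0.2, 0.0],
                       [0.2, 0.5, 0.1],
                       [0.0, 0.1, 0.5]])        # analysis error covariance

def sample_initial_conditions(x_a, P_a, n_members, rng):
    """Draw ensemble initial conditions x_a + L z with z ~ N(0, I),
    where L is the Cholesky factor of P_a, so samples ~ N(x_a, P_a)."""
    L = np.linalg.cholesky(P_a)
    z = rng.standard_normal((n_members, len(x_a)))
    return x_a + z @ L.T

members = sample_initial_conditions(x_analysis, P_analysis, 1000, rng)
print(members.shape)  # (1000, 3)
```

With a large ensemble, the sample mean and covariance of the drawn members converge to `x_analysis` and `P_analysis`, which is why ensemble size matters for how faithfully the forecast ensemble represents the analysis uncertainty.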
A comparison of forecasts from a simple ensemble mean with forecasts from a mean whose member weights varied according to prior performance under the probabilistic measure showed that the latter had a substantially lower mean absolute error. The study also indicates that a weighting scheme drawing on more prior forecast cycles reduced forecast error further.
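The skill-weighted consensus idea can be sketched as follows. Note the hedges: the inverse-MAE weighting below is a simple stand-in for the study's probabilistic skill measure, and the synthetic members, split into "prior cycles" and a verification period, are illustrative assumptions.

```python
import numpy as np

def skill_weights(prior_errors, eps=1e-6):
    """Weight each member by inverse mean absolute error over prior cycles,
    normalized to sum to one (a stand-in for a probabilistic skill measure)."""
    mae = np.abs(prior_errors).mean(axis=1) + eps   # one MAE per member
    w = 1.0 / mae
    return w / w.sum()

def consensus(forecasts, weights=None):
    """Simple ensemble mean, or a weighted mean if weights are given."""
    if weights is None:
        return forecasts.mean(axis=0)
    return weights @ forecasts

rng = np.random.default_rng(0)
truth = np.sin(np.linspace(0.0, 4.0, 50))
# Three hypothetical members: one accurate, two noisier and biased.
members = np.stack([truth + 0.05 * rng.standard_normal(50),
                    truth + 0.5 + 0.2 * rng.standard_normal(50),
                    truth - 0.4 + 0.3 * rng.standard_normal(50)])

prior_err = members[:, :25] - truth[:25]            # errors from "prior cycles"
w = skill_weights(prior_err)

mae_simple = np.abs(consensus(members[:, 25:]) - truth[25:]).mean()
mae_weighted = np.abs(consensus(members[:, 25:], w) - truth[25:]).mean()
print(mae_weighted < mae_simple)
```

Because the weights down-weight the biased, noisy members while keeping every member in the consensus, this toy setup reproduces both conclusions: retaining all members while letting the skill measure decide how much each one contributes.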