High-resolution (3 km) time-lagged (initialized every 3 h) multimodel ensembles were produced in support of the Hydrometeorological Testbed (HMT)-West-2006 campaign in northern California, covering the American River basin (ARB). Multiple mesoscale models were used, including the Weather Research and Forecasting (WRF) model, the Regional Atmospheric Modeling System (RAMS), and the fifth-generation Pennsylvania State University–National Center for Atmospheric Research Mesoscale Model (MM5). Short-range (6 h) quantitative precipitation forecasts (QPFs) and probabilistic QPFs (PQPFs) were compared to the 4-km NCEP stage IV precipitation analyses for archived intensive operation periods (IOPs). The two sets of ensemble runs (operational and rerun forecasts) were examined to evaluate the quality of high-resolution QPFs produced by time-lagged multimodel ensembles and to investigate the impacts of ensemble configuration on forecast skill. Uncertainties in the precipitation forecasts were associated with the different models, model physics, and initial and boundary conditions. Diabatic initialization by the Local Analysis and Prediction System (LAPS) improved precipitation forecasts, while the choice of microphysics scheme proved critical in ensemble design. Probability biases in the ensemble products were addressed by calibrating the PQPFs. Using artificial neural network (ANN) and linear regression (LR) methods, bias correction of the PQPFs with a cross-validation procedure was applied to three operational IOPs and four rerun IOPs. Both the ANN and LR methods effectively improved the PQPFs, especially at lower thresholds. The LR method outperformed the ANN method in bias correction, particularly for smaller training data sizes. More training data (e.g., one season of forecasts) are desirable to test the robustness of both calibration methods.
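The LR-based calibration summarized above can be illustrated with a minimal sketch: raw exceedance probabilities are derived from the ensemble as the fraction of members exceeding a precipitation threshold, and a linear regression fitted on training pairs of raw probabilities and 0/1 observed exceedances maps them to calibrated probabilities. All function names, the threshold, and the toy data below are illustrative assumptions, not from the study.

```python
def raw_pqpf(members, threshold):
    """Raw exceedance probability: fraction of ensemble members whose
    6-h precipitation forecast exceeds the threshold (mm)."""
    return sum(1 for m in members if m > threshold) / len(members)


def fit_lr(raw_probs, obs):
    """Least-squares fit of 0/1 observed exceedances against raw
    ensemble probabilities: obs ~ a * raw + b."""
    n = len(raw_probs)
    mx = sum(raw_probs) / n
    my = sum(obs) / n
    sxx = sum((x - mx) ** 2 for x in raw_probs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(raw_probs, obs))
    a = sxy / sxx
    b = my - a * mx
    return a, b


def calibrate(p, a, b):
    """Apply the regression and clip to a valid probability in [0, 1]."""
    return min(1.0, max(0.0, a * p + b))


# Toy training sample: raw PQPFs paired with observed exceedances (0/1).
train_p = [0.1, 0.3, 0.5, 0.7, 0.9]
train_o = [0, 0, 1, 1, 1]
a, b = fit_lr(train_p, train_o)

# Calibrate a new forecast from a hypothetical 6-member time-lagged ensemble.
p_raw = raw_pqpf([2.1, 0.4, 5.0, 3.3, 0.9, 4.2], threshold=1.0)
p_cal = calibrate(p_raw, a, b)
```

In practice the regression would be trained and evaluated in a cross-validation loop over the IOPs, and the ANN calibration would replace `fit_lr` with a small network trained on the same raw-probability/observation pairs.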