The 2016–2018 NOAA Hazardous Weather Testbed (HWT) Spring Forecasting Experiments (SFE) featured the Community Leveraged Unified Ensemble (CLUE), a coordinated convection-allowing model (CAM) ensemble framework designed to provide empirical guidance for the development of operational CAM systems. The 2017 CLUE included 81 members that all used 3-km horizontal grid spacing over the CONUS, enabling direct comparison of forecasts generated using different dynamical cores, physics schemes, and initialization procedures. This study uses forecasts from several 2017 CLUE members and one operational model to evaluate and compare CAM representation and next-day prediction of thunderstorms. The analysis combines existing techniques with novel object-based techniques that distill important information about modeled and observed storms from many cases. The National Severe Storms Laboratory Multi-Radar/Multi-Sensor product suite is used to verify model forecasts and climatologies of observed variables. Unobserved model fields are also examined to further illuminate important inter-model differences in storms and near-storm environments. No single model performed better than the others in all respects. However, there were many systematic inter-model and inter-core differences in specific forecast metrics and model fields, and some of these differences can be confidently attributed to particular differences in model design. Model intercomparison studies such as this one are important for better understanding the impacts of model and ensemble configurations on storm forecasts and for helping to optimize future operational CAM systems.