Building thermal load prediction informs the optimization of cooling plants and thermal energy storage. Physics-based models of building thermal load are constrained by model and input complexity. In this study, we developed 12 data-driven models (7 shallow learning, 2 deep learning, and 3 heuristic methods) to predict building thermal load, and compared shallow machine learning with deep learning. The predictions of the 12 models were compared with the measured cooling demand. We found that XGBoost (eXtreme Gradient Boosting) and LSTM (Long Short-Term Memory) provided the most accurate load prediction in the shallow and deep learning categories, respectively, and both outperformed the best baseline model, which uses the previous day’s data for prediction. We then examined how the prediction horizon and input uncertainty influence load prediction accuracy. The major conclusions are twofold. First, LSTM performs well in short-term prediction (1 h ahead) but not in long-term prediction (24 h ahead), because the sequential information becomes less relevant, and therefore less useful, when the prediction horizon is long. Second, weather forecast uncertainty degrades XGBoost’s accuracy and favors LSTM, because the sequential information makes the model more robust to input uncertainty. Training the model with uncertain rather than accurate weather data could enhance its robustness. Our findings have two implications for practice. First, LSTM is recommended for short-term load prediction, given that weather forecast uncertainty is unavoidable. Second, XGBoost is recommended for long-term prediction, and the model should be trained on data that include input uncertainty.
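To make two ideas from the abstract concrete, the following is a minimal illustrative sketch, not the study's implementation. It shows a baseline that predicts each hour's load from the previous day's value, and a helper that perturbs weather inputs so that a model can be trained on uncertain rather than accurate data. The function names, the hourly resolution (`steps_per_day=24`), and the zero-mean Gaussian error model are assumptions made for illustration only.

```python
import numpy as np


def previous_day_baseline(load, steps_per_day=24):
    """Predict each time step's load as the value measured one day earlier.

    Assumes an evenly sampled series (hourly by default). The first day has
    no earlier value to copy, so its predictions are NaN.
    """
    load = np.asarray(load, dtype=float)
    pred = np.full_like(load, np.nan)
    pred[steps_per_day:] = load[:-steps_per_day]
    return pred


def perturb_forecast(weather, noise_std, seed=0):
    """Add zero-mean Gaussian noise to a weather series.

    This is an assumed, simplified stand-in for weather forecast error;
    real forecast errors may be biased or autocorrelated.
    """
    rng = np.random.default_rng(seed)
    weather = np.asarray(weather, dtype=float)
    return weather + rng.normal(0.0, noise_std, size=weather.shape)


# Usage: two days of synthetic hourly data (values are arbitrary).
hourly_load = np.arange(48.0)
baseline_pred = previous_day_baseline(hourly_load)
noisy_temperature = perturb_forecast(np.full(48, 25.0), noise_std=1.5)
```

In this sketch, a model would be fitted on `noisy_temperature` rather than the clean series, so that the inputs seen during training resemble the uncertain forecasts available at prediction time.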