Code release for the paper "GPTCast: a weather language model for precipitation nowcasting"
```bibtex
@Article{gmd-18-5351-2025,
AUTHOR = {Franch, G. and Tomasi, E. and Wanjari, R. and Poli, V. and Cardinali, C. and Alberoni, P. P. and Cristoforetti, M.},
TITLE = {GPTCast: a weather language model for precipitation nowcasting},
JOURNAL = {Geoscientific Model Development},
VOLUME = {18},
YEAR = {2025},
NUMBER = {16},
PAGES = {5351--5371},
URL = {https://gmd.copernicus.org/articles/18/5351/2025/},
DOI = {10.5194/gmd-18-5351-2025}
}
```
- paper: https://gmd.copernicus.org/articles/18/5351/2025/
- data: https://doi.org/10.5281/zenodo.13692016
- models: https://doi.org/10.5281/zenodo.13594332
Install the dependencies:

```bash
# install Python 3.12 on Ubuntu
bash install_python_ubuntu.sh
# create the environment with Poetry
bash create_environment.sh
# activate the environment
source .venv/bin/activate
```

Check the notebooks in the `notebooks` folder for examples of how to use the pretrained models.
- See `notebooks/example_gptcast_forecast.ipynb` for running the models on a test batch and generating a forecast.
- See `notebooks/example_autoencoder_reconstruction.ipynb` for a test of the VAE reconstruction.
To train the model on the original dataset, first download the dataset with the script in the `data` folder:

```bash
# download the dataset
python data/download_data.py
```

Train the first stage (the VAE) with one of the following configurations from the folder `configs/experiment/`:
- vaeganvq_mae - Mean Absolute Error loss
- vaeganvq_mwae - Magnitude Weighted Absolute Error loss
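The difference between the two losses is the weighting of reconstruction errors. As a rough, illustrative sketch (the exact weighting used by GPTCast is defined in the paper and the configs, not here), a magnitude-weighted absolute error upweights errors on heavier precipitation:

```python
import numpy as np

def mae(pred, target):
    """Plain mean absolute error."""
    return np.mean(np.abs(pred - target))

def mwae(pred, target, eps=1.0):
    """Illustrative magnitude-weighted absolute error: pixels with
    higher precipitation contribute more to the loss. The weight
    (target + eps) is a hypothetical choice for this sketch only."""
    weights = target + eps
    return np.sum(weights * np.abs(pred - target)) / np.sum(weights)

# toy example: one light-rain pixel and one heavy-rain pixel
target = np.array([1.0, 10.0])
pred = np.array([0.5, 8.0])
print(mae(pred, target))   # unweighted mean of the two errors
print(mwae(pred, target))  # pulled toward the heavy-rain error
```

With this weighting, underestimating an intense cell costs more than the same absolute error on drizzle, which is the motivation for the MWAE configuration.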
```bash
# train a VAE with the MWAE reconstruction loss on GPU
# results (including model checkpoints) are saved in the folder `logs/train/`
python gptcast/train.py trainer=gpu experiment=vaeganvq_mwae.yaml
```

After training the VAE, train the GPTCast model with one of the following configurations from the folder `configs/experiment/`:
- gptcast_8x8 - 8x8 token spatial context (128x128 pixels)
- gptcast_16x16 - 16x16 token spatial context (256x256 pixels)
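Both configurations imply the same VAE spatial compression: 128 px / 8 tokens = 256 px / 16 tokens = 16 pixels per token. A quick sanity check of that arithmetic:

```python
# each VAE token covers a 16x16-pixel patch (128/8 = 256/16 = 16)
DOWNSAMPLE = 16

def pixel_context(tokens_per_side):
    """Side length in pixels of a square token-grid context."""
    return tokens_per_side * DOWNSAMPLE

print(pixel_context(8))   # 128 (gptcast_8x8)
print(pixel_context(16))  # 256 (gptcast_16x16)
```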
```bash
# train GPTCast with a 16x16 token spatial context on GPU
# results (including model checkpoints) are saved in the folder `logs/train/`
# the path to the trained VAE checkpoint must be provided
python gptcast/train.py trainer=gpu experiment=gptcast_16x16.yaml model.first_stage.ckpt_path=<path_to_vae_checkpoint>
```
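At inference time, GPTCast produces a forecast by autoregressively predicting future radar tokens, which the VAE decoder then maps back to precipitation fields (see the forecast notebook for the real API). A minimal toy sketch of that autoregressive loop, with a random stand-in for the transformer and a hypothetical 16-entry codebook:

```python
import numpy as np

rng = np.random.default_rng(0)
CODEBOOK_SIZE = 16  # hypothetical; the real codebook is defined by the VAE

def toy_next_token_logits(context):
    """Stand-in for the transformer: random logits over the codebook.
    The real model conditions on the past radar tokens in `context`."""
    return rng.normal(size=CODEBOOK_SIZE)

def generate(context, n_new):
    """Greedy autoregressive generation: append one token at a time,
    each predicted from everything generated so far."""
    tokens = list(context)
    for _ in range(n_new):
        logits = toy_next_token_logits(tokens)
        tokens.append(int(np.argmax(logits)))
    return tokens

seq = generate([3, 7, 1], 5)  # 3 context tokens + 5 forecast tokens
print(len(seq))  # 8
```

In the real pipeline, the generated token grid is passed through the VAE decoder to obtain the forecast radar frames.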