gsalit is a user-friendly Streamlit application that allows researchers to run the Illumina GSA (Global Screening Array) genotyping pipeline without manually handling command-line tools.
A second version, built with Python and Conda, is available via Bioconda.
The app automates:
- IDAT file management
- Reference genome selection (hg19 or hg38)
- Manifest and cluster file usage
- Running the IAAP-CLI pipeline
- Output collection and download
All required tools, references, and genome indices are pre-packaged in the Docker image, so users do not need to install anything else.
The full functional app is available at https://gsalit.serve.scilifelab.se/
- Docker installed on your machine
- Minimum 8 GB RAM (larger datasets may require more)
- Recommended: multi-core CPU for faster processing
A Docker image is also available at jd21/genlit. You can run the application using Docker with the following command:
docker run -p 8501:8501 jd21/genlit:0.2.0

Then, open your web browser and navigate to http://localhost:8501 to access the Streamlit GUI.
You can install gsalit using Conda. Make sure to include the jd2112 channel when creating the environment:
conda create -n gsalit -c jd2112 -c conda-forge -c bioconda
conda activate gsalit
gsa-gui

This will create a new Conda environment named gsalit, activate it, and launch the Streamlit GUI.
Note: On first launch, the conda package downloads the reference genomes (hg19 and hg38) and builds their indexes (.bwt), which may take 1-2 hours.
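Because the first-run indexing is slow, it can be useful to check whether it has already completed. The following is a minimal sketch, not taken from the gsalit source. It assumes the indexes are built with `bwa index`, which writes `.amb`, `.ann`, `.bwt`, `.pac`, and `.sa` files next to the FASTA. The function name `missing_index_files` and the path `references/hg38.fa` are hypothetical.

```python
from __future__ import annotations

from pathlib import Path

# Suffixes that `bwa index` appends to the FASTA file name.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_index_files(fasta: Path) -> list[Path]:
    """Return the expected BWA index files that do not exist yet."""
    expected = [Path(str(fasta) + suffix) for suffix in BWA_INDEX_SUFFIXES]
    return [path for path in expected if not path.exists()]


if __name__ == "__main__":
    # Hypothetical location; the real path depends on the installation.
    fasta = Path("references/hg38.fa")
    missing = missing_index_files(fasta)
    if missing:
        print("Index incomplete, missing:", [p.name for p in missing])
    else:
        print("All index files present for", fasta.name)
```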
- Use the Upload IDAT files button in the sidebar.
- You can upload multiple .idat files at once.
- The app will store them temporarily for processing.
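Illumina arrays produce two IDAT files per sample, one per channel (`_Grn.idat` and `_Red.idat`), so a quick check before uploading can catch incomplete pairs. The sketch below is illustrative only. Whether the app performs such a check is not stated here, and the helper name `find_unpaired_idats` is hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import Path

CHANNELS = {"Grn", "Red"}


def find_unpaired_idats(filenames: list[str]) -> dict[str, set[str]]:
    """Map each sample prefix to the channels it is missing."""
    seen: dict[str, set[str]] = defaultdict(set)
    for name in filenames:
        base = Path(name).name
        for channel in CHANNELS:
            suffix = f"_{channel}.idat"
            if base.endswith(suffix):
                # Everything before the channel suffix identifies the sample.
                seen[base[: -len(suffix)]].add(channel)
    return {
        sample: CHANNELS - channels
        for sample, channels in seen.items()
        if channels != CHANNELS
    }


if __name__ == "__main__":
    # Made-up file names for illustration.
    uploads = ["2049_R01C01_Grn.idat", "2049_R01C01_Red.idat", "2049_R02C01_Grn.idat"]
    print(find_unpaired_idats(uploads))
```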
- Choose either hg19 or hg38 from the dropdown.
- The app will automatically select the manifest, cluster, and reference genome for the chosen build.
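The automatic selection can be pictured as a lookup from build name to a bundle of resource files. This is a sketch under assumptions, not the app's actual code. The `BuildResources` class, the `resources_for` function, and all manifest and cluster file names are placeholders. Only the build names and the reference FASTA names (hg19.fa, hg38.fa) come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResources:
    manifest_bpm: str
    manifest_csv: str
    cluster_egt: str
    reference_fasta: str


# Placeholder file names; the real ones depend on the bundled GSA version.
RESOURCES = {
    "hg19": BuildResources("gsa_hg19.bpm", "gsa_hg19.csv", "gsa_clusters.egt", "hg19.fa"),
    "hg38": BuildResources("gsa_hg38.bpm", "gsa_hg38.csv", "gsa_clusters.egt", "hg38.fa"),
}


def resources_for(build: str) -> BuildResources:
    """Return the resource bundle for a build, rejecting unknown builds."""
    try:
        return RESOURCES[build]
    except KeyError:
        raise ValueError(
            f"Unsupported build {build!r}; expected one of {sorted(RESOURCES)}"
        ) from None


if __name__ == "__main__":
    print(resources_for("hg38"))
```

Keeping the mapping in one place means a dropdown choice resolves to a consistent set of files, so a manifest for one build cannot be combined with the reference for the other.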
- Click the Run Pipeline button.
- The app will execute the GSA pipeline in a temporary workspace.
- Real-time logs will appear in the main panel.
- Logs are streamed in real time.
- Errors or warnings will appear immediately.
- Upon successful completion, a success message will appear.
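One common way to stream logs from a long-running command is to read its output line by line while it runs. The sketch below shows that general technique with the standard library. It is an assumption about how such streaming could work, not the app's implementation, and `stream_command` is a hypothetical name. The demo runs a trivial Python command instead of the real pipeline.

```python
from __future__ import annotations

import subprocess
import sys
from typing import Iterator


def stream_command(cmd: list[str]) -> Iterator[str]:
    """Yield output lines of `cmd` as they are produced; raise on failure."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge errors and warnings into one stream
        text=True,
        bufsize=1,  # line-buffered reads
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the `with` block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


if __name__ == "__main__":
    demo = [sys.executable, "-c", "print('step 1'); print('step 2')"]
    for log_line in stream_command(demo):
        print(log_line)
```

In a Streamlit app, each yielded line could be appended to a placeholder created with `st.empty()`, so the panel updates as the command runs.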
- All outputs are saved in /app/results/run_<timestamp>/.
- Certain intermediate files like .bpm, .csv, .egt, and IDAT directories are excluded from the results folder.
- A results.zip file is automatically created for download.
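The collection step described above can be sketched as follows. This is a simplified illustration, not the app's code. The function name `collect_results` and the timestamp format are assumptions, and the sketch skips every subdirectory, whereas the app is only stated to exclude IDAT directories.

```python
from __future__ import annotations

import shutil
import zipfile
from datetime import datetime
from pathlib import Path

# Intermediate file types left out of the results folder.
EXCLUDED_SUFFIXES = {".bpm", ".csv", ".egt"}


def collect_results(workspace: Path, results_root: Path) -> Path:
    """Copy outputs into run_<timestamp>/ and bundle them as results.zip."""
    # The timestamp format here is an assumption.
    run_dir = results_root / f"run_{datetime.now():%Y%m%d_%H%M%S}"
    run_dir.mkdir(parents=True)

    copied = []
    for item in sorted(workspace.iterdir()):
        # Simplification: skip all subdirectories and excluded intermediates.
        if item.is_dir() or item.suffix.lower() in EXCLUDED_SUFFIXES:
            continue
        target = run_dir / item.name
        shutil.copy2(item, target)
        copied.append(target)

    # Zip only the copied files, so the archive never includes itself.
    archive = run_dir / "results.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in copied:
            zf.write(path, arcname=path.name)
    return archive
```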
Note (optional preview): The app displays the last 10 lines of the VCF header if the VCF was generated successfully.
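A preview like this can be reproduced on a downloaded VCF. In the VCF format, header lines start with `#` and come before the first variant record, so it is enough to read until the first line that does not start with `#`. The helper below is a hypothetical sketch (the name `vcf_header_tail` is not from the app) that also handles gzip-compressed files.

```python
from __future__ import annotations

import gzip
from collections import deque
from pathlib import Path


def vcf_header_tail(vcf_path: Path, n: int = 10) -> list[str]:
    """Return the last `n` header lines of a plain or gzip-compressed VCF."""
    opener = gzip.open if vcf_path.suffix == ".gz" else open
    tail: deque[str] = deque(maxlen=n)  # keeps only the most recent n lines
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first variant record: the header has ended
            tail.append(line.rstrip("\n"))
    return list(tail)
```

The final line returned is normally the `#CHROM` column header, which makes it easy to confirm that the expected sample columns are present.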
| Resource | Description | Link / Install |
|---|---|---|
| htslib | High-throughput sequencing library | htslib 1.22.1 |
| bcftools | Variant calling and manipulation tools | wget http://github.com/samtools/bcftools/releases/download/1.20/bcftools-1.20.tar.bz2 |
| samtools | Utilities for manipulating SAM/BAM files | sudo apt install samtools |
| gtc2vcf + affy2vcf | Convert IDAT/GTC to VCF | wget -P plugins http://raw.githubusercontent.com/freeseek/gtc2vcf/master/{idat2gtc.c,gtc2vcf.{c,h},affy2vcf.c,BAFregress.c} |
| IAAP-CLI | Illumina Array Analysis Platform Genotyping CLI | IAAP-CLI Manual |
| APT | Affymetrix Analysis Power Tools | APT Manual |
| bwa | Burrows-Wheeler Aligner for sequence alignment | bwa-0.7.17 |
| plink2 | Whole-genome association analysis toolset | plink2 Linux x86_64 |
| Illumina GSA Manifest Files | BPM / CSV files for array annotation | GSA Manifest & Cluster Downloads |
| Illumina GSA Cluster Files | EGT files for genotype clustering | GSA Cluster Files |
| UCSC Reference Genome hg19 | Human genome build 19 | hg19.fa |
| UCSC Reference Genome hg38 | Human genome build 38 | hg38.fa |
If you encounter issues during installation or usage, please ensure that you are using the correct channels and that your Conda environment is properly set up. You can also try using mamba for a more reliable installation process.
Developed by Jyotirmoy Das. Application deployment on the SciLifeLab server was managed in collaboration with Hamza Imran.
This project is licensed under the MIT License. See the LICENSE file for details.
- Contributions are welcome! Please fork the repository and submit a pull request with your changes.
- Feel free to open issues for any bugs or feature requests.
- Please make sure to follow the existing code style and include tests for any new features.
Das, J. (2025). gsalit (1.0.1). Zenodo. https://doi.org/10.5281/zenodo.17671007
Special thanks to the SciLifeLab Data Center (specifically Hamza) for helping with the app settings on the SciLifeLab Serve server.
Thanks to the open-source community for the tools and libraries that made this project possible, including Streamlit, Bioconda, Docker, gtc2vcf, and agaat.