mourisl/mourisl

Hi there 👋 I'm Li Song, an Assistant Professor in the Department of Biomedical Data Science at Dartmouth College. My research area is bioinformatics, and my research interest is designing algorithms and developing software to analyze sequencing data. Our lab is actively hiring; please check the lab website for more details. Here is the software that my collaborators and I have developed:

Immunology

  • TRUST4: TCR/BCR assembler for RNA-seq data. TRUST4 can be applied to either bulk or single-cell RNA-seq data. In addition to reporting CDR3s, TRUST4 also assembles full-length TCRs/BCRs.
  • T1K: Genotyper for highly polymorphic genes, including KIR and HLA. T1K is versatile and works with RNA-seq, WGS and WES data. T1K also identifies novel SNPs and is compatible with single-cell RNA-seq data.

Microbiome

  • Centrifuger: Fast and memory-efficient classifier for metagenomics sequences using a lossless compressed FM-index with run-block compressed BWT. It can assign taxonomy IDs to each sequencing read by comparing it against a database containing 34,190 prokaryotic genomes with 140 Gbp sequences using about 43 Gb memory.
  • Centrifuge: Fast and memory-efficient classifier for metagenomics sequences using an FM-index. It requires only 4.2 Gb memory for a database containing ~4300 prokaryotic genomes using lossy representations.

RNA-seq

  • CLASS/CLASS2: Efficient and accurate transcript assemblers for RNA-seq data that detect more fine-grained alternative splice variants. The programs combine linear programming algorithms to detect exons from read coverage levels, with splice graph representations of genes and their splice variants, and memory-efficient optimization algorithms for transcript selection. [Also on SourceForge]
  • PsiCLASS: Simultaneous multi-sample transcript assembler for RNA-seq data. It builds a global data structure representing the structure of the transcripts, from which each sample generates its expressed transcripts. The global information allows accurate sample-wise assemblies and final meta-assembly.
  • Rcorrector: Efficient and accurate k-mer-based error correction software for Illumina RNA-seq reads. It can also be applied to data sets where the read coverage is non-uniform, such as single-cell sequencing.
  • Rascaf: Scaffolding with RNA-seq read alignment. It uses information from paired-end and split reads to improve the completeness and contiguity of a draft genome assembly, particularly in the gene regions.

Next-generation sequencing

  • Chromap: Ultrafast alignment and preprocessing for chromatin profiling sequencing data, including ChIP-seq, ATAC-seq and Hi-C. It supports both bulk and single-cell platforms, and is more than 10 times faster than traditional workflows without sacrificing alignment accuracy.
  • Lighter: Fast and memory-efficient k-mer-based software that corrects sequencing errors in whole-genome sequencing data without counting k-mers. It samples the k-mers in the data set and uses two memory-efficient Bloom filters to obtain solid k-mers.
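
For readers unfamiliar with Bloom filters, here is a minimal Python sketch of how k-mers can be stored in one and queried without keeping counts. It is not Lighter's code and leaves out the sampling step and the two-filter logic. The `BloomFilter` class, the `kmers` helper, and the hashing scheme are all hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions from one SHA-256 digest (double hashing).
        # This hashing scheme is an assumption made for the sketch.
        digest = hashlib.sha256(item.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


# Store the k-mers of one read, then query them back.
bf = BloomFilter(num_bits=1 << 16, num_hashes=3)
read = "ACGTACGGTCA"
for kmer in kmers(read, 5):
    bf.add(kmer)
```

Every inserted k-mer is reported as present. A k-mer that was never inserted is usually reported as absent, but may occasionally be a false positive. That trade-off is what lets the structure use far less memory than an exact k-mer table.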

Visualization

Python libraries to help with plotting figures:
  • MSAplot: visualizes multiple sequence alignments.
  • pvalannot: adds p-value annotations to box plots generated by Seaborn.
  • heatmapannot: adds color annotations along the axes of heatmaps or dot plots generated by Seaborn.
  • DotGroupPlot: groups dots into different cluster shapes and supports various coloring schemes, similar to a honeycomb plot.
