GigaSOM.jl - Huge-scale, high-performance flow cytometry clustering in Julia
GigaSOM.jl allows painless analysis of huge-scale clinical studies by removing the software limitations that usually prevent work with large datasets. It can be viewed as a work-alike of FlowSOM that can load billions of cells and run the analyses in parallel on distributed computer clusters to gain speed. Most importantly, GigaSOM.jl scales horizontally: data volume and memory limitations can be solved just by adding more computers to the cluster. That makes it easy to exploit HPC environments, which are becoming increasingly common in computational biology.
Evolution of the GigaSOM.jl repository (2019-2020)
Features
- Horizontal scalability to literal giga-scale datasets ($10^9$ cells!)
- HPC-ready, with support for e.g. Slurm
- Standard support for distributed loading, scaling and transformation of FCS3 files
- Batch-SOM based GigaSOM algorithm for clustering
- EmbedSOM for visualizations
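To give an intuition for why a batch-SOM formulation scales horizontally, here is a minimal, self-contained sketch of a generic batch SOM in plain Julia. It is not the GigaSOM.jl implementation and does not use the GigaSOM.jl API: the function names `chunk_stats`, `batch_update` and `train_batch_som`, the Gaussian neighborhood, and the linear radius schedule are all assumptions made for illustration. The point is that each data chunk only contributes per-node sums and counts, which can be computed independently and then added together.

```julia
using Random

# Per-chunk statistics: for every codebook node, the sum and the count of the
# data points whose nearest codebook vector (best-matching unit) is that node.
function chunk_stats(data::Matrix{Float64}, codes::Matrix{Float64})
    k, d = size(codes)
    sums = zeros(k, d)
    counts = zeros(k)
    for i in 1:size(data, 1)
        dists = [sum(abs2, data[i, :] .- codes[j, :]) for j in 1:k]
        bmu = argmin(dists)
        sums[bmu, :] .+= data[i, :]
        counts[bmu] += 1
    end
    return sums, counts
end

# Batch update: smooth the reduced statistics over the SOM grid with a
# Gaussian neighborhood (an assumption of this sketch), then divide.
function batch_update(sums, counts, grid::Matrix{Float64}, radius::Float64)
    k = size(grid, 1)
    h = [exp(-sum(abs2, grid[a, :] .- grid[b, :]) / (2 * radius^2)) for a in 1:k, b in 1:k]
    return (h * sums) ./ (h * counts)
end

# Training loop over data chunks. The `map` step is independent per chunk,
# so in a distributed setting each chunk could live on a different worker.
function train_batch_som(chunks, grid, codes; epochs = 10, radius = 2.0)
    for e in 1:epochs
        stats = map(c -> chunk_stats(c, codes), chunks)
        sums = sum(first, stats)      # reduce: add per-chunk sums
        counts = sum(last, stats)     # reduce: add per-chunk counts
        r = radius * (1 - (e - 1) / epochs) + 0.5   # hypothetical shrinking radius
        codes = batch_update(sums, counts, grid, r)
    end
    return codes
end

# Toy usage: a 3x3 grid, three chunks of random 4-dimensional data.
Random.seed!(1)
grid = hcat(repeat(1.0:3.0, inner = 3), repeat(1.0:3.0, outer = 3))
chunks = [randn(100, 4) for _ in 1:3]
codes = train_batch_som(chunks, grid, randn(9, 4))
```

Because only the small `sums` and `counts` arrays are exchanged between the map and reduce steps, the raw data never has to be gathered in one place, which is the property that lets this style of algorithm grow with the number of machines.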
Background
You can learn more about the background of GigaSOM.jl in these sections:
How to get started?
You can follow our extensive tutorials here:
- Tutorial 1: Intro & basic usage
- Tutorial 2: Working with cytometry data
- Tutorial 3: Distributed data processing and statistics
- Where to continue after finishing the tutorials?
Functions
A full reference to all functions is given here:
How to contribute?
If you want to contribute, please read these guidelines first: