Can I add new genome references for my data analysis with Kangooroo?
Yes.
There are two possible options:
We can add a new genome reference to the database for you. Please note this will come with a fee. For more information, please contact sales@lexogen.com.
You can upload your genome reference of interest directly on the platform. To do so, please follow the guidelines below.
1 - Download the genome and annotation references for your species of interest from Ensembl (https://ftp.ensembl.org/).
Select the directory “pub/”.
Select the release folder of your choice (e.g., release-110).
Select the “gtf/” folder or the “dna/” folder to download annotation or genome files, respectively.
Select your organism of interest.
Download the gtf.gz file for the annotation reference.
Download the toplevel.fa.gz file for the genome reference.
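The browsing steps above can also be scripted. The following is a minimal sketch, not an official Lexogen procedure: the helper names gtf_url and genome_url are hypothetical, and the path layout (gtf/<species>/ and fasta/<species>/dna/ under the release folder) is an assumption based on the mouse release-110 example used in this FAQ. Please check the paths in your browser for other organisms or releases.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that build Ensembl download URLs.
# Assumed layout: pub/release-<N>/gtf/<species>/ and pub/release-<N>/fasta/<species>/dna/

gtf_url() {
  local release="$1" species="$2" assembly="$3"
  local dir
  # Directory names are lowercase; file names keep the capitalised species name.
  dir=$(printf '%s' "$species" | tr '[:upper:]' '[:lower:]')
  printf 'https://ftp.ensembl.org/pub/release-%s/gtf/%s/%s.%s.%s.gtf.gz\n' \
    "$release" "$dir" "$species" "$assembly" "$release"
}

genome_url() {
  local release="$1" species="$2" assembly="$3"
  local dir
  dir=$(printf '%s' "$species" | tr '[:upper:]' '[:lower:]')
  printf 'https://ftp.ensembl.org/pub/release-%s/fasta/%s/dna/%s.%s.dna.toplevel.fa.gz\n' \
    "$release" "$dir" "$species" "$assembly"
}

# Usage (the genome file is large; uncomment to download):
# curl -fLO "$(gtf_url 110 Mus_musculus GRCm39)"
# curl -fLO "$(genome_url 110 Mus_musculus GRCm39)"
```

Keeping the release number as a parameter makes it easier to download the annotation and the genome from the same release.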
2 - Rename the gtf and fasta files as follows:
gtf file - annotation_organism_ercc_sirv_biotyped.gtf.
Example: Mus_musculus.GRCm39.110.gtf.gz renamed to annotation_organism_ercc_sirv_biotyped.gtf
fasta file - annotation_organism_ercc_sirv.fa.
Example: Mus_musculus.GRCm39.dna.toplevel.fa.gz renamed to annotation_organism_ercc_sirv.fa
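Because the downloaded files are gzip-compressed and the target names end in .gtf and .fa, a plain rename would leave compressed data behind an uncompressed-looking name. The sketch below assumes that the platform expects uncompressed files (inferred from the target file names, not confirmed here). The function name prepare_reference is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decompress the Ensembl downloads into the expected file names.
# Assumption: the renamed files should be uncompressed plain text.

prepare_reference() {
  local gtf_gz="$1" fasta_gz="$2"
  # gunzip -c writes the decompressed data to stdout and keeps the original .gz file.
  gunzip -c "$gtf_gz" > annotation_organism_ercc_sirv_biotyped.gtf &&
  gunzip -c "$fasta_gz" > annotation_organism_ercc_sirv.fa
}

# Usage:
# prepare_reference Mus_musculus.GRCm39.110.gtf.gz Mus_musculus.GRCm39.dna.toplevel.fa.gz
```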
3 - Generate a final directory containing both genome and annotation files.
First, create the directory and copy the files into it:
mkdir -p Directory_folder
cp annotation_organism_ercc_sirv.fa annotation_organism_ercc_sirv_biotyped.gtf ./Directory_folder/
Then, compress the resulting folder (tar.gz):
tar czfv Directory_folder.tar.gz Directory_folder
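Before uploading, it can be useful to confirm that the archive contains both files. The following sketch wraps the packaging commands of step 3 in one function and lists the archive contents at the end. The function name package_reference is hypothetical; the file and directory names are the ones used in this FAQ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the reference directory, compress it, and list the archive.

package_reference() {
  local dir="${1:-Directory_folder}"
  mkdir -p "$dir" &&
  cp annotation_organism_ercc_sirv.fa annotation_organism_ercc_sirv_biotyped.gtf "$dir"/ &&
  tar -czf "$dir.tar.gz" "$dir" &&
  # tar -tzf lists the archive contents without extracting anything.
  tar -tzf "$dir.tar.gz"
}

# Usage (run in the folder that holds the two renamed files):
# package_reference Directory_folder
```

The listing should show both the .fa and the .gtf file inside the directory; if either is missing, the earlier steps need to be repeated before uploading.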
4 - Upload the compressed directory (Directory_folder.tar.gz) to Kangooroo as described in the following FAQ: How do I upload my files in the Kangooroo platform?
5 - Tag the reference as a genome file.
By default, all uploaded files are assigned the type “File”. To be used as a reference by the pipeline, the type of the uploaded file must be changed to “Genome”. Please watch our tutorial video on how to tag your file as a “Genome” type file.