How are the SIRV isoforms designed?
The SIRV isoform design is based on 7 human model genes. The annotated transcripts of these genes were extended by additional isoforms and variants to comprehensively cover alternative splicing, start- and end-site variations, and antisense and overlapping transcripts. The exonic sequences of the resulting 69 transcript structures (6-18 per gene) were derived from genome sequences obtained from databases and then altered until they lost all alignment identity. They were subsequently compared by BLAST against the NCBI database at both the nucleotide and the protein level to confirm that they are non-identical to database entries. Intronic sequences were generated randomly.
The SIRV sequences conform to the canonical exon-intron junction rule: 96.9% of all SIRV junctions are GT-AG. The less frequent variants GC-AG and AT-AC account for 1.7% and 0.6% of junctions, respectively. Two non-canonical splice sites, CT-AG and CT-AC, were included at 0.4% each.
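To make the junction naming concrete, here is a minimal Python sketch of how junction types such as GT-AG can be read from intron sequences and tallied into percentages. It is an illustration only, not the procedure used to design the SIRVs. The function names and the toy intron sequences are hypothetical, and the sketch assumes each intron is given 5' to 3' on the transcribed strand.

```python
from collections import Counter

# Shares quoted in the text above, in percent of all SIRV junctions.
REPORTED = {"GT-AG": 96.9, "GC-AG": 1.7, "AT-AC": 0.6, "CT-AG": 0.4, "CT-AC": 0.4}


def junction_type(intron: str) -> str:
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'.

    Assumes the intron is written 5'->3' on the transcribed strand.
    """
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to have separate donor and acceptor sites")
    # First two bases are the donor site, last two bases are the acceptor site.
    return f"{seq[:2]}-{seq[-2:]}"


def junction_percentages(introns):
    """Tally junction types and return their shares in percent (one decimal)."""
    counts = Counter(junction_type(i) for i in introns)
    total = sum(counts.values())
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


# Hypothetical toy introns, not real SIRV sequences.
toy_introns = ["GTAAGTCCTTAG", "GTGAGAACAG", "GCAAGTTTTCAG", "ATATCCTTAC"]

if __name__ == "__main__":
    print(junction_percentages(toy_introns))
    # Sanity check: the reported shares should add up to 100%.
    print(round(sum(REPORTED.values()), 1))
```

Applied to the real SIRV annotation, the same tally would be expected to reproduce the shares listed above, provided the introns are extracted strand-aware.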