makeTxDbFromGRanges dropping a lot of transcripts #12

@Rohit-Satyam

Hi @hpages

I have stumbled upon a peculiar case where the makeTxDbFromGRanges function drops a lot of transcripts:

## Define release of annotation
x <- 49

## Import GFF
library(magrittr)  # provides the %>% pipe
txdb <- paste0("https://toxodb.org/common/downloads/release-", x, "/TgondiiME49/gff/data/ToxoDB-", x, "_TgondiiME49.gff") %>%
  txdbmaker::makeTxDbFromGFF(format = "gff3")

Warning message:
In makeTxDbFromGRanges(gr, metadata = metadata) :
  The following transcripts were dropped because their exon ranks could
  not be inferred (either because the exons are not on the same
  chromosome/strand or because they are not separated by introns):
  TGME49_200010-t26_1, TGME49_200110-t26_1, TGME49_200130-t26_1,
  TGME49_200230-t26_1, TGME49_200240-t26_1, TGME49_200250-t26_1,
  TGME49_200260-t26_1, TGME49_200270-t26_1, TGME49_200280-t26_1,
  TGME49_200290-t26_1, TGME49_200295-t26_1, TGME49_200300-t26_1,
  TGME49_200310-t26_1, TGME49_200320-t26_1, TGME49_200330-t26_1,
  TGME49_200340-t26_1, TGME49_200350-t26_1, TGME49_200360-t26_1,
  TGME49_200370-t26_1, TGME49_200375-t26_1, TGME49_200385-t26_1,
  TGME49_200400-t26_1, TGME49_200410-t26_1, TGME49_200430-t26_1,
  TGME49_200440-t26_1, TGME49_200450-t26_1, TGME49_200460-t26_1,
  TGME49_200470-t26_1, TGME49_200480-t26_1, TGME49_200590-t26_1,
  TGME49_200595-t26_1, TGME49_200600-t26_1, TGME49_200700-t26_1,
  TGME49_201100-t26_1, TGME49_201110-t26_1, TGME49_201120-t26_1,
  TGME49_201130-t26 [... truncated]

Only the release 49 annotation is problematic for this parasite. Having looked at the GFF file, I still don't understand why makeTxDbFromGRanges discards TGME49_200010-t26_1 in release 49 but has no issue with TGME49_200010.R447 in release 68. Is it because the mRNA feature is missing from the release 49 GFF? There, the transcript record is typed protein_coding_gene instead of mRNA.

## Release 49
TGME49_chrXII	VEuPathDB	gene	2245476	2249210	.	-	.	ID=TGME49_200010;description=hypothetical protein
TGME49_chrXII	VEuPathDB	protein_coding_gene	2245476	2249210	.	-	.	ID=TGME49_200010-t26_1;Parent=TGME49_200010;description=hypothetical protein
TGME49_chrXII	VEuPathDB	exon	2245476	2247209	.	-	.	ID=exon_TGME49_200010-t26_1-E2;Parent=TGME49_200010-t26_1
TGME49_chrXII	VEuPathDB	exon	2247554	2249210	.	-	.	ID=exon_TGME49_200010-t26_1-E1;Parent=TGME49_200010-t26_1
TGME49_chrXII	VEuPathDB	CDS	2246111	2247209	.	-	1	ID=TGME49_200010-t26_1-p1-CDS2;Parent=TGME49_200010-t26_1;protein_source_id=TGME49_200010-t26_1-p1
TGME49_chrXII	VEuPathDB	CDS	2247554	2247696	.	-	0	ID=TGME49_200010-t26_1-p1-CDS1;Parent=TGME49_200010-t26_1;protein_source_id=TGME49_200010-t26_1-p1
TGME49_chrXII	VEuPathDB	three_prime_UTR	2245476	2246110	.	-	.	ID=utr_TGME49_200010-t26_1_1;Parent=TGME49_200010-t26_1
TGME49_chrXII	VEuPathDB	five_prime_UTR	2247697	2249210	.	-	.	ID=utr_TGME49_200010-t26_1_2;Parent=TGME49_200010-t26_1

## Release 68
TGME49_chrXII	VEuPathDB	protein_coding_gene	2245476	2248187	.	-	.	ID=TGME49_200010;Name=GRA20;description=dense granule protein GRA20;ebi_biotype=protein_coding
TGME49_chrXII	VEuPathDB	mRNA	2245476	2248187	.	-	.	ID=TGME49_200010.R447;Parent=TGME49_200010;description=dense granule protein GRA20;gene_ebi_biotype=protein_coding
TGME49_chrXII	VEuPathDB	exon	2245476	2247209	.	-	.	ID=exon_TGME49_200010.R447-E2;Parent=TGME49_200010.R447;gene_id=TGME49_200010
TGME49_chrXII	VEuPathDB	exon	2247554	2248187	.	-	.	ID=exon_TGME49_200010.R447-E1;Parent=TGME49_200010.R447;gene_id=TGME49_200010
TGME49_chrXII	VEuPathDB	CDS	2246111	2247209	.	-	1	ID=TGME49_200010.P447-CDS2;Parent=TGME49_200010.R447;gene_id=TGME49_200010;protein_source_id=TGME49_200010.P447
TGME49_chrXII	VEuPathDB	CDS	2247554	2247696	.	-	0	ID=TGME49_200010.P447-CDS1;Parent=TGME49_200010.R447;gene_id=TGME49_200010;protein_source_id=TGME49_200010.P447
TGME49_chrXII	VEuPathDB	three_prime_UTR	2245476	2246110	.	-	.	ID=utr_TGME49_200010.R447_1;Parent=TGME49_200010.R447
TGME49_chrXII	VEuPathDB	five_prime_UTR	2247697	2248187	.	-	.	ID=utr_TGME49_200010.R447_2;Parent=TGME49_200010.R447
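
To make the structural difference between the two releases explicit, here is a minimal base-R sketch that looks up the feature type of each exon's parent. The data frames `rel49` and `rel68` and the helper `exon_parent_types()` are hypothetical names introduced for illustration; they hold only the type, ID, and Parent values of the gene, transcript, and exon records quoted above. This only describes the hierarchy. It does not confirm that the hierarchy is what triggers the warning.

```r
# Type/ID/Parent values copied from the release 49 records above.
rel49 <- data.frame(
  type   = c("gene", "protein_coding_gene", "exon", "exon"),
  ID     = c("TGME49_200010", "TGME49_200010-t26_1",
             "exon_TGME49_200010-t26_1-E2", "exon_TGME49_200010-t26_1-E1"),
  Parent = c(NA, "TGME49_200010",
             "TGME49_200010-t26_1", "TGME49_200010-t26_1"),
  stringsAsFactors = FALSE
)

# Same fields from the release 68 records above.
rel68 <- data.frame(
  type   = c("protein_coding_gene", "mRNA", "exon", "exon"),
  ID     = c("TGME49_200010", "TGME49_200010.R447",
             "exon_TGME49_200010.R447-E2", "exon_TGME49_200010.R447-E1"),
  Parent = c(NA, "TGME49_200010",
             "TGME49_200010.R447", "TGME49_200010.R447"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for every exon, return the distinct feature
# types of the records its Parent attribute points to.
exon_parent_types <- function(gff) {
  exons <- gff[gff$type == "exon", ]
  unique(gff$type[match(exons$Parent, gff$ID)])
}

exon_parent_types(rel49)  # "protein_coding_gene"
exon_parent_types(rel68)  # "mRNA"
```

In release 49 the exons hang off a feature typed protein_coding_gene, which is itself a child of a gene record, whereas in release 68 they hang off an mRNA. On the full files, the same comparison could be made by tabulating the `type` column of the object returned by `rtracklayer::import()`.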
