[_DESCRIPTIONS]
[GENE_DESCRIPTIONS]
GFF = [ 1 DAUGHTER->product /(.+)/ ]
[TRANSCRIPT_DESCRIPTIONS]
GFF = [ 1 SELF->product /(.+)/ ]
Descriptions are displayed on the Ensembl website and included in the search index (optional Step 2.8). Each set of descriptions may be sourced from any number of files, in which case the first number in the value array indicates the priority given to descriptions from that source. Descriptions from sources with lower numbers overwrite those from sources with higher numbers. A priority of 1 also causes any existing description in the database for the current gene/transcript to be overwritten.
_DESCRIPTIONS from files other than .gff must be linked to the correct feature by [_STABLE_IDS] (where linking files by _STABLE_IDS is described in more detail):
[FILES]
GFF = [ gff http://example.com/gene_models.gff3.gz ]
PROTEIN = [ fa http://example.com/proteins.fa.gz ]
ANNOTATION = [ tsv http://example.com/annotations.txt.gz ]
[GENE_STABLE_IDS]
GFF = [ gene->Name /(.+)/ ]
PROTEIN = [ DISPLAY_ID /(.+)-PA/ ]
ANNOTATION = [ FIELD_1 /(.+)/ ]
[GENE_DESCRIPTIONS]
GFF = [ 1 DAUGHTER->product /(.+)/ ]
PROTEIN = [ 2 DESCRIPTION /(.+)/ ]
ANNOTATION = [ 3 FIELD_2 /(.+)/ ]
- The ANNOTATION file has the lowest priority (3) and descriptions from FIELD_2 in this file will only be used if no corresponding description is found in either of the other files.
- The PROTEIN file has priority 2, so descriptions from the second part of the fasta headers will be used in preference to descriptions in the ANNOTATION file unless a description exists in the GFF file.
- The GFF file has priority 1, so descriptions from this file will be used in preference to those from the other files and, if a description already exists for this gene in the database, it will be overwritten by this new value.
- Reducing all the priorities by one level (for example, to 2, 3 and 4) would retain the same behaviour, except that existing descriptions in the database would not be overwritten.
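The priority rules can be sketched in a few lines of Python. This is only an illustration of the behaviour described here, not the pipeline's own code: the function name `choose_description`, the `PRIORITIES` dictionary and the example description strings are all hypothetical, and the sketch assumes that only a priority-1 source may replace a description already held in the database.

```python
from typing import Dict, Optional

# Hypothetical in-memory form of the [GENE_DESCRIPTIONS] stanza:
# source name -> priority (a lower number wins).
PRIORITIES = {"GFF": 1, "PROTEIN": 2, "ANNOTATION": 3}


def choose_description(
    candidates: Dict[str, Optional[str]],
    existing: Optional[str] = None,
    priorities: Dict[str, int] = PRIORITIES,
) -> Optional[str]:
    """Pick the description to store for one gene.

    candidates maps a source name to the description found in that file
    (None or "" if the file has no description for this gene).
    existing is the description already in the database, if any.
    """
    # Keep only the sources that actually supplied a description.
    found = [(priorities[src], text) for src, text in candidates.items() if text]
    if not found:
        return existing

    # The source with the lowest priority number takes precedence.
    best_priority, best_text = min(found, key=lambda pair: pair[0])

    # Assumption: an existing database description is only replaced
    # when the winning source has priority 1.
    if existing and best_priority != 1:
        return existing
    return best_text
```

For example, `choose_description({"PROTEIN": "kinase", "ANNOTATION": "putative kinase"})` selects the PROTEIN description, while passing `existing="old description"` to the same call leaves the stored value untouched because no priority-1 source contributed.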