`-e -j -f` Export files
(optional)
The process of import and export generates a set of files that are consistent across different assemblies and are therefore suitable for comparative analyses, conversion to BLAST databases, etc. Exported .json files can be used for assembly/annotation statistic visualisation.
docker run --rm \
--name easy-import-operophtera_brumata_v1_core_32_85_1 \
--link genomehubs-mysql \
-v ~/demo/genomehubs-import/import/conf:/import/conf \
-v ~/demo/genomehubs-import/import/data:/import/data \
-v ~/demo/genomehubs-import/download/data:/import/download \
-v ~/demo/genomehubs-import/blast/data:/import/blast \
-e DATABASE=operophtera_brumata_v1_core_32_85_1 \
-e FLAGS="-e" \
genomehubs/easy-import:latest
`export_json.pl` generates three .json files:
`<assembly_name>.meta.json` contains basic metadata for the assembly and basic summary statistics including assembly span and number of gene models.
`<assembly_name>.assembly-stats.json` contains an assembly summary in the format used by github.com/rjchallis/assembly_stats to produce a number of individual and comparative views of several assembly statistics.
`<assembly_name>.codon-usage.json` contains a summary of scaffold, gene, exon, etc. lengths, base composition and codon usage in the format used by github.com/rjchallis/codon_usage to visualise expected and observed codon usage patterns.
docker run --rm \
--name easy-import-operophtera_brumata_v1_core_32_85_1 \
--link genomehubs-mysql \
-v ~/demo/genomehubs-import/import/conf:/import/conf \
-v ~/demo/genomehubs-import/import/data:/import/data \
-v ~/demo/genomehubs-import/download/data:/import/download \
-e DATABASE=operophtera_brumata_v1_core_32_85_1 \
-e FLAGS="-j" \
genomehubs/easy-import:latest
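Once the `-j` run has finished, it can be useful to confirm that all three files were written. The helper below is a minimal sketch, not part of easy-import: the function name `check_json_exports` is hypothetical, and the directory you pass in is an assumption, since the exact output location depends on how your volumes are mounted.

```shell
# Hypothetical helper (not part of easy-import): confirm that the three
# JSON exports exist and are non-empty.
#   $1 = directory to check (assumption: wherever your exports were written)
#   $2 = assembly name used as the file name prefix
check_json_exports() {
  export_dir="$1"
  assembly="$2"
  status=0
  for suffix in meta assembly-stats codon-usage; do
    file="$export_dir/$assembly.$suffix.json"
    # -s is true only if the file exists and has a size greater than zero
    if [ -s "$file" ]; then
      echo "ok: $file"
    else
      echo "missing or empty: $file"
      status=1
    fi
  done
  return "$status"
}
```

Call it as `check_json_exports <directory> <assembly_name>`; it returns a non-zero status if any of the three files is missing or empty.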
Gene models and annotated scaffolds may also be exported in GFF3/EMBL format using `export_features.pl`. This script always exports GFF3 and additionally exports EMBL format, ready for submission to the INSDC, if the required fields (see below) are specified in `[META]`. The resulting EMBL file should be checked with the ENA flat file validator prior to submission.
docker run --rm \
--name easy-import-operophtera_brumata_v1_core_32_85_1 \
--link genomehubs-mysql \
-v ~/demo/genomehubs-import/import/conf:/import/conf \
-v ~/demo/genomehubs-import/import/data:/import/data \
-v ~/demo/genomehubs-import/download/data:/import/download \
-e DATABASE=operophtera_brumata_v1_core_32_85_1 \
-e FLAGS="-f" \
genomehubs/easy-import:latest
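The three commands in this section differ only in the value of `FLAGS` and in the extra blast volume mounted for `-e`. As a rough sketch, the dry-run helper below prints the command for a given flag so it can be reviewed before running. The function name `print_export_cmd` is hypothetical, the paths are copied from the examples in this section, and mounting the blast directory only for `-e` is an assumption based on those examples.

```shell
# Hypothetical dry-run helper (not part of easy-import): print the docker
# command for one export flag without running it.
#   $1 = database name, $2 = flag (-e, -j or -f)
print_export_cmd() {
  database="$1"
  flag="$2"
  # Base directory taken from the examples in this section; adjust as needed.
  base="$HOME/demo/genomehubs-import"
  printf 'docker run --rm \\\n'
  printf '  --name easy-import-%s \\\n' "$database"
  printf '  --link genomehubs-mysql \\\n'
  printf '  -v %s/import/conf:/import/conf \\\n' "$base"
  printf '  -v %s/import/data:/import/data \\\n' "$base"
  printf '  -v %s/download/data:/import/download \\\n' "$base"
  # Assumption: only the -e example mounts the blast directory.
  if [ "$flag" = "-e" ]; then
    printf '  -v %s/blast/data:/import/blast \\\n' "$base"
  fi
  printf '  -e DATABASE=%s \\\n' "$database"
  printf '  -e FLAGS="%s" \\\n' "$flag"
  printf '  genomehubs/easy-import:latest\n'
}
```

For example, `for flag in -e -j -f; do print_export_cmd operophtera_brumata_v1_core_32_85_1 "$flag"; done` prints all three commands in turn.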
Configuration options
Additional entries are required in `[META]` in order to export to EMBL format:
[META]
ASSEMBLY.BIOPROJECT=PRJEB00000
ASSEMBLY.LOCUS_TAG=ABC123
SPECIES.EMBL_DIVISION=INV
The BioProject accession and locus tag must be registered during the submission registration process, and the EMBL division must be one of the taxonomic divisions listed in the EMBL format documentation.
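Because EMBL export depends on these three entries being present, a quick check of the configuration file before running `-f` may save a failed run. The following is a minimal sketch under stated assumptions: the function name `check_embl_meta` is hypothetical, and it assumes a plain `KEY=value` .ini layout with a `[META]` section header as shown above.

```shell
# Hypothetical helper (not part of easy-import): report which EMBL-related
# keys are missing from the [META] section of an .ini file.
#   $1 = path to the .ini file
check_embl_meta() {
  ini_file="$1"
  missing=0
  for key in ASSEMBLY.BIOPROJECT ASSEMBLY.LOCUS_TAG SPECIES.EMBL_DIVISION; do
    # Accept the key only inside [META] and only with a non-empty value.
    if ! awk -v key="$key" '
        /^\[/ { in_meta = ($0 ~ /^\[META\]/); next }
        in_meta {
          line = $0
          sub(/^[ \t]+/, "", line)      # trim leading whitespace
          sub(/[ \t\r]+$/, "", line)    # trim trailing whitespace
          split(line, kv, "=")
          k = kv[1]
          sub(/[ \t]+$/, "", k)
          if (k == key && length(line) > length(kv[1]) + 1) found = 1
        }
        END { exit(found ? 0 : 1) }
      ' "$ini_file"; then
      echo "missing in [META]: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as `check_embl_meta <path to your .ini file>`; it prints one line per missing key and returns a non-zero status if any are absent. It does not check that the values themselves are registered or valid.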