Step 2.7: Export files

(optional)

The import and export process generates a set of files that are consistent across assemblies and therefore useful for comparative analyses, conversion to BLAST databases, etc. The exported .json files can be used to visualise assembly and annotation statistics.

export_sequences.pl generates a scaffold-level assembly sequence file, together with protein and CDS files for use in Stage 4 as part of the Compara import. Files are written to a subdirectory named exported.

cd ~/import
perl ../ei/core/export_sequences.pl ../ei/conf/core-import.ini
cd ~/import
perl ../ei/core/export_json.pl ../ei/conf/core-import.ini
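
Once export_sequences.pl has finished, a quick sanity check is to count the records in each exported sequence file. The following is a minimal sketch, not part of the pipeline: the function name count_fasta_records is hypothetical, and it assumes the files in the exported subdirectory are FASTA files, either plain text or gzip-compressed.

```shell
# Hypothetical helper (not part of easy-import): print the number of FASTA
# records in every file of a directory, one "count<TAB>path" line per file.
# Assumption: files are FASTA, optionally gzip-compressed (.gz).
count_fasta_records() {
    dir="${1:-exported}"
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        case "$f" in
            *.gz) n=$(gzip -dc "$f" | grep -c '^>' || true) ;;
            *)    n=$(grep -c '^>' "$f" || true) ;;
        esac
        printf '%s\t%s\n' "$n" "$f"
    done
}
```

Run from ~/import as count_fasta_records exported. A count of 0 for any file suggests the export did not complete as expected.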

export_json.pl generates three .json files:

  • <assembly_name>.meta.json contains basic metadata for the assembly and basic summary statistics including assembly span and number of gene models.
  • <assembly_name>.assembly-stats.json contains an assembly summary in the format used by github.com/rjchallis/assembly_stats to produce a number of individual and comparative views of several assembly statistics.
  • <assembly_name>.codon-usage.json contains a summary of scaffold, gene, exon, etc. lengths, base composition and codon usage in the format used by github.com/rjchallis/codon_usage to visualise expected and observed codon usage patterns.

These files are written to the web subdirectory.
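
Before passing the .json files to a visualisation tool, it can be worth confirming that each one parses. The sketch below is an illustration only: the function name check_json_dir is hypothetical, and it assumes python3 is available on the PATH (its standard json.tool module is used purely as a parser).

```shell
# Hypothetical helper (not part of easy-import): report whether each .json
# file in a directory is well-formed. Returns non-zero if any file fails.
# Assumption: python3 is on the PATH.
check_json_dir() {
    dir="${1:-web}"
    rc=0
    for f in "$dir"/*.json; do
        # An unexpanded glob means the directory holds no .json files.
        [ -e "$f" ] || { echo "no .json files in $dir"; return 1; }
        if python3 -m json.tool "$f" > /dev/null 2>&1; then
            echo "ok      $f"
        else
            echo "INVALID $f"
            rc=1
        fi
    done
    return $rc
}
```

Run from ~/import as check_json_dir web. This checks syntax only; it does not verify that the contents match the format expected by assembly_stats or codon_usage.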