{"__v":5,"_id":"5735f94b1d2b0032000a5d28","category":{"project":"5735936aafab441700723a50","version":"5735936aafab441700723a53","_id":"5735e5d9e4824c3400aa1f23","__v":0,"sync":{"url":"","isSync":false},"reference":false,"createdAt":"2016-05-13T14:34:01.858Z","from_sync":false,"order":9,"slug":"configuration-options-core-import","title":"Configuration Options (Core Import)"},"parentDoc":null,"project":"5735936aafab441700723a50","user":"573592b84b0ab120000b7d44","version":{"__v":12,"_id":"5735936aafab441700723a53","project":"5735936aafab441700723a50","createdAt":"2016-05-13T08:42:18.615Z","releaseDate":"2016-05-13T08:42:18.615Z","categories":["5735936aafab441700723a54","5735a32931a73b1700887c94","5735b55beceb872200abbc6c","5735b56eb667601700d3bd6f","5735b9ba4b0ab120000b7dd4","5735b9c94b0ab120000b7dd5","5735cb131f16241700c8a0f7","5735e5c4e4824c3400aa1f21","5735e5d9e4824c3400aa1f23","5735e5f2ec67f6290013ac72","573ecfe0804f901700a9dfc7","573f276c7eeb8b190094ca7d"],"is_deprecated":false,"is_hidden":false,"is_beta":false,"is_stable":false,"codename":"","version_clean":"1.0.0","version":"1.0"},"updates":[],"next":{"pages":[],"description":""},"createdAt":"2016-05-13T15:56:59.937Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":7,"body":"```\n[FILES]\n  SCAFFOLD = [ fa http://www.bioinformatics.nl/wintermoth/data_files/Obru1.fsa.gz ]\n  GFF = [ gff3 http://www.bioinformatics.nl/wintermoth/data_files/Obru_genes.gff.gz ]\n  PROTEIN = [ fa http://www.bioinformatics.nl/wintermoth/data_files/ObruPep.fasta.gz ]\n```\n\nEach line in the ``[FILES]`` stanza assigns a space-separated array of values to a particular key, which can then be referenced elsewhere in the ``.ini`` file. 
Some key names are optional and/or arbitrary, while others, such as ``GFF`` and ``SCAFFOLD``, are referenced in the import scripts and must be specified with the expected file type.\n\n```\n  FILE = [ type remote_location local_name ]\n```\n\nWithin the values array:\n- The first position specifies the file type and may be ``fa``, ``agp``, ``gff3`` or ``txt``. \n- The second position is the path to the file, and may be of the form ``http://example.com/filename``, ``ftp://ftp.example.com/filename`` or ``/path/to/filename``.  Files will also be retrieved by scp from locations matching the pattern ``server:/path/to/filename``, in which case it is best to use ``ssh_config`` to manage login to the remote server.\n- The third position is optional and should be a local name to use for the downloaded file, which may be used to ensure files downloaded from different sources are named consistently.\n\n```\n[FILES]\n  SCAFFOLD = [ fa http://www.bioinformatics.nl/wintermoth/data_files/Obru1.fsa.gz ]\n```\nDetails of the sequence file(s) to be imported.  \n- If a ``SCAFFOLD`` file of type ``fa`` is provided, then a ``CONTIG`` file is optional and *vice versa*. \n- ``SCAFFOLD`` data can also be imported from an ``agp`` file, provided ``CONTIG`` sequences are also supplied.  \n- If no ``CONTIG`` file is provided, contigs will be imputed from runs of ``N`` in the ``SCAFFOLD`` sequence.\n\n```\n[FILES]\n  BLASTP =  [ BLASTP  http://download.lepbase.org/current/blastp/Operophtera_brumata_v1_-_proteins.fa.blastp.uniprot_sprot.1e-10.gz ]\n  IPRSCAN = [ IPRSCAN http://download.lepbase.org/current/interproscan/Operophtera_brumata_v1_-_proteins.fa.interproscan.gz ]\n  REPEATMASKER = [ REPEATMASKER http://download.lepbase.org/current/repeats/Operophtera_brumata_v1_-_scaffolds.fa.out.gz ]\n```\nSpecify the (remote) locations of ``BLASTP``, ``IPRSCAN`` and ``REPEATMASKER`` files as appropriate.","excerpt":"","slug":"files-core","type":"basic","title":"[FILES]"}
```
[FILES]
  SCAFFOLD = [ fa http://www.bioinformatics.nl/wintermoth/data_files/Obru1.fsa.gz ]
  GFF = [ gff3 http://www.bioinformatics.nl/wintermoth/data_files/Obru_genes.gff.gz ]
  PROTEIN = [ fa http://www.bioinformatics.nl/wintermoth/data_files/ObruPep.fasta.gz ]
```

Each line in the ``[FILES]`` stanza assigns a space-separated array of values to a particular key, which can then be referenced elsewhere in the ``.ini`` file. Some key names are optional and/or arbitrary, while others, such as ``GFF`` and ``SCAFFOLD``, are referenced in the import scripts and must be specified with the expected file type.

```
  FILE = [ type remote_location local_name ]
```

Within the values array:
- The first position specifies the file type and may be ``fa``, ``agp``, ``gff3`` or ``txt``.
- The second position is the path to the file, and may be of the form ``http://example.com/filename``, ``ftp://ftp.example.com/filename`` or ``/path/to/filename``. Files will also be retrieved by scp from locations matching the pattern ``server:/path/to/filename``, in which case it is best to use ``ssh_config`` to manage login to the remote server.
- The third position is optional and should be a local name to use for the downloaded file, which may be used to ensure files downloaded from different sources are named consistently.

```
[FILES]
  SCAFFOLD = [ fa http://www.bioinformatics.nl/wintermoth/data_files/Obru1.fsa.gz ]
```
Details of the sequence file(s) to be imported.
- If a ``SCAFFOLD`` file of type ``fa`` is provided, then a ``CONTIG`` file is optional and *vice versa*.
- ``SCAFFOLD`` data can also be imported from an ``agp`` file, provided ``CONTIG`` sequences are also supplied.
- If no ``CONTIG`` file is provided, contigs will be imputed from runs of ``N`` in the ``SCAFFOLD`` sequence.

```
[FILES]
  BLASTP =  [ BLASTP  http://download.lepbase.org/current/blastp/Operophtera_brumata_v1_-_proteins.fa.blastp.uniprot_sprot.1e-10.gz ]
  IPRSCAN = [ IPRSCAN http://download.lepbase.org/current/interproscan/Operophtera_brumata_v1_-_proteins.fa.interproscan.gz ]
  REPEATMASKER = [ REPEATMASKER http://download.lepbase.org/current/repeats/Operophtera_brumata_v1_-_scaffolds.fa.out.gz ]
```
Specify the (remote) locations of ``BLASTP``, ``IPRSCAN`` and ``REPEATMASKER`` files as appropriate.
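
The ``FILE = [ type remote_location local_name ]`` layout described above can be made concrete with a short Python sketch. It is not the import scripts' own parsing code, which this page does not show. The names ``FileEntry``, ``parse_entry`` and ``classify_location`` are hypothetical, and the rules for recognising each location form are assumptions based only on the patterns listed above. The first position is deliberately not validated, because the examples on this page also use labels such as ``BLASTP`` there.

```python
import re
from typing import NamedTuple, Optional

# Matches "KEY = [ value value ... ]" with arbitrary surrounding whitespace.
ENTRY_RE = re.compile(r"^\s*(\w+)\s*=\s*\[\s*(.*?)\s*\]\s*$")


class FileEntry(NamedTuple):
    """One parsed line of a [FILES] stanza (hypothetical structure)."""
    key: str
    file_type: str
    location: str
    local_name: Optional[str]
    transport: str  # "http", "ftp", "scp" or "local"


def classify_location(location: str) -> str:
    """Guess how a file would be fetched from the shape of its location.

    Assumption: only the four forms named in the text are recognised.
    """
    if "://" in location:
        scheme = location.split("://", 1)[0]
        if scheme in ("http", "ftp"):
            return scheme
        raise ValueError(f"unsupported scheme in location: {location!r}")
    if location.startswith("/"):
        return "local"
    # server:/path/to/filename is the pattern described as retrieved by scp.
    if re.match(r"^[^\s/:]+:/", location):
        return "scp"
    raise ValueError(f"unrecognised location: {location!r}")


def parse_entry(line: str) -> FileEntry:
    """Split a 'KEY = [ type location local_name ]' line into its parts."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not a 'KEY = [ ... ]' line: {line!r}")
    key, raw_values = match.groups()
    values = raw_values.split()  # values are whitespace-separated
    if len(values) not in (2, 3):
        raise ValueError(f"expected 2 or 3 values, got {len(values)}: {line!r}")
    file_type, location = values[0], values[1]
    local_name = values[2] if len(values) == 3 else None  # third position is optional
    return FileEntry(key, file_type, location, local_name, classify_location(location))
```

For example, ``parse_entry("CONTIG = [ fa server:/path/to/filename contigs.fa.gz ]")`` returns an entry whose ``transport`` is ``scp`` and whose ``local_name`` is ``contigs.fa.gz``. The local name here is made up for illustration.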