Background

The motivation behind easy import

Typical genome projects generate large volumes of data that continue to be valuable well beyond the initial funding cycle. To maximise this value, it is important to ensure that the data are made accessible to the widest possible community of users. Standardisation of data formats and of both programmatic and user interfaces is essential to reduce the training required to access new datasets and to facilitate comparative analyses at a variety of scales.

These considerations are central to the [Lepbase](http://lepbase.org) project. Lepbase is a taxon-oriented genomic resource for the Lepidoptera, and one of the core services we provide is to standardise and make accessible genome data from past and present genome projects. By using an [Ensembl](http://ensembl.org) instance for our genome browser, we are able to offer a familiar and standardised interface to each of the lepidopteran genomes that we host. From an archival perspective, we are able to store data in a format that will have long-term support, with a mature database structure and codebase. This codebase also gives us access to a powerful application programming interface (API) to facilitate large-scale comparative analysis and data mining.

Setting up an [Ensembl](http://ensembl.org) server, even to create a local mirror of existing content, has long been considered non-trivial: the number of dependencies, the complexity of the code and the interconnected configuration files can make it difficult to trace the cause of problems during installation.

One of our earliest tasks at [Lepbase](http://lepbase.org) was to find a way to make it easy to set up an [Ensembl](http://ensembl.org) webserver, so that we could run multiple instances for development and testing and move our site between virtual machines without worrying about missing dependencies. A related project, [easy mirror](https://github.com/lepbase/easy-mirror), is the result of generalising this approach to simplify setting up a mirror of any [Ensembl](http://ensembl.org) or [Ensembl Genomes](http://ensemblgenomes.org) (including Bacteria, Metazoa, Fungi, Plants and Protists) species, with any proportion of the data, from none to all, hosted locally.

With the setup of an [Ensembl](http://ensembl.org) mirror reduced to four simple steps, we then set about making the import of sequence, gene model and annotation data to a core database similarly straightforward in [easy import](https://github.com/lepbase/easy-import). As we add further datatypes to [Lepbase](http://lepbase.org), we are continuing to extend [easy import](https://github.com/lepbase/easy-import) beyond the core database. Most recently we have added compara import, based on our own orthology pipeline, and are working on variation data.