Step 1.2: Setup database connections
Multiple .ini files
All configuration options are stored in .ini files. This ensures reproducibility, as all options must be saved before a script is executed, and avoids the need for numerous command-line flags. Where practical, different scripts use common .ini files, reading only those parameters that are relevant. However, the options for easy import can be conceptually divided into four distinct groups, and it is convenient to keep the options for each of these groups in a separate .ini file:
- Server/Ensembl instance configuration
- Database hosting
- Core import (genome assembly/annotation specific data)
- Compara import (multiple assembly comparative data)
This means that database connection data are repeated across multiple .ini files, so care must be taken to use the correct template when making changes to the default settings.
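Because connection details are repeated, a quick consistency check across files can catch a setting that was changed in one template but not another. The following is a minimal sketch, not part of the easy import scripts: the helper names ini_get and check_db_host are hypothetical, and it assumes that every file passed in has a [DATABASE] section containing DB_HOST, as setup-db.ini does.

```shell
# ini_get FILE SECTION KEY
# Print the value of KEY inside [SECTION] of FILE (hypothetical helper).
ini_get() {
  awk -v section="$2" -v key="$3" '
    /^[ \t]*\[/ {                      # section header such as [DATABASE]
      sub(/^[ \t]*\[/, "")
      sub(/\].*$/, "")
      current = $0
      next
    }
    current == section {
      eq = index($0, "=")
      if (eq == 0) next                # not a KEY = VALUE line
      name = substr($0, 1, eq - 1)
      value = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (name == key) { print value; exit }
    }
  ' "$1"
}

# check_db_host FILE...
# Print DB_HOST for each file; return non-zero if the values differ.
check_db_host() {
  first=""
  status=0
  for file in "$@"; do
    host=$(ini_get "$file" DATABASE DB_HOST)
    printf '%s\t%s\n' "$file" "$host"
    if [ -z "$first" ]; then
      first=$host
    elif [ "$host" != "$first" ]; then
      status=1
    fi
  done
  return "$status"
}
```

Run from ~/ei/em, a call such as check_db_host ../conf/*.ini would list the host recorded in each file, assuming all templates live in ../conf alongside setup-db.ini.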
To host an Ensembl mirror with remotely hosted data, at least one local database must be created with write access. To host additional data locally and to allow data import, additional users/databases must be created. These instructions assume that both the webserver and database are on localhost. Use of separate hosts is supported (in which case this script may be run on a different host to the rest of Stage 1) but will require changes to /etc/mysql/my.cnf to allow external connections.
cd ~/ei/em
./setup-databases.sh ../conf/setup-db.ini
Configuration options
setup-db.ini provides the following options:
[DATABASE]
DB_USER = anonymous
DB_PASS =
DB_SESSION_USER = ensrw
DB_SESSION_PASS = ensrw
DB_IMPORT_USER = importer
DB_IMPORT_PASSWORD = importpassword
DB_ROOT_USER = root
DB_ROOT_PASSWORD = secretpassword
DB_PORT = 3306
DB_HOST = localhost
This section holds the root user connection details and the user names (and passwords) of the database users to be created. DB_USER has SELECT permissions only and will be used as the 'ro' user for the Ensembl instance. DB_SESSION_USER has permissions on the ensembl_accounts database and will be used as the 'rw' user for the Ensembl instance. DB_IMPORT_USER has more extensive permissions on all databases and will be used during Core and Compara Import.
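To make the three roles concrete, the sketch below prints (without executing) the kind of GRANT statements that would produce them. Only the SELECT-only grant for DB_USER is stated in the text above. The privilege lists for DB_SESSION_USER and DB_IMPORT_USER are assumptions for illustration, print_grants is a hypothetical helper, and the statements actually issued by setup-databases.sh may differ.

```shell
# print_grants HOST RO_USER RW_USER IMPORT_USER
# Print illustrative GRANT statements; nothing is sent to the server.
print_grants() {
  host=$1 ro_user=$2 rw_user=$3 import_user=$4
  cat <<SQL
GRANT SELECT ON *.* TO '$ro_user'@'$host';
GRANT SELECT, INSERT, UPDATE, DELETE ON ensembl_accounts.* TO '$rw_user'@'$host';
GRANT ALL PRIVILEGES ON *.* TO '$import_user'@'$host';
SQL
}
```

Calling print_grants localhost anonymous ensrw importer shows the statements for the default values. On recent MySQL versions each user must already exist (CREATE USER) before a GRANT will succeed.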
[WEBSITE]
ENSEMBL_WEBSITE_HOST = localhost
The name of the ENSEMBL_WEBSITE_HOST host (on which Step 1.1, etc. are run) is used when setting up the database users. If this is anything other than localhost then changes will be required to /etc/mysql/my.cnf to support external connections.
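One setting that commonly restricts MySQL to local connections is bind-address in the [mysqld] section. The sketch below reads that value from a config file so it can be checked before pointing a remote webserver at the database. The helper names are hypothetical, and it only inspects the single file given: a value set in an included file, or other settings that block remote access, would not be seen.

```shell
# bind_address FILE
# Print the last bind-address value found in FILE, if any.
bind_address() {
  awk -F= '
    /^[ \t]*bind[-_]address[ \t]*=/ {
      value = $2
      sub(/#.*/, "", value)            # drop a trailing comment
      gsub(/[ \t]/, "", value)
      found = value
    }
    END { if (found != "") print found }
  ' "$1"
}

# is_loopback_only FILE
# Succeed if FILE binds the server to a loopback address only.
is_loopback_only() {
  case $(bind_address "$1") in
    127.*|localhost|::1) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_loopback_only /etc/mysql/my.cnf && echo "external connections are not accepted" gives a quick warning when the server is still bound to the loopback interface.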
[DATA_SOURCE]
ENSEMBL_DB_URL = ftp://ftp.ensembl.org/pub/current_mysql/
ENSEMBL_DB_REPLACE =
ENSEMBL_DBS = [ ensembl_accounts ]
EG_DB_URL = ftp://ftp.ensemblgenomes.org/pub/current/pan_ensembl/mysql/
EG_DB_REPLACE = 1
EG_DBS = [ ncbi_taxonomy ensembl_website_84 ]
SPECIES_DB_URL = ftp://ftp.ensemblgenomes.org/pub/current/metazoa/mysql/
SPECIES_DB_REPLACE =
SPECIES_DB_AUTO_EXPAND =
SPECIES_DBS = [ bombyx_mori_core_31_84_1 ]
MISC_DB_URL =
MISC_DB_REPLACE =
MISC_DBS =
Locations and names of database dumps to fetch and load locally.
- ENSEMBL_DB_URL - the URL containing the Ensembl database dumps
- ENSEMBL_DB_REPLACE - a flag to specify whether to overwrite databases that already exist on the DB_HOST
- ENSEMBL_DBS - a space-separated list of database dump names in square brackets. ensembl_accounts is required; all others are optional
- The equivalent variables may be set for EG_DB_URL to fetch and download EnsemblGenomes database dumps and for MISC_DB_URL to support situations where the required databases are spread across multiple hosts.
- An additional variable may be set for species databases, SPECIES_DB_AUTO_EXPAND - a space-separated list of database types to use as replacement strings for core to facilitate downloading multiple database types for each species in SPECIES_DBS
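The effect of SPECIES_DB_AUTO_EXPAND is easiest to see with a worked example. The sketch below is an illustration of the substitution described above, not the code used by setup-databases.sh. It assumes that the value is written in the same bracketed, space-separated form as SPECIES_DBS, that the original core database is kept in the result, and that the types variation and funcgen are only placeholders (no claim is made that such dumps exist for this species). The helper names are hypothetical.

```shell
# list_items "[ a b c ]"
# Print the items of a bracketed, space-separated list, one per line.
list_items() {
  printf '%s\n' "$1" |
    awk '{ for (i = 1; i <= NF; i++) if ($i != "[" && $i != "]") print $i }'
}

# expand_species_dbs SPECIES_DBS AUTO_EXPAND
# Print each core database name, followed by one name per extra type
# with "core" replaced by that type.
expand_species_dbs() {
  for db in $(list_items "$1"); do
    printf '%s\n' "$db"
    for type in $(list_items "$2"); do
      printf '%s\n' "$db" | sed "s/_core_/_${type}_/"
    done
  done
}

# Example:
#   expand_species_dbs "[ bombyx_mori_core_31_84_1 ]" "[ variation funcgen ]"
# prints:
#   bombyx_mori_core_31_84_1
#   bombyx_mori_variation_31_84_1
#   bombyx_mori_funcgen_31_84_1
```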