create a datavault-storage abstraction system "data-as-a-service"
start-simple: "grow-as-you-go"
1 icat-server (icat-service + postgresql database local vsan)
OS: CentOS7
3 resource-servers (with 2 local mounts each)
3 datacenters
2 replicas of the data
1 replica in one datacenter, the other replica in a different datacenter
- encrypt storage (because cloudstorage)
- all servers are esx vm's (rug-cloud)
- all storage is vmware datastore (rug-cloud)
- all irods-servers/clients connect via SSL
- authentication via ldap
connection from peregrine to irods-servers is 10 Gb ethernet
icat-server: server containing metadata database
irods-resource-server: server with mountpoint storing data
provider: icat-server
consumer: irods-resource server
collections: directories
objects: files
peregrine: our HPC cluster in Groningen
irods installation on centos7 2019:
the icat-server:
- basic/normal configuration
- disable selinux
- enable/configure firewall
- set/enable ntpd
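The firewall step above is not spelled out in these notes. A minimal sketch, assuming firewalld (the CentOS 7 default) and the default iRODS ports that the setup script proposes further down (1247, 1248, 20000-20199). The function name is hypothetical, and it only prints the commands so they can be reviewed first:

```shell
# Print (do not run) the firewalld commands that open the iRODS ports.
# Assumption: firewalld is in use and the default iRODS ports are kept.
irods_firewall_cmds() {
    for port in 1247 1248 20000-20199; do
        printf 'firewall-cmd --permanent --add-port=%s/tcp\n' "$port"
    done
    printf 'firewall-cmd --reload\n'
}

irods_firewall_cmds
```

Once the output looks right, it can be piped to a root shell (irods_firewall_cmds | sh).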
# rpm --import
# wget -qO - | sudo tee /etc/yum.repos.d/renci-irods.yum.repo
# yum install irods-server irods-database-plugin-postgres
# yum install postgresql-server
# postgresql-setup initdb
Initializing database ... OK
# systemctl start postgresql
# su - postgres
Last login: Fri Oct 26 11:30:44 CEST 2018 on pts/0
$ psql
psql (9.2.24)
Type "help" for help.
postgres=# CREATE USER irods WITH PASSWORD 'xxxxx';
postgres=# CREATE DATABASE "ICAT";
postgres=# GRANT ALL PRIVILEGES ON DATABASE "ICAT" TO irods;
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 ICAT      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | irods=CTc/postgres
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)
# vi /var/lib/pgsql/data/pg_hba.conf:
host all all md5
host all all md5
# systemctl restart postgresql
# python /var/lib/irods/scripts/
The iRODS service account name needs to be defined.
iRODS user [irods]:
iRODS group [irods]:
| Setting up the service account |
Existing Group Detected: irods
Existing Account Detected: irods
Setting owner of /var/lib/irods to irods:irods
Setting owner of /etc/irods to irods:irods
iRODS server's role:
1. provider
2. consumer
Please select a number or choose 0 to enter a new value [1]:
Updating /etc/irods/server_config.json...
| Configuring the database communications |
You are configuring an iRODS database plugin. The iRODS server cannot be started until its database has been properly configured.
ODBC driver for postgres [PostgreSQL]:
Database server's hostname or IP address [localhost]:
Database server's port [5432]:
Database name [ICAT]:
Database username [irods]:
Database Type: postgres
ODBC Driver: PostgreSQL
Database Host: localhost
Database Port: 5432
Database Name: ICAT
Database User: irods
Please confirm [yes]:
Database password:
Updating /etc/irods/server_config.json...
Listing database tables...
Salt for passwords stored in the database:
Updating /etc/irods/server_config.json...
| Configuring the server options |
iRODS server's zone name [tempZone]: testZone
iRODS server's port [1247]:
iRODS port range (begin) [20000]:
iRODS port range (end) [20199]:
Control Plane port [1248]:
Schema Validation Base URI (or off) [file:///var/lib/irods/configuration_schemas]:
iRODS server's administrator username [rods]: irods
Zone name: testZone
iRODS server port: 1247
iRODS port range (begin): 20000
iRODS port range (end): 20199
Control plane port: 1248
Schema validation base URI: file:///var/lib/irods/configuration_schemas
iRODS server administrator: irods
Please confirm [yes]: yes
iRODS server's zone key:
Zone key must be at least 1 character in length.
iRODS server's zone key:
iRODS server's negotiation key (32 characters):
Negotiation key must be exactly 32 characters in length.
iRODS server's negotiation key (32 characters):
Control Plane key (32 characters):
Updating /etc/irods/server_config.json...
| Setting up the client environment |
iRODS server's administrator password:
Updating /var/lib/irods/.irods/irods_environment.json...
| Setting up default vault |
iRODS Vault directory [/var/lib/irods/Vault]:
| Setting up the database |
Listing database tables...
Creating database tables...
| Starting iRODS... |
Validating [/var/lib/irods/.irods/irods_environment.json]... Success
Validating [/var/lib/irods/VERSION.json]... Success
Validating [/etc/irods/server_config.json]... Success
Validating [/etc/irods/host_access_control_config.json]... Success
Validating [/etc/irods/hosts_config.json]... Success
Ensuring catalog schema is up-to-date...
Updating to schema version 2...
Updating to schema version 3...
Updating to schema version 4...
Updating to schema version 5...
Catalog schema is up-to-date.
Starting iRODS server...
| Attempting test put |
Putting the test file into iRODS...
Getting the test file from iRODS...
Removing the test file from iRODS...
| iRODS is installed and running |
installation of irods-resource-server:
- disable selinux
- enable/configure firewall
- set/enable ntpd
install irods-repository:
# rpm --import
# wget -qO - | sudo tee /etc/yum.repos.d/renci-irods.yum.repo
# yum install epel-release
# yum install irods-server
# python /var/lib/irods/scripts/
in the setup script, choose role 2 (consumer, i.e. resource-server) and give the icat-server as its provider
encrypt storage:
create keyfile:
# echo "some difficult string" >> /etc/keyfile
# chmod 600 /etc/keyfile
# cryptsetup luksFormat -y -v /dev/sdb --key-file /etc/keyfile
# cryptsetup luksFormat -y -v /dev/sdc --key-file /etc/keyfile
open encrypted storage:
# cryptsetup luksOpen /dev/sdb irods01 --key-file /etc/keyfile
# cryptsetup luksOpen /dev/sdc irods02 --key-file /etc/keyfile
format storage:
# mkfs.xfs /dev/mapper/irods01
# mkfs.xfs /dev/mapper/irods02
mount storage:
# mount /dev/mapper/irods01 /mnt/01/
# mount /dev/mapper/irods02 /mnt/02/
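The luksOpen and mount commands above do not survive a reboot. A minimal sketch of matching /etc/crypttab and /etc/fstab entries, assuming the same devices, mapper names, keyfile and mount points as above. The helper name is hypothetical, and it only prints the lines instead of editing any file:

```shell
# Print crypttab and fstab entries for one encrypted volume.
# Arguments: mapper name, block device, mount point.
# Assumption: the keyfile is /etc/keyfile, as created above.
persist_entries() {
    name=$1 dev=$2 mnt=$3
    printf 'crypttab: %s %s /etc/keyfile luks\n' "$name" "$dev"
    printf 'fstab:    /dev/mapper/%s %s xfs defaults 0 0\n' "$name" "$mnt"
}

persist_entries irods01 /dev/sdb /mnt/01
persist_entries irods02 /dev/sdc /mnt/02
```

Device names such as /dev/sdb can change between boots, so referring to the LUKS UUID (from cryptsetup luksUUID) in crypttab is probably the safer choice.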
create resources:
as user irods on any of the irods-servers:
iadmin mkresc ReplA replication
iadmin mkresc ReplB replication
iadmin mkresc ReplC replication
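iadmin mkresc Vol01 unixfilesystem <resource-server hostname>:<vault path>
iadmin mkresc Vol02 unixfilesystem <resource-server hostname>:<vault path>
iadmin mkresc Vol11 unixfilesystem <resource-server hostname>:<vault path>
iadmin mkresc Vol12 unixfilesystem <resource-server hostname>:<vault path>
iadmin mkresc Vol21 unixfilesystem <resource-server hostname>:<vault path>
iadmin mkresc Vol22 unixfilesystem <resource-server hostname>:<vault path>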
iadmin addchildtoresc ReplA Vol02
iadmin addchildtoresc ReplA Vol11
iadmin addchildtoresc ReplB Vol01
iadmin addchildtoresc ReplB Vol22
iadmin addchildtoresc ReplC Vol12
iadmin addchildtoresc ReplC Vol21
iadmin mkresc pta passthru
iadmin mkresc ptb passthru
iadmin mkresc ptc passthru
iadmin addchildtoresc pta ReplA
iadmin addchildtoresc ptb ReplB
iadmin addchildtoresc ptc ReplC
iadmin mkresc Randy random
iadmin addchildtoresc Randy pta
iadmin addchildtoresc Randy ptb
iadmin addchildtoresc Randy ptc
iadmin mkresc pt_top passthru
iadmin addchildtoresc pt_top Randy
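The pairing above gives every replication resource two volumes with a different first digit. A small sketch that checks this property, assuming (the notes do not state this) that in VolXY the X is the resource-server number and Y the mount number. The function name and the input format are hypothetical:

```shell
# Read lines of the form "<replication resource> <volume> <volume>" and
# report any pair whose volumes sit on the same server.
# Assumption: in VolXY, X is the server number and Y the mount number.
check_pairs() {
    while read -r repl a b; do
        sa=$(printf '%s' "$a" | cut -c4)
        sb=$(printf '%s' "$b" | cut -c4)
        if [ "$sa" = "$sb" ]; then
            echo "BAD  $repl: $a and $b share server $sa"
        else
            echo "OK   $repl: $a (server $sa) + $b (server $sb)"
        fi
    done
}

check_pairs <<'EOF'
ReplA Vol02 Vol11
ReplB Vol01 Vol22
ReplC Vol12 Vol21
EOF
```

With this layout, losing one resource-server should still leave one replica of every object reachable.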
p216149@pg-interactive:~ ilsresc
pt_top:passthru
└── Randy:random
    ├── pta:passthru
    │   └── ReplA:replication
    │       ├── Vol02:unixfilesystem
    │       └── Vol11:unixfilesystem
    ├── ptb:passthru
    │   └── ReplB:replication
    │       ├── Vol01:unixfilesystem
    │       └── Vol22:unixfilesystem
    └── ptc:passthru
        └── ReplC:replication
            ├── Vol12:unixfilesystem
            └── Vol21:unixfilesystem
p216149@pg-interactive:~ ils -l
g.j.c.strikw 0 pt_top;Randy;ptb;ReplB;Vol01 515106669 2019-06-13.16:48 & tivo.tar.gz
g.j.c.strikw 1 pt_top;Randy;ptb;ReplB;Vol22 515106669 2019-06-13.16:48 & tivo.tar.gz
file: tivo.tar.gz is stored on Vol01 and on Vol22 (replicated by ReplB resource)
p216149@pg-interactive:~ iput ./package.tar.gz
p216149@pg-interactive:~ ils -l
g.j.c.strikw 0 pt_top;Randy;pta;ReplA;Vol02 36609 2019-07-03.11:24 & package.tar.gz
g.j.c.strikw 1 pt_top;Randy;pta;ReplA;Vol11 36609 2019-07-03.11:24 & package.tar.gz
file: package.tar.gz is stored on Vol02 and on Vol11 (replicated by ReplA resource)
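Listings like the ones above can be checked mechanically. A minimal sketch that counts good replicas per object in ils -l output and reports objects with fewer than two. It assumes the column layout shown above and object names without spaces. The function name is hypothetical, and lonely.tar.gz is a made-up entry for illustration:

```shell
# Count replicas per data object in "ils -l" output read from stdin.
# Only lines whose 6th field is "&" (a good replica) are counted, so an
# object with no good replica at all is not reported by this sketch.
# Prints "<name> <count>" for every object with fewer than 2 replicas.
report_missing_replicas() {
    awk '$6 == "&" { n[$7]++ }
         END { for (f in n) if (n[f] < 2) print f, n[f] }'
}

report_missing_replicas <<'EOF'
  g.j.c.strikw 0 pt_top;Randy;ptb;ReplB;Vol01 515106669 2019-06-13.16:48 & tivo.tar.gz
  g.j.c.strikw 1 pt_top;Randy;ptb;ReplB;Vol22 515106669 2019-06-13.16:48 & tivo.tar.gz
  g.j.c.strikw 0 pt_top;Randy;pta;ReplA;Vol02 36609 2019-07-03.11:24 & lonely.tar.gz
EOF
```

In practice the input would come from ils -l run on one collection at a time, since objects with the same name in different collections would be counted together.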
client-config looks like this:
p216149@pg-interactive:.irods cat irods_environment.json
{
    "irods_client_server_negotiation": "request_server_negotiation",
    "irods_client_server_policy": "CS_NEG_REQUIRE",
    "irods_connection_pool_refresh_time_in_seconds": 300,
    "irods_default_hash_scheme": "SHA256",
    "irods_default_number_of_transfer_threads": 4,
    "irods_default_resource": "pt_top",
    "irods_encryption_algorithm": "AES-256-CBC",
    "irods_encryption_key_size": 32,
    "irods_encryption_num_hash_rounds": 16,
    "irods_encryption_salt_size": 8,
    "irods_host": "",
    "irods_match_hash_policy": "compatible",
    "irods_maximum_size_for_single_buffer_in_megabytes": 32,
    "irods_port": 1247,
    "irods_transfer_buffer_size_for_parallel_transfer_in_megabytes": 4,
    "irods_user_name": "",
    "irods_zone_name": "rug",
    "schema_name": "irods_environment",
    "schema_version": "v3"
}
july 2019:
- icat: daily dump of the PostgreSQL database to /var/backups/, which is backed up daily to our Tivoli TSM system
- resc-servers: daily backup of /mnt/vol<number>/Vault/Trash/ to our Tivoli TSM system
so we only back up the trash, which is usually the data users want back after an accidental deletion
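The daily catalog dump could be scripted roughly as follows. This is a sketch, not the script actually in use: the file naming, the use of pg_dump piped through gzip, and the function names are assumptions. Only the database name ICAT and the target directory /var/backups/ come from these notes:

```shell
# Build the dump file name for a given date (YYYY-MM-DD).
dump_path() {
    printf '/var/backups/icat-%s.sql.gz\n' "$1"
}

# Dump the ICAT database; meant to run as the postgres user from cron.
# Note: a pg_dump failure is hidden by the pipe, so check the file size too.
dump_icat() {
    target=$(dump_path "$(date +%F)")
    pg_dump ICAT | gzip > "$target"
}

# Show where today's dump would be written (dump_icat is not called here).
dump_path "$(date +%F)"
```

One dated file per day keeps older dumps around until they are removed, so some cleanup of /var/backups/ would be needed as well.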
metalnx webfrontend for irods:
checkout software:
$ git clone
create db for metalnx on postgres:
$ (sudo) su - postgres
postgres$ psql
psql> CREATE USER metalnx WITH PASSWORD '<db password metalnx>';
psql> CREATE DATABASE "IRODS-EXT";
psql> GRANT ALL PRIVILEGES ON DATABASE "IRODS-EXT" TO metalnx;
change config:
vi /home/ger/metalnx/metalnx-web/etc/irods-ext/
$ cat<ipaddress icat-server>
irods.zoneName=<your zone-name>
irods.admin.password=<irods admin pass>
# metalnx database settings
db.url=jdbc:postgresql://<ip-address icat-server>:5432/IRODS-EXT
db.password=<db password metalnx>
$ docker run -d -p 8080:8080 -v /home/ger/metalnx/metalnx-web/etc/irods-ext:/etc/irods-ext --add-host hostcomputer: --name metalnx irods/metalnx:latest
Future work:
- clean up the trash regularly (script?)
- build more irods environments/playgrounds to learn/test/play/fun
- set up auditing (AMQP with ELK stack backend)
- set a performance baseline
- find out user needs (budget, storage, performance)
- create replication-check-scripts (check/pinpoint/report missing replicas)
- do some disaster drills/scenarios
- create 2 resource servers in irods on data-handling nodes (Lustre backend, IB network, directly connected to peregrine)
- performance testing (what will be the current bottleneck?)
- add more icat-servers (behind F5 loadbalancer) connected to a separate database (cluster) (icat-scaling)
- create landingzone on peregrine (for irods to pick up files automated)
- compute-to-data, data-to-compute testing
- irods-hpc-testing: integration metadata BeeGFS, integration metadata Lustre, let iRODS read changelogs@metadata
- storage-tiering: tape-archive
- test out this new iput-on-steroids for HPC performance testing/differences
- test with S3 object store as storage-backend (big-data-not-on-filesystem, but big-data-object-storage)
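For the "clean up the trash regularly" item, a possible starting point is irmtrash with an age limit. A minimal sketch, assuming the cleanup runs as a rodsadmin user and that a 30-day retention is acceptable (both are assumptions, as are the function names). It only prints the command:

```shell
# Convert days to minutes, since irmtrash --age takes minutes.
days_to_minutes() {
    echo $(( $1 * 24 * 60 ))
}

# Print the cleanup command for trash older than the given number of days.
# Assumption: run as a rodsadmin user; -M (admin mode) covers all users.
trash_cleanup_cmd() {
    printf 'irmtrash -M --age %s\n' "$(days_to_minutes "$1")"
}

trash_cleanup_cmd 30
```

The printed command could then go into a cron job for the irods service account once the retention period has been agreed on.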