After months of understanding how to do it in principle [1], but not being able to follow through in practice, we may finally be getting to the point where deploying a dev environment for FutureGateway is possible.

“Wait, what? Wasn’t there always a set of scripts for installing the FutureGateway components?”, I hear you ask. To that I counter with a roll of the eyes and a sigh.

What’s wrong with the setup scripts?

  1. They are scripts
  2. I rest my case [2][3]
Thanks to Valdhaus for putting it more eloquently than I ever could.

OK, in the interest of maintaining some form of objectivity, let’s discuss for a second why the scripts in the setup repo are inadequate for deploying a dev environment. The first point comes from considering what a dev environment is. When we ran our hackfests [4], we had an event site [5] intended to help the participants prepare for the hack. This included a set of warmup exercises [6] for them to do before coming, some of which involved making calls against an existing set of services. These were all deployed in the hackfest preparation CI environment [7], which we wanted to consist of a set of services acting as a dev environment. The idea was that if participants could make their code pass tests in our dev environment, it would probably work in the real world.

To be honest, we never got around to a fully automated deployment of the dev environment, even though we did provide a set of services which could be tested against. This meant that we couldn’t (and at the time of writing, we still can’t) fully reproduce the entire environment of the Open Science Platform - but we’re working on it. Reproducibility means achieving the same state even if other factors, such as the operating environment, change. Herein lies the problem with the shell scripts produced for the FutureGateway deployment - they only work in one environment. Furthermore, they only work if you make a whole bunch of assumptions about that environment.

Some improvement was made in making the installation of the components more stable, by “porting” [8] the shell scripts to Ansible playbooks.
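To make the difference between "ported" and properly rewritten concrete, here is a minimal sketch of the same step written both ways. It is not taken from the actual playbooks: the package and the `apache_package_name` variable are hypothetical, chosen only to illustrate the pattern.

```yaml
---
# Sketch only: the package and the apache_package_name variable are
# hypothetical and do not come from the FutureGateway playbooks.
# The two tasks are alternatives shown side by side, not steps to run together.
- name: Two ways to install the same package
  hosts: all
  become: true
  tasks:
    # "Ported" style: a shell command wrapped in a task. It assumes a
    # Debian-like host with apt-get and reports "changed" on every run.
    - name: Install Apache by running the package manager directly
      command: apt-get install -y apache2

    # Declarative style: describe the desired state and let the module
    # pick the package manager; a second run reports "ok", not "changed".
    - name: Ensure Apache is present
      package:
        name: "{{ apache_package_name }}"
        state: present
```

The second task still needs the right package name per distribution, but that assumption now lives in one variable instead of being buried in a command line.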

The Ansible playbooks could use a massive command: purge and a bit of a rewrite, but at least they give a good starting point for writing a reproducible orchestration of a dev environment. The first attempt to do this was with Ansible Container [9] - but that attempt proved somewhat premature. I moved on to writing a playbook which explicitly uses the docker_container module. Now, this is not what we want, because we’re still forcing the dev environment to be based on Docker, and we’re assuming that we have access to the Docker endpoint - both of these are pretty big assumptions! - but it’s still a reduction in the scope of our assumptions, I think. Most of all, because it allows us to push Docker images [10]. Another major point is that we start from existing, well-tested images and extend them with our specific configurations.

So, now we have:

  1. A playbook that creates three containers and uses the docker connection to configure them, applying three roles: AAROC.fg-api, AAROC.fg-db, AAROC.fg-ge
  2. The AAROC.fg-api role starts from the Apache image providing httpd and applies the WSGI Python module and the FutureGateway API (via the python flask script)
  3. The AAROC.fg-db role starts from the MySQL community image providing the mysql database, and imports several necessary databases and their schema [11]
  4. The AAROC.fg-ge role starts from the official Tomcat image and imports the various jars and applications necessary to provide the GridEngine SAGA and OCCI connectors.
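The structure described in the list above can be sketched roughly as follows. This is an illustration of the pattern, not the actual playbook: the container names, image tags and the `fg_db_root_password` variable are assumptions, and only the module, the connection type and the role names come from the description above. On recent Ansible releases the `docker_container` module and the `docker` connection plugin live in the `community.docker` collection.

```yaml
---
# Sketch only. Container names, image tags and fg_db_root_password are
# assumptions for illustration, not values from the real playbook.
- name: Start one container per FutureGateway service
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    fg_containers:
      - name: fg-api
        image: "httpd:2.4"
      - name: fg-db
        image: "mysql:5.7"
        env:
          # The official MySQL image refuses to start without this.
          MYSQL_ROOT_PASSWORD: "{{ fg_db_root_password }}"
      - name: fg-ge
        image: "tomcat:8"
  tasks:
    - name: Launch the containers from their upstream base images
      docker_container:
        name: "{{ item.name }}"
        image: "{{ item.image }}"
        env: "{{ item.env | default(omit) }}"
        state: started
      loop: "{{ fg_containers }}"

    - name: Add each container to the in-memory inventory
      add_host:
        name: "{{ item.name }}"
        # Talk to the container through the Docker endpoint, not SSH.
        ansible_connection: docker
      loop: "{{ fg_containers }}"

- name: Configure the API container
  hosts: fg-api
  roles:
    - AAROC.fg-api

- name: Configure the database container
  hosts: fg-db
  roles:
    - AAROC.fg-db

- name: Configure the grid engine container
  hosts: fg-ge
  roles:
    - AAROC.fg-ge
```

One caveat with this approach: most Ansible modules need a Python interpreter inside the target, and minimal upstream images often ship without one, so the roles (or a preliminary `raw` task) would have to take care of that before anything else runs.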

The work is still ongoing, and some major refactoring has to be done in order to improve the grouping of variables, password protection, etc. Another important factor which was glossed over initially is data persistence: volume claims for the databases, but also for user data (e.g. input/output sandboxes). This will come in a later release.
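For the database, one possible direction is a named Docker volume mounted on the MySQL data directory, so that the data survives the container being removed and recreated. The sketch below is an assumption about how that could look, not the design that will eventually ship; the volume name, image tag and password variable are hypothetical.

```yaml
---
# Sketch only: the volume name, image tag and fg_db_root_password
# are assumptions, not part of the current playbook.
- name: Start the database container with persistent storage
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Keep the MySQL data directory in a named Docker volume
      docker_container:
        name: fg-db
        image: "mysql:5.7"
        state: started
        env:
          MYSQL_ROOT_PASSWORD: "{{ fg_db_root_password }}"
        volumes:
          # /var/lib/mysql is where the official image stores its data.
          - "fg_db_data:/var/lib/mysql"
```

The same pattern would apply to the user sandboxes, with a volume mounted wherever the API container keeps them.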

What this does now, though, is provide a far better understanding of how to compose the services necessary for the FutureGateway in the Open Science Platform, making it easier for science gateway developers to set up their development environments locally.

References and Footnotes

  1. I was so naive I even wrote this hilariously triumphant blog post back in September last year. 

  2. We make this point in the “Setting the Stage” part of the DevOps Bootcamp 

  3. Bruce Becker, Chris Rohrer, & Marco Fargetta. (2017, January). AAROC/AnsibleBootCamp: Ansible BootCamp - Entebbe. Zenodo. http://doi.org/10.5281/zenodo.242394 

  4. The Sci-GaIA project ran three e-Research Hackfests during the course of the project, in Catania, Lagos and Addis.

  5. Bruce Becker. (2016, December 18). AAROC/hackfest-site: e-Research Hackfest Website : Lagos. Zenodo. http://doi.org/10.5281/zenodo.208217 

  6. Bruce Becker, & Mario Torrisi. (2016, December 18). AAROC/hackfest-warmups: e-Research Hackfest Warmups : Lagos. Zenodo. http://doi.org/10.5281/zenodo.208218 

  7. Bruce Becker, & Mario Torrisi. (2016, December 18). AAROC/e-Research-Hackfest-prep: e-Research Hackfest Preparation : Lagos. Zenodo. http://doi.org/10.5281/zenodo.208216 

  8. A ~600 line all-in-one playbook consisting of shell:s and command:s is a pretty good example of “you’re doing it wrong.” 

  9. See the previous post 

  10. We’ve pushed the aaroc/fg_api, aaroc/fg_db and aaroc/fg_ge repositories for the API, user tracking and events database and grid engine with SAGA and OCCI plugins respectively. 

  11. We still need to properly configure the data volumes which will contain these databases, so that we can make them portable with Ansible Container later.