This was originally a short introduction to a panel discussion on training researchers for e-Science, at the e-Research Africa conference 2014. It is not meant to be a comprehensive critique of the state of affairs, but rather to evoke imagery and metaphors to act as discussion points. The opinions and rhetorical questions raised are neither complete nor institutional representations of fact, but opinions and reflections of the author.

The panel consisted of Peter van Heusden (University of the Western Cape), Jonah Duckles (University of Oklahoma), Anelda van der Walt (University of Cape Town) and James Hetherington (University College London).

Images are courtesy of the Saint-Exupéry Foundation. The thoughts revolve around quotes which I’ve selected from the book “Terre des Hommes”.


What land awaits us ?

The investments in e-Infrastructure have been astounding, and the returns on those investments have exceeded the expectations of many. We scientists are used to change - indeed, we are the drivers of change in many cases - but we are still human beings. We still have our need for comfort, even though we are driven by curiosity. We still want our tools to make sense to us, and to adapt to our needs, not vice-versa. Let us also not forget that the university learning environment, as a general rule, is very conservative compared to the “wild” out there in the “real world”. These phrases are in quotes because all depends on what one is really referring to, and in this case we are referring to the environment in which research is conducted. At times competitive, at times collaborative, it does not accept mere results as the final arbiter of success: the method is crucial too.

There is sometimes a tension then between getting things done and getting things done right.

The researcher as pilot

We come then to a metaphor :

"L'homme se découvre quand il se mesure avec l'obstacle."
("Man discovers himself when he measures himself against the obstacle.")
- A. de Saint-Exupéry

A dejected pilot stands beside his crashed airplane. He, like us, is an explorer. He has chosen to fly to new lands, and has crashed in a desert - land unknown to human feet. He will eventually repair his airplane, or be rescued by his fellow explorers, but has only the tools at hand to survive in the meantime. Not knowing what awaits him, he has taken on that airplane what one might call a “general solution” - the basic tools which can be used in any situation, given enough ingenuity and patience. A hammer, a radio transmitter, a wrench, a compass, a blowtorch, a warm jacket, a rifle (let us not forget that he has crashed in a very hostile territory).

These are his tools for disaster recovery, but it is above all his aeroplane that is his tool for discovery. That aeroplane seems, to those who will follow him, an inhuman device. They look back with nostalgia to a more “civilised” time, when one voyaged by boat or by horse. They are loath to board it and do so only in the direst of need. That aeroplane, after all, crashed in the desert ! Yes, but had it not crashed, certain discoveries would not have been made. In that crash, man measures himself against an obstacle, and in doing so, discovers…

The infrastructure as aeroplane

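[…] la vie du passé nous semble mieux répondre à notre nature, pour la seule raison qu’elle répond mieux à notre langage.
([…] the life of the past seems to answer better to our nature, for the sole reason that it answers better to our language.)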

We are presented with an instrument of discovery, but it seems to come from “the future”. Its controls and its internal functioning are exposed and seem foreign to us, and we often find ourselves unable to find the terms needed to express our need for it or our usage of it. Sometimes, this makes it hard to ask for newer, or better aeroplanes… after all, if it is only to deliver packages, can we not already do that with a boat or a train ? And these don’t crash in the middle of the desert ! Yes, but aeroplanes go where boats or trains cannot, they see from a different angle, and from their altitude, the world takes on a vastly different aspect.

Saint-Exupéry’s point here is that we are beings bound by the language we speak, the terms we use - and often our reticence to adopt new ways is due to our sense of being at a loss for words. It is in our nature to explore, but in exploring, we need to remind ourselves that it is the ends, not the means, which drive us.

Nous sommes tous de jeunes barbares que nos jouets neufs émerveillent encore. […]
(We are all young barbarians, still marvelling at our new toys. […])

Yes, we, these explorers, do tend to get blinded by the shiny. We forget that language is important. We talk about Terabytes and nanoseconds and all manner of technical jargon that does not answer the researcher’s question of :

What am I doing here ?

The infrastructure as home

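Pour le colonial qui fonde un empire, le sens de la vie est de conquérir. Le soldat méprise le colon, mais le but de cette conquête n’était-il pas l’établissement de ce colon ? […]
(For the colonial who founds an empire, the meaning of life is to conquer. The soldier despises the settler, but was not the goal of this conquest the establishment of that settler ? […])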

We forget that the aeroplane was made for a reason, which was to arrive somewhere. The very first ones were built essentially for the pilot, whose driving passion is the discovery of the unknown. But later ones were built for passengers, whose goal is simply to arrive. Their passion is the relative, or the business, or the exotic fruit, at the end of the journey, and for them the aeroplane, although perhaps intriguing, is not of primary importance.

How was the aeroplane built ? The very first ones had all of the tubes, knobs, dials and gauges in plain sight (if one will forgive the homonymic pun), because the pilot needed to know every detail of the internal working of the aeroplane. However, later models hid these details from pilots and, of course, from passengers, since we came to learn the habits and workings of the machine. We made it less mysterious, and more functional, to respond better to our human (or in this case, scientific) rather than technical needs.

What home awaits us ?

La vérité, pour l’un, fut de bâtir,
[…] elle est, pour l’autre, de l’habiter.
(Truth, for the one, was to build; […] for the other, it is to inhabit.)

Later, once this unknown desert had been traversed by man, we built roads, way-points, service stations. These were the needs of the pilots - the crazy scientists who, freed of the shackles of earth, roam, explore and discover, crash after crash, the secrets of the desert. However, who comes after them ? In this drawn-out analogy, we come finally to the point: those who come after them are, metaphorically speaking, the students. The pioneering pilots and engineers, having built an infrastructure, may move on to deeper unknowns (we still have entire planets to explore, after all), but what about those who come to inhabit this new land ? They who will inherit, cultivate and improve it ?

If the land we have discovered and are exploring is to become a fruitful place for science and collaboration, it needs more than roads, waypoints and service stations. It needs schools, markets, exchanges, town-halls, ports, police stations, civil protection, banks. In the world of e-Science, how far have we come in building these - the vestiges of a place we might call home ?

It’s time to get real

OK, enough of the metaphor, what exactly is the point of this parable ? First of all, it is indeed just an anchor for some thinking about what we are doing with e-Infrastructure. In building the tools, we sometimes become, as I mentioned before, “blinded by the shiny”. This is detrimental, since no matter how wonderful a piece of hardware or software may be, if it doesn’t serve the intended purpose, it is an exercise in futility. We need to remember that the final judges of this utility are not us pilots or engineers, but the users and learners that follow us. Engagement and empathy with them are important, all the way down the line.

The short presentation on which this article is based was used as the introduction to a panel discussion on training the next generation of researchers to be comfortable in the home we are building for them. Happy e-Scientists in a cosy e-Infrastructure, if you will. The discussion that followed made several points clear, foremost amongst them that the important thing to teach is method as well as means. Yes, means are important: some problems are intractable without HPC or large data sets. Some problems cannot be tackled alone and a means of sharing resources is needed. However, let’s remind ourselves that it’s called science because it follows the scientific method, not because it’s done with scientific tools. Indeed, “scientific tools” is almost a contradiction in terms.

The services we build should enable and encourage the hallmarks of the proper practice of science - reproducibility, transparency, rigor - as well as those of good academic practice: referencing prior work, openness, impartiality. Furthermore, they should define a nurturing and enabling environment - one which encourages innovation where necessary, and re-use where possible; one which does not erect unnecessary barriers to entry or impede usability, but rather responds with relevant solutions to the problems expressed by its own users; and of course one which has a means of listening to those users, and a platform for them to express their needs, desires and frustrations, so that it may provide answers to their question of “What am I doing here ?”

Between the dirt and the stars.

Efficiency and scale are important, but their extremes respond only to the needs of a few who are most likely able to solve their own problems. These few will build better aeroplanes, and discover new lands. Their extreme needs and unexpected discoveries are what gets the public inspired by the majesty of science; they reveal the glow of the stars with a proximity that dazzles, motivates and yet seems comprehensible. However, they are not alone - they stand on those ever-present giants, but those giants are surrounded by the rest of us. In a scientific version of ubuntu, they become giants through us. We cannot, therefore, focus solely on the “stars” - the dazzling, cutting-edge, big-science projects such as the LHC or SKA, although we all aspire to these.

Neither can we remain mired in the “dirt” of our earthly existence - the tools and methods that we are taught in high-school or our undergraduate courses. Or rather, we cannot remain satisfied with this barrenness. The seeds planted in the classroom should be adapted to the terrain in which they will grow. The methods we teach should be commensurate with the environment in which they will be put into action, and vice versa. We need those schools, markets, exchanges, town-halls, ports, police stations, civil protection and banks to turn our discovered land into an inhabitable place.

Carpenters : build us a home !

Another feature of the e-Research Africa 2014 conference, at which this panel discussion was held, was the presence and message of the Software Carpentry movement. We ran a very successful and well-attended Software Carpentry Bootcamp at the University of Cape Town just after the conference, supported by two representatives of the movement - James Hetherington and Jonah Duckles - who came down for the event. They provided their view on how and why training the next generation of researchers should be more “human”. New colonists to this land of discovery should have those first tools to help them grow and plant their own seeds, build their own houses, and then help those who come after them. The role of “software carpenter” comes close to this. They are not engineers, they are not architects, but they can build and use the common tools to get things done.

The Fertile Delta

In a very concrete way, we transmitted these ideas to our Software Carpentry bootcamp attendees. We showed them :

  1. how to use the shell and scripts to execute and automate their work;
  2. how to use Python and IPython to build more powerful and flexible workflows;
  3. how to use Git to describe, curate and share their work, artifacts and progress.

These simple tools, already known and used by most of them, are far more powerful and useful when they are used in a fertile e-Infrastructure. Such an infrastructure is not made solely of massive computing and data resources, but also of many adjunct services which, while the infrastructure is being built, may not seem necessary or urgent.
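To give a flavour of the second item in that list, here is a minimal sketch of the kind of small, repeatable workflow one might build in Python. It is not taken from the bootcamp material: the function names `summarise` and `summarise_all` are hypothetical, and it assumes a directory of header-less CSV files with one number in the first column of each row.

```python
import csv
import statistics
from pathlib import Path


def summarise(path):
    """Return (file name, number of values, mean) for one CSV file.

    Assumes no header row and a numeric value in the first column.
    The mean is None when the file contains no values.
    """
    path = Path(path)
    with open(path, newline="") as handle:
        # Blank lines come back from csv.reader as empty rows; skip them.
        values = [float(row[0]) for row in csv.reader(handle) if row]
    mean = statistics.mean(values) if values else None
    return path.name, len(values), mean


def summarise_all(data_dir):
    """Summarise every .csv file in data_dir, in sorted file-name order."""
    return [summarise(p) for p in sorted(Path(data_dir).glob("*.csv"))]
```

Calling `summarise_all` on a data directory returns one tuple per file. A script like this could equally be driven from a shell loop and kept under version control with Git, which is where the three items in the list start to reinforce each other.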

The mirage of plenty

What kind of learning environment do we want for our future generations ? And what kind are we actually building ? University courses often leave it up to the student to discover and adapt to research infrastructure, and do not themselves adapt to the current realities of the services at hand. There is not enough feedback between infrastructure providers and teachers (referring here explicitly to the undergraduate level), such that when students graduate, they are left with a mirage. In an apparent oasis with a wealth of resources (take the improvements in networking infrastructure provided by SANREN for example), they find themselves, in reality, stuck with an albatross around their neck. To paraphrase Samuel Taylor Coleridge (or Iron Maiden, depending on how much of a metalhead you may be),

Cores, cores, everywhere,
And all the boards did shrink;
Data, data, every where,
Nor any drop to drink

- Not Samuel Taylor Coleridge, not “The Rime of the Ancient Mariner”

They have been told that such marvels exist, but how to use them, and what is actually on offer, turn out to be somewhat different. In order to build a “fertile delta”, there needs to be better integration between training, research and innovation, especially where these reinforce one another.

In an attempt to address these shortcomings and recognising the maturity of the investments in South Africa, we have tried to develop further services which act as “force multipliers” for the underlying infrastructures - the “vestiges of civilisation” alluded to before. We are, of course, by no means alone.

A carpenter in the steel palace

What about the architects and artisans of our environment ? Sure, the skyscrapers and highways - the HPC centres, the NREN, the huge data centres, the core infrastructure services - have been built. Every pinnacle needs a sturdy base, but building beautiful wooden panels or soft embroidered chairs in a “big iron” palace is going to make it far more liveable. Is what we are building accessible to those who would extend, improve and contribute back to it ? Take the “data exchange” for example. Critical data of national importance will be stored in it, and in principle it will be available to all, but if services for annotation, discovery, delivery and attribution cannot interoperate with it, that data will end up only being used by the privileged few with the key to the bank.

Take the case of application development. Often, applications start off as simple experiments, and evolve from a simplistic implementation. If the infrastructure is too static, slow or cumbersome to allow this evolution, it will discourage adoption, no matter how powerful it is. The tools that hackers and developers need to be productive : repositories, version control, testing and integration, deployment, issue tracking, documentation, etc. all need to work nicely with the infrastructure on which they will run. Since there are so many of these, the infrastructure needs to make provision for general, open interfaces to the general cases, as far as possible.

We built this city… but all we got was this lousy t-shirt.

Finally, what about the reward mechanisms ? It takes an army of thousands these days to get the big science projects off the ground. Even modest projects have a stratification of roles and responsibilities, all of which require justification and funding. We’ve seen the research infrastructures themselves funding their own e-Infrastructures - they are pilots, engineers, explorers. We now have an alternative model, that of an “e-Infrastructure commons”, whereby massive investments in infrastructure are made, for the use of the scientific commons. These computers, data stores, networks, instruments, etc. had to be built by someone; the services delivered over them to research communities had to be developed and adapted by others. The usual academic and commercial feedback and reward mechanisms are perhaps not well-aligned with the goals of this commons community. How many points do you get for producing a paper ? How many for an application that others use ? How many for collecting and publishing a data set that allows re-use and re-mixing ? How many points does a CSIRT engineer get for ensuring that the entire ecosystem is safe and secure ?

The reward mechanisms in our new land need to take into account the fact that there is great worth in developing something for the common good, even though it may not be immediately measurable.

Wrap it up, dude, dinner’s waiting.

For too many scientists, e-Infrastructure, for all the promise it holds, remains an obstacle. The obstacle that the providers of such infrastructure and services are faced with is one of delivery and sustainability.

We are still in an age of discovery, and there is great temptation to think that the advances we have made are due to our new tools, rather than our own ingenuity. As we mature in this new land, we may come to find ourselves with nothing but a shiny, expensive, high-performance white elephant, rather than a productive means of research and scholarly communication. When bringing new colonists to this new land, through training and communication, we should remain sensitive to their needs and ensure that we continue to teach methods - adapted for this new land - as well as tools.

Finally, we need to remain faithful to our goals - those of research and discovery, but also training and innovation - and ensure camaraderie between conquerors and colonists of this new world.

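Nous étions autrefois en contact avec une usine compliquée […] Au-delà de l’outil, et à travers lui, c’est la vieille nature que nous retrouvons, celle du jardinier, du navigateur, ou du poète.
(We were once in contact with a complicated factory […] Beyond the tool, and through it, it is the old nature that we find again, that of the gardener, the navigator, or the poet.)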

Images and quotes copyright Fondation Saint Exupery
