udocker: Docker for users with no privileges

logo

udocker is a user tool to execute simple docker containers in user space, no root privileges required.

Containers and specifically Docker containers have become pervasive in the IT world, either for development of applications, services or platforms, the way to deploy production services, support CI/CD DevOps pipelines and the list continues.

One of the main benefits of using containers is the provisioning of highly customized and isolated environments targeted to execute very specific applications or services.

When executing or instantiating Docker containers, the operators or system administrators need in general root privileges to the hosts where they are deploying those applications or services. Furthermore, those are in many cases long running processes publicly exposed.

In the world of science and academia, the above mentioned benefits of containers are perfectly applicable and desirable for scientific applications, where for several reasons users often need highly customized environments; operating systems, specific versions of scientific libraries and compilers for any given application.

Traditionally, those applications are executed in computing clusters, either High Performance Computing (HPC) or High Throughput Computing (HTC), that are managed by a Workload Manager. The environment on those clusters is in general very homogeneous, static and conservative regarding operating system, libraries, compilers and even installed applications.

As such, the desire to run Docker containers in computing clusters, has arisen among the scientific community around the world but, the main issue to accomplish this has to do with the way Docker has to be deployed in the hosts, raising many questions regarding security, isolation between users and groups, accounting of resources, and control of resources by the Workload Manager among others.

To overcome the issues of executing Docker containers in such environments (clusters), several container framework solutions have been developed, among which is the udocker tool.

At the time of writing, udocker is the only tool capable of executing docker containers in userspace without requiring privileges for installation and execution, enabling execution in both older and newer Linux distributions. udocker can run on most systems with minimal Python versions 2.6 or 2.7 (support for Python3 is ongoing). Since it depends almost solely on the Python standard library, it can run on CentOS 6, CentOS 7 and Ubuntu 14.04 and later.

As such the main target of udocker are users (researchers) that want or need to execute applications packaged in Docker containers in “restricted” environments (clusters), where they don’t have root access and where support for execution of containers is lacking. udocker incorporates several execution methods thus enabling different approaches to execute the containers across hosts with different capabilities.

Furthermore, udocker implements a large set of commands (CLI) that are the same as Docker, easing the use of the tool to those that are already familiar with Docker. udocker interacts directly with the Dockerhub and run or instantiate images and containers in Docker format.

One of the features of the correct version is the ability to run CUDA applications in GPU hosts using official NVIDIA-CUDA Docker base images, through a simple options to set the execution mode to nvidia. Work is ongoing on the design to have MPI applications to be executed in the most easiest possible way.

udocker lives in this git repository, the Open Access paper can be found here. A slide presentation with details of the tool and a large set of examples can be found here

DEEPaaS training facility available in EOSC

Many research areas are being transformed by the adoption of machine learning and deep learning techniques. Research e-Infrastructures should not neglect this new trend, and develop services that allow scientists to employ these techniques, effectively exploiting existing computing and storage resources.

The DEEP-Hybrid-DataCloud (DEEP) is paving the path for this transformation, providing machine learning and deep learning practitioners with a set of tools that allow them to effectively exploit the existing compute and storage resources available through EU e-Infrastructures for the whole machine learning cycle. 

The DEEP solutions supporting machine learning and deep learning are now available in the European Open Science Cloud (EOSC) portal 

These solutions provide services that allow scientists to:

  • Build a model from scratch or use an existing one
  • Train, test and evaluate a model
  • Deploy and serve as a service (DEEPaaS)
  • Share and publish a model

The key technologies provided by the DEEP include:

  • Docker container based
  • Transparent GPU access
  • HPC integration
  • Serverless architectures
  • Transparent hybrid cloud deployments through PaaS layer
  • Marketplace containing existing models ready to use
  • Standard APIs for model training, testing and inference
  • Integration with EOSC data services