We have a vision
The key concept proposed in the DEEP Hybrid DataCloud project is the need to support intensive computing techniques that require specialized HPC hardware, like GPUs or low-latency interconnects, to explore very large datasets. A Hybrid Cloud approach gives researchers access to such resources, which are not easily reachable at the scale needed in the current EU e-infrastructure.
DEEP as a Service
We also propose to deploy under the common label of “DEEP as a Service” a set of building blocks that enable the easy development of applications requiring these techniques: deep learning using neural networks, parallel post-processing of very large data, and analysis of massive online data streams.
These services will be deployed in the project testbed, offered to the research communities linked to the project through pilot applications, and integrated under the EOSC framework, where they can be further scaled up in the future.
We propose a methodology based on three research activities addressing the three required cloud layers: bare metal at IaaS, hybrid cloud at PaaS, and DEEP as a Service. A networking activity will focus on the pilot applications using three intensive computing techniques, with highly relevant impact in different research areas. A service activity will provide a testbed with significant HPC resources, including latest-generation GPUs, to evaluate the performance and scalability of the solutions. A DevOps approach will be implemented to provide the chain that ensures the quality of the software and services released, and this chain will also be offered to the developers of research applications.
The project will bring existing services and technologies, currently at TRL6+, up to TRL8. These include relevant contributions to the EOSC by the INDIGO-DataCloud H2020 project, which the project will enrich with new functionalities already available as prototypes, notably support for GPUs and low-latency interconnects.
The project will make a very significant contribution to the exploitation of very large data by EU research teams, both through the use of existing HPC resources under a Hybrid Cloud approach and by promoting a new generation of e-infrastructures able to integrate and offer these compute-intensive techniques as a service, in both academic and industrial contexts.
By means of the “DEEP as a Service” solution, the project aims to lower the access barrier for scientists. Three pilot applications in Biology, Physics and Network Security are proposed, and further pilots for dissemination into other areas like Medicine, Earth Observation, Astrophysics, and Citizen Science will be supported in the testbed.
It is essential that researchers in Europe be able to use the latest generation of intensive computing techniques, together with first-class support on the e-infrastructures available at EU level. Only a joint strategy addressing both issues will enable them to enter the very dynamic worldwide competition for innovation and the production of new knowledge.
The project proposes clear dissemination and exploitation plans addressing five different target audiences: e-infrastructure providers, technical and research communities, education and citizen science. Different actions are linked to the EOSC, including training and participation in thematic conferences and major events.
The work plan is structured in six phases over 30 months, with relevant milestones defined to ensure the connection between the work packages. A light but engaging management scheme is proposed, based on the experience gained in previous projects. The project will deliver open source solutions and publish under an open access scheme. The total estimated project effort is equivalent to 14 FTE (Full Time Equivalent), which translates to a total budget close to 3M Euros.
The project will benefit from the experience and know-how of a well-balanced set of partners from leading research centres in Europe that have successfully collaborated in the H2020 INDIGO-DataCloud and EGI-Engage projects. They include major players in the e-infrastructure area in the EU, with strong expertise in Cloud middleware and direct engagement with researchers in the development and support of applications in different research areas. All of them have identified the area of “DEEP services” based on intensive computing techniques as a strategic one for the coming years, and the Hybrid Cloud approach as the path to follow.
The experience of the consortium partners in previous projects, and the relevance of the research teams involved, are the best guarantee for a successful implementation of the project.