DEEP Hybrid DataCloud paper accepted at the 22nd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES2018), Sep 3-5, 2018, Belgrade, Serbia

We are proud to announce that a research paper developed under the DEEP Hybrid DataCloud project has been accepted for inclusion in the 22nd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES2018), to be held on 3-5 September 2018 in Belgrade, Serbia. The paper will be published online by Elsevier in the open-access Procedia Computer Science series.

Title: A multivariate fuzzy time series resource forecast model for clouds using LSTM and data correlation analysis

Authors: Nhuan Tran (a), Thang Nguyen (a), Binh Minh Nguyen (a), Giang Nguyen (b)

(a) School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam

(b) Institute of Informatics, Slovak Academy of Sciences, Bratislava, Slovakia

Abstract

Today, almost all clouds offer only auto-scaling functions based on resource usage thresholds defined by users. Meanwhile, prediction-based auto-scaling for clouds still suffers from inaccurate forecasts in practical operation, even though existing functions deal only with univariate monitoring data. Up until now, there have been very few efforts to process multiple metrics simultaneously in order to predict resource utilization. The motivation for this multivariate processing is that there may be correlations among metrics, and these have to be examined in order to increase the model's applicability in practice. In this paper, we build a novel forecast model for proactive cloud auto-scaling systems by combining several mechanisms. In the data preprocessing phase, we exploit a fuzzification technique to reduce the fluctuation of monitoring data. We then evaluate the correlations between different metrics to select suitable data types as inputs for the prediction model. Finally, a long short-term memory (LSTM) neural network is employed to predict resource consumption from the multivariate time series data. Our model is therefore called multivariate fuzzy LSTM (MF-LSTM). The proposed system is tested on Google trace data to demonstrate its efficiency and feasibility when applied to clouds.
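To make the pipeline described in the abstract more concrete, the sketch below illustrates its three stages in Python: fuzzifying the monitoring data, selecting correlated metrics as model inputs, and training a multivariate LSTM forecaster. This is only an illustrative sketch written for this post, not the authors' MF-LSTM implementation; the synthetic metrics, interval count, correlation threshold, window size and network layout are all assumptions made for the example.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def fuzzify(series, n_intervals=20):
    # Crude stand-in for the fuzzification step: map each value to the
    # midpoint of its interval to smooth short-term fluctuations.
    lo, hi = series.min(), series.max()
    edges = np.linspace(lo, hi, n_intervals + 1)
    idx = np.clip(np.digitize(series, edges) - 1, 0, n_intervals - 1)
    return ((edges[:-1] + edges[1:]) / 2)[idx]

def select_correlated(metrics, target, threshold=0.5):
    # Keep only the metrics whose absolute Pearson correlation with the
    # target (e.g. CPU usage) exceeds the threshold.
    return [name for name, values in metrics.items()
            if abs(np.corrcoef(values, target)[0, 1]) >= threshold]

def make_windows(data, window=30):
    # Turn a (T, n_features) array into supervised samples that predict
    # the first feature one step ahead.
    X, y = [], []
    for t in range(len(data) - window):
        X.append(data[t:t + window])
        y.append(data[t + window, 0])
    return np.asarray(X), np.asarray(y)

# Toy example with synthetic monitoring data (hypothetical, for illustration).
rng = np.random.default_rng(0)
T = 2000
cpu = np.sin(np.linspace(0, 60, T)) + 0.1 * rng.standard_normal(T)
mem = 0.8 * cpu + 0.1 * rng.standard_normal(T)   # correlated with CPU
disk = rng.standard_normal(T)                    # uncorrelated noise

target = fuzzify(cpu)
metrics = {"mem": fuzzify(mem), "disk": fuzzify(disk)}
selected = select_correlated(metrics, target)    # -> ["mem"]

features = np.column_stack([target] + [metrics[m] for m in selected])
X, y = make_windows(features, window=30)

model = Sequential([LSTM(32, input_shape=X.shape[1:]), Dense(1)])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)

latest = features[-30:][np.newaxis]              # most recent window
print("one-step-ahead CPU forecast:", float(model.predict(latest, verbose=0)[0, 0]))

In the paper itself the fuzzification and correlation analysis are more elaborate; this snippet only conveys the overall data flow from raw multivariate monitoring metrics to a one-step-ahead forecast.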

Digital Infrastructures for Research 2018 (DI4R) and IBERGRID 2018

The third edition of the annual Digital Infrastructures for Research (DI4R) conference was held this year at the University Institute of Lisbon (ISCTE-IUL), from October 9th to 11th. Co-located with the event was IBERGRID 2018, the 9th Iberian Grid and Cloud conference, which aims to advance towards the European Open Science Cloud.

During the first day of the event, our networking poster was presented by Zdeněk Šustr (CESNET) during the innovative zapping session: 40 minutes for 40 posters.

The second day contained several DEEP-related slots. The DEEP project organized, jointly with colleagues from eXtreme-DataCloud, a world café session entitled "Open, Effective and Innovative tools to support researchers in Worldwide Infrastructures", focused on how we are trying to deliver tools to support researchers in the EOSC. During that session, user communities took the floor to explain what they expect from projects such as DEEP and XDC. Afterwards, the two project coordinators gave an overview of the technology from both projects, followed by three different demonstrations of how the technology actually works.


After this session, several networking discussions took place about potential future collaborations with ongoing projects. In parallel, a general overview of DEEP and the services to be delivered to users was presented within the computing services session. Later that day, a more technical presentation covered the composition and deployment of complex container-based application architectures on multi-clouds.

On October 11th and 12th, the 9th IBERGRID conference took place, paving the road towards the EOSC. DEEP partners were also present at this conference, as several key project partners are members of IBERGRID.

Report on the workshop: New challenges in Data Science: Big Data and Deep Learning on Data Clouds in Santander (Spain)

The Spanish National Research Council (CSIC), as coordinator of the DEEP-Hybrid-DataCloud project, organized together with the eXtreme-DataCloud project the "New challenges in Data Science" workshop, in the context of the Advanced Summer Courses offered by the prestigious Universidad Internacional Menéndez Pelayo (UIMP). The workshop took place from June 18th to 22nd in Santander, at the Palacio de la Magdalena UIMP venue, with the participation of more than 20 experts and students from all around Europe.

The objective of the course was to review and discuss current research trends and European initiatives regarding infrastructure support for compute-intensive data analytics over massive amounts of data, with special emphasis on Deep Learning, on top of High Performance Computing (HPC) and hybrid Cloud platforms.

The first half of the course started with an introductory session by Fernando Aguilar (CSIC), entitled "Understanding Researchers Requirements: a Data Science perspective", which served to frame the discussion that took place over the following two days. Scientists from the Italian National Institute for Nuclear Physics (INFN), the German Electron Synchrotron (DESY), the Annecy-le-Vieux Particle Physics Laboratory (LAPP, France), CSIC and the European Clinical Research Infrastructure Network (ECRIN) analysed different use cases in several areas (such as Astrophysics and Particle Physics, Bioinformatics and Biodiversity) with the objective of understanding the present and future challenges to be tackled in these scientific areas over the coming years. This first part of the course concluded with a joint wrap-up and conclusions session by Daniele Cesini and Alessandro Costantini, both from INFN and coordinators of the eXtreme-DataCloud project.

Wednesday 20th started with an introduction to the European Open Science Cloud (EOSC) given by Isabel Campos Plasencia (CSIC, member of the European High Level Expert Group for the EOSC), followed by a presentation from Giacinto Donvito (INFN) about the catalogue of services and rules of engagement of EOSC-hub (an EU project implementing a service hub for the EOSC). The morning session concluded with Pablo Orviz (CSIC) presenting software quality procedures and trends in the EOSC.

The last part of the course started on Wednesday afternoon and focused on the description of practical deployments and implementations of the tools required to perform the aforementioned massive data processing on top of cloud infrastructures. Wolfgang zu Castell, from the Helmholtz Zentrum München, focused on current deep learning techniques and on the gaps that current e-Infrastructures must close for these techniques to be effectively exploited.

Thursday 21st started with the DEEP-Hybrid-DataCloud approach to deploying advanced services over hybrid clouds. First, the project architecture was described and demonstrated by Álvaro López (CSIC, project co-coordinator), followed by Andy S. Alic (Universitat Politècnica de València, UPV), who described how complex applications and services can be graphically composed in order to be deployed over hybrid clouds. The day concluded with an overview of the High Performance Computing landscape and how to effectively exploit these services using advanced computing techniques such as containerization, in a session held by Jorge Gomes from the Portuguese Laboratory of Instrumentation and Experimental Particle Physics (LIP).

The workshop concluded on Friday with two general sessions. First, a debate around data science ethics, guided by Steve Canham from ECRIN, took place; participants raised interesting ethical questions regarding data, data science, privacy and security. The final wrap-up and conclusions session was led by Jesús Marco, coordinator of the DEEP-Hybrid-DataCloud project and Director of the course, setting the basis for future work in the data science area. The official closing ceremony of the event was conducted by Miguel Ángel Casermeiro, General Secretary of the UIMP.

D6.2 – Design for the DEEP as a Service solution

This document describes the design, architecture and work plan of DEEP-Hybrid-DataCloud Work Package 6 (WP6) towards providing the DEEP as a Service solution. It gives an overview of the state of the art of the relevant components and technologies, a technology readiness level assessment with regard to the required functionality, the required interactions with other work packages in the project, and a detailed work plan and risk assessment for each of the activities.


http://digital.csic.es/handle/10261/164314


D4.1 – Assessment of available technologies for supporting accelerators and HPC, initial design and implementation plan

This document describes the state of the art of technologies for supporting bare metal, accelerators and HPC in the cloud and proposes an initial implementation plan. Available technologies are analyzed from different points of view: stand-alone use, integration with cloud middleware, and support for accelerators and HPC platforms. Based on the results of these analyses, an initial implementation plan is proposed, containing information on which features should be developed and which components should be improved in the next period of the project.


http://digital.csic.es/handle/10261/164313


D2.1 – Initial Plan for Use Cases

This report summarises the work of WP2 on the initial plan for the selection of Use Cases, providing a key input to the design of the DEEP-Hybrid-DataCloud testbed and laying out how the DEEP-Hybrid-DataCloud solutions will be used. The document includes a description of the Research Communities involved and of the proposed use cases, covering Data Management and Computationally Intensive issues. Based on the inputs provided by the WP2 partners, an initial list of requirements for each application has been collected, which has in turn been used to assemble a list of common requirements provided as input to the discussion with the technical work packages of the DEEP-Hybrid-DataCloud project.


http://digital.csic.es/handle/10261/164311


New challenges in Data Science: Big Data and Deep Learning on Data Clouds in Santander (Spain) on 18-22 June 2018

The DEEP Hybrid DataCloud consortium is proud to announce the upcoming seminar, which will be directed by Dr. Jesús Marco de Lucas and assisted by Dr. Álvaro López, both Project Coordinators of the DEEP Hybrid DataCloud project.

This seminar will be hosted by UIMP within its Summer Advanced Courses Program and organized in collaboration with IFCA, CSIC and DEEP Hybrid DataCloud. It is targeted at specialists and students at different academic levels (master and graduate students, PhD candidates, postdoctoral researchers and senior scientists) interested in current research trends regarding compute-intensive data analytics over massive amounts of data, with special emphasis on Deep Learning, on top of High Performance Computing (HPC) and hybrid Cloud platforms.

In addition to the seminar's official agenda, the DEEP-Hybrid-DataCloud and XDC (eXtreme-DataCloud) projects will carry out several parallel work sessions related to the projects' ongoing tasks.

More information about the seminar can be found here and the detailed agenda for both events (summer course and project meetings) here.

Sponsored by: Advanced Computing and e-Science group; IFCA (CSIC-UC), Deep-Hybrid-DataCloud project (CSIC, Atos)
Course code: 63zh – Face C–1 ECTS
June 18th to 22nd 2018

DEEP Hybrid DataCloud participation in the EOSC-hub Week in Malaga (Spain) on 16-20 April 2018

The first EOSC-hub Week took place on 16-20 April 2018 in Málaga, Spain. The week revolved around two major events: the public days and an EOSC-hub "all hands" meeting open only to EOSC-hub partners.

The public days, sponsored by the EGI Foundation and the XDC project, welcomed the participation of service providers, representatives of the research communities and policy makers engaged in the establishment of the European Open Science Cloud (EOSC). Interesting presentations took place, such as the one from Augusto Burgueño, from the European Commission's Directorate-General for Communications Networks, Content and Technology (DG CONNECT), who presented the "Implementation Roadmap for the European Open Science Cloud", or the one from Isabel Campos, who gave the vision of the High Level Expert Group (HLEG) for the EOSC.

EOSC action lines as presented by Augusto Burgueño

We find this HLEG interim report especially relevant, as some of the work areas that the expert group considers key priorities for making the EOSC a viable ecosystem are among our own work priorities. To cite just one example, we believe that delivering quality services is key to a viable EOSC ecosystem, which is one of the reasons we have already elaborated a set of common software quality assurance baseline criteria together with the XDC and INDIGO-DataCloud projects.

HLEG for the EOSC vision on Incentives for Software Development

Our project was present on the second day during the "Data & Compute: Joint XDC-EUDAT-DEEP and eINFRA-21 initiatives" session. The DEEP-Hybrid-DataCloud project coordinator, Álvaro López García, presented the current project status and next steps in this joint session, which also included a panel discussion about possible synergies and collaboration between all these projects and initiatives.