PRR Project
Assistant Researcher in Computer Science (Hyperscale Systems)
Project sheet
Name: Assistant Researcher in Computer Science (Hyperscale Systems)
Total project amount: 81,29 thousand €
Amount paid: 81,29 thousand €
Non-refundable funding: 81,29 thousand €
Loan funding: 0 €
Start date: 01.06.2025
Expected end date: 31.03.2026
Dimension: Resilience
Component: Qualifications and Skills
Investment: Science Plus Training
Operation code: 02/C06-i06/2024.P2023.14760.TENURE.011
Summary
Artificial Intelligence and Big Data Analytics are reshaping the future of our society. Extracting useful and insightful information from data is key to the competitiveness and efficiency of critical sectors such as Health, Finance, e-Governance, and Science. However, as the amount of data generated worldwide grows, so does the complexity of analyzing it, which requires specialized hardware, hundreds of servers, and a large capacity to store both the information being processed and the outputs being generated. A good example is the current trend of Large Language Models (LLMs), similar to those behind ChatGPT, which require petabytes of data and computing power from hundreds to thousands of servers to be effective.
This is where High-Performance Computing (HPC) centers play a crucial role, by providing high-end hardware (GPUs, TPUs) and large-scale resources (e.g., on the order of tens of thousands of computing nodes and several petabytes of storage capacity). With such resources, these hyperscale infrastructures can accommodate demanding workloads from multiple users simultaneously. However, as infrastructures get larger and applications more demanding, one faces two main challenges:
- Finding an optimal approach for managing the available computing and storage resources becomes increasingly difficult. Several workloads from different users run at the same time and compete for HPC resources. This can easily lead to performance bottlenecks in which the whole HPC cluster becomes unavailable. For instance, this is happening in large supercomputers at TACC (USA) and AIST (Japan), where bursts of data accesses lead to severe contention at shared storage resources and, eventually, to service downtime.
- Large clusters and demanding applications (e.g., those requiring GPUs) are known for consuming significant energy. Although performance is a key concern of HPC centers, one cannot forget the need to lower our carbon footprint and adopt more sustainable solutions. Therefore, the management of computing and storage resources cannot be approached only from the point of view of performance and scale; it must also have energy efficiency in mind.
These two research challenges justify the proposed position, which will focus on advancing science and technology around two main research questions: i) how can one make hyperscale infrastructures more efficient, in terms of performance and scale, when handling today's demanding workloads that require high computational power and large storage capacity?; ii) how can one achieve performance and scalability while still guaranteeing sustainability and energy efficiency for these large-scale infrastructures?
Our research and development in this area directly contributes to Sustainable Development Goals 7, 9, and 12 of the United Nations 2030 Agenda, in the context of projects such as Sustainable HPC - Highly Sustainable Performance Computing (FAI/FEE), Epicure (DIGITAL-EUROHPC-JU-2022-APPSUPPORT-01) and HANAMI (EU-Japan Partnership). In the scope of these projects, the Informatics Department and INESC TEC have hosted a number of young researchers and teaching assistants who are key to their success and would be eligible for an FCT Tenure position.
In this context, the required profile is as follows:
- A Ph.D. degree in Computer Science, advanced computing systems, or a related field, obtained less than 10 years ago;
- Demonstrated ability to make original contributions to the state of the art in large-scale/hyperscale systems and infrastructures, in particular on computational and storage resources management and efficiency;
- Experience in leading or participating in research teams, in particular in the context of collaborative research projects and applications to project funding entities;
- Experience in teaching and supervising students in topics related to distributed infrastructures (e.g., Cloud Computing, HPC) and large-scale systems;
- Experience in development and technology transfer activities towards achieving wider economic and societal impact.
The prospective employee is expected to be assigned the following tasks:
- contribution to research on aspects of computational and storage resources management and efficiency within the scope of distributed infrastructures (e.g., Cloud Computing, HPC) and large-scale systems fundamentals, techniques, and tools;
- promotion of novel research and development opportunities that advance the state of the art and realize its impact in industry and society;
- supervision of junior researchers on topics related to computational and storage resources management and efficiency;
- development of topics related to computational and storage resources management and efficiency in the teaching curricula of distributed infrastructures (e.g., Cloud Computing, HPC) and large-scale systems courses at bachelor and master levels.
Beneficiaries
There are two types of beneficiaries:
- Direct Beneficiaries are those whose funding and projects to implement are part of the Recovery and Resilience Plan that has been negotiated and approved by the European Union;
- Final Beneficiaries are those whose funding and projects to implement are approved following a selection process through Calls for Applications.
Call for applications
As part of the Call for Applications, submissions are requested to select the projects and final beneficiaries to whom funding will be awarded. Specific selection criteria are defined for each call, which must be reflected in the applications submitted and assessed.
The project is appraised on the basis of its compliance with the selection criteria laid down in the calls for applications, and a final score may be awarded, where applicable.
Final evaluation score
The components for calculating the assessment score can be found in the selection criteria document mentioned below.
Selection criteria
Beneficiaries
Intermediate beneficiaries
Procurement
Beneficiaries representing public entities implement their project by signing one or more contracts with suppliers for goods or services through public procurement procedures.
To ensure and provide the utmost transparency in all these contracts, a list of the contracts that were signed under this project is available here, along with the information available on the Base.Gov platform. Please note that, according to the legislation in force at the time the contract was signed, some exceptions do not require the publication of the contracts signed on this platform, and, therefore, no information is available in such cases.
Geographic distribution
Total amount of the project: 81,29 thousand €
Percentage of the amount already paid for implementing projects: 100 %
Where was the money spent
By county
1 county financed.
- Porto: 81,29 thousand €