MORE4AIO
Threat Monitoring And Remediation for Artificial Intelligence Systems in Operation
Researchers
- de Fuentes, José María (PI)
- González-Manzano, Lorena (PI)
- Anciaux, Nicolas
- García-Alfaro, Joaquín
- González-Tablas, Ana Isabel
- Pastrana, Sergio
- Tejerina, Ofelia
This project has been provisionally accepted for funding within the «Proyectos de Generación de Conocimiento» call.
Summary
AI is used for assorted purposes such as user recommendations or fraud detection, yet the problems arising from its use are now regarded as a major threat to society, as reflected in recent EU regulation. In this regard, cybersecurity in the use of AI demands attention, especially during systems' operation: it is the last step of the development lifecycle and therefore the point at which outputs are delivered to users. The starting hypothesis can be stated as follows: is it possible to effectively monitor artificial intelligence systems and apply remediations while they are in operation? Remarkably, "designing methods that constantly monitor if a deployed model is under attack during operation, enabling prompt reaction when needed" has been pointed out as a challenging matter by Maiorca et al.
There are two main challenges in dealing with systems in operation. On the one hand, developing real-time attack countermeasures that account for the complexity and diversity of novel AI models. On the other hand, addressing the difficulty of explainability, not only at a qualitative level, but by developing metrics that allow its measurement and mitigation strategies that preserve the robustness of explanations when models are under attack. To this end, a methodology to assess the quality of explanations, multiple mitigation mechanisms, and experimental datasets will be developed and released to help developers and researchers strengthen the last step of the development workflow. In sum, the pursued goals involve managing user abuses of AI systems and attaining robust, explainable outputs.
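The project will define its own monitoring methods; purely as an illustration of the kind of operation-time monitoring the summary refers to, the following minimal Python sketch flags drift in a deployed classifier's output confidences, one possible signal that a model is under attack. All names, window sizes, and thresholds here are hypothetical, not part of MORE4AIO.

```python
import math
from collections import deque

class ConfidenceDriftMonitor:
    """Illustrative sketch: flag a deployed classifier whose output-confidence
    distribution drifts from a benign baseline, a possible sign of abuse.
    Thresholds and window sizes are hypothetical, chosen for the example."""

    def __init__(self, window_size: int = 500, z_threshold: float = 3.0):
        self.window = deque(maxlen=window_size)
        self.z_threshold = z_threshold
        self.ref_mean = None  # set during a trusted calibration phase
        self.ref_std = None

    def calibrate(self, confidences: list[float]) -> None:
        """Record confidence statistics observed under benign traffic."""
        n = len(confidences)
        self.ref_mean = sum(confidences) / n
        var = sum((c - self.ref_mean) ** 2 for c in confidences) / n
        self.ref_std = max(math.sqrt(var), 1e-6)

    def observe(self, confidence: float) -> bool:
        """Add one prediction confidence; return True when the recent window
        deviates enough from the baseline to warrant human review."""
        self.window.append(confidence)
        if self.ref_mean is None or len(self.window) < self.window.maxlen:
            return False
        window_mean = sum(self.window) / len(self.window)
        # z-score of the window mean against the benign baseline
        z = abs(window_mean - self.ref_mean) / (
            self.ref_std / math.sqrt(len(self.window))
        )
        return z > self.z_threshold
```

In use, `calibrate` would be fed confidence scores (e.g., the top softmax probability per prediction) gathered under known-benign traffic, and `observe` would be called once per serving request, triggering an alert and a remediation workflow when it returns True.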
Goals
MORE4AIO has three general objectives:
GO1 Identify, measure, and characterize the severity of threats to state-of-the-art AI models in operation in order to enhance the monitoring step, including their explanations and interactions with their users.
GO2 Develop methods and tools for mitigating user abuses of AI models in operation.
GO3 Develop methods and tools for mitigating explainability threats in AI models in operation.