Leader: Mario Bochicchio (UniBA); Other collaborator(s): Giovanni Paragliola (ICAR-CNR)
In view of the new models adopted for decentralized clinical trials, we will explore new solutions based on privacy-preserving federated learning techniques and Generative Adversarial Networks to acquire patient data and process it on demand without violating the constraints imposed by the GDPR.
Brief description of the activities and of the intermediate results
Practices and tools adopted at the international/European/Italian level for collecting, sharing and processing clinical and wellness monitoring data were explored, with a focus on the new and promising class of privacy-friendly federated learning techniques. The analysis and testing of such techniques at different levels of granularity (individual subject/patient, physician, hospital, etc.) is aimed at the possible definition and validation of innovative and more flexible approaches for predictive health analysis.
A collaborative approach based on private data allows ML models to be trained without transferring the subjects' data to a central site for analysis: the decision model is trained while data privacy is preserved.
Several available frameworks based on the idea of remote procedure calls were investigated: Substra, Flower, FedML, OpenFL and others. Substra and OpenFL in particular have been used for biomedical applications.
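The idea can be illustrated with a minimal sketch, assuming federated averaging as the aggregation rule (the report does not specify one) and a toy linear regression model. The function names, the synthetic data and the hyperparameters are hypothetical and chosen only for illustration; none of them comes from the project or from the frameworks under study.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps of linear regression on one site's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
        w -= lr * grad
    return w

def federated_averaging(global_w, sites, rounds=20):
    """sites: list of (X, y) pairs held locally; only model weights are exchanged."""
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    for _ in range(rounds):
        # Each site trains locally, starting from the current global model.
        local_ws = [local_update(global_w, X, y) for X, y in sites]
        # The coordinator averages the weights, weighted by local sample count.
        global_w = np.average(local_ws, axis=0, weights=sizes)
    return global_w

def make_sites(n_sites=3, seed=0):
    """Synthetic stand-in for per-site data (assumption: same underlying model at every site)."""
    rng = np.random.default_rng(seed)
    true_w = np.array([2.0, -1.0])
    sites = []
    for k in range(n_sites):
        n = 50 + 25 * k  # sites hold different amounts of data
        X = rng.normal(size=(n, 2))
        y = X @ true_w + 0.01 * rng.normal(size=n)
        sites.append((X, y))
    return sites, true_w

sites, true_w = make_sites()
w = federated_averaging(np.zeros(2), sites)
print("estimated weights:", w, "reference weights:", true_w)
```

In this sketch the raw `(X, y)` pairs are read only inside `local_update`, which stands in for computation performed at each participating site; the coordinator sees only the weight vectors.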
From the initial analysis it emerges that Substra is potentially a good solution:
- It is proven in real production environments (e.g. the MELLODDY and HealthChain projects)
- It supports various FL scenarios, including horizontal FL, vertical FL and transfer FL
- It provides tools for data preprocessing, model evaluation, and secure aggregation of model updates
- It supports both PyTorch and TensorFlow and can be deployed on-premise or in the cloud
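As an illustration of what secure aggregation of model updates means in general, the following is a minimal sketch based on pairwise additive masking. It is not Substra's mechanism and makes no claim about how any of the frameworks above implement it; the function names are hypothetical. Real protocols derive the masks from pairwise shared keys and work in modular arithmetic, whereas this sketch uses a single seeded generator and floating-point numbers for brevity.

```python
import numpy as np

def pairwise_masks(n_clients, dim, seed=0):
    """For each pair (i, j) with i < j, draw one random mask: i adds it, j subtracts it."""
    rng = np.random.default_rng(seed)
    masks = [np.zeros(dim) for _ in range(n_clients)]
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            m = rng.normal(size=dim)
            masks[i] += m
            masks[j] -= m
    return masks  # the masks sum to zero across all clients

def secure_sum(updates, seed=0):
    """Return the masked updates (what the aggregator sees) and their sum."""
    masks = pairwise_masks(len(updates), updates[0].shape[0], seed)
    masked = [u + m for u, m in zip(updates, masks)]
    return masked, np.sum(masked, axis=0)

# Three hypothetical client updates.
updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
masked, total = secure_sum(updates)
print("sum of masked updates:", total)
print("plain sum:", np.sum(updates, axis=0))
```

Because the masks cancel pairwise, the aggregator recovers the sum of the updates while each individual masked update differs from the original one.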
Main policy, industrial and scientific implications
Federated learning could be implemented for robust privacy-preserving systems in health informatics.
Federated learning has been studied as an attractive solution to enable decentralized nodes to collectively train shared machine learning models without the need to transmit sensitive data to a central database.
In health informatics, the need for robust privacy-preserving mechanisms is critical, and it becomes particularly significant when dealing with predictive diagnosis and analysis in personalized medicine, precision medicine, risk stratification, and longitudinal monitoring. We explore the applications of federated learning frameworks in cloud-edge healthcare settings. We identify real-world settings to assess the benefits and challenges of personalized federated learning. The challenges include handling data imbalance, ensuring usability, promoting replicability, improving security, minimizing environmental impact (greenness), and optimizing overall efficiency.
In relation to multi-component interventions, we have initiated an analysis of Retrieval Augmented Generation (RAG) systems as components for the development of autonomous conversational agents to assist pre-frail and frail individuals for monitoring and cognitive stimulation purposes. The first results of the research were presented at the international workshop SYNERGY 2024.
- In collaboration with project partners, the study of the signal and data analysis techniques of interest has been deepened, focusing in particular on time series analysis using Convolutional Neural Networks. Special attention was given to the application of eXplainable AI (XAI) principles and techniques in the detection of arrhythmias in cardiology patients at high risk of heart failure. This activity is documented in a paper presented at the European Conference on Artificial Intelligence ECAI, Santiago de Compostela, October 19-24, 2024.
- In close collaboration with the Milano-Bicocca research group, the acquisition process for 250 Garmin Vivosmart 5 wearable devices was completed. These devices will be used to monitor pre-frail and frail patients in accordance with the protocol agreed upon with the clinical study project leaders. The acquisition procedures for other necessary devices and services to carry out the activities are ongoing.
- As a specific contribution to the project, the use of Privacy-Preserving Federated Learning (PPFL) techniques has been explored as a potential enabling element to achieve sufficient data quantities for the training of Machine Learning systems with certified quality, given that the data are related to patients under medical supervision. Specifically, we examined the main issues and challenges of PPFL techniques (heterogeneous use of monitoring techniques and devices, non-uniform distribution of cases among participating centers, computational and communication overhead, convergence problems, etc.). We also analyzed the main frameworks available for research, aiming to start an in-depth experimentation and comparison between Substra and Flower. This activity is documented in Bochicchio, M., Zeleke, S. N. (2024, April). Personalized Federated Learning in Edge-Cloud Continuum for Privacy-Preserving Health Informatics: Opportunities and challenges.
- Lastly, we completed the selection process for research fellow Amin Tuni Gure, who began service in October 2024. In addition, PhD student Sileshi Nibret Zeleke qualified, through an international selection process, for a semester-long research internship focused on XAI and PPFL topics at Penn State University, USA, in coordination with Professor Fenglong Ma. This activity is also valuable for the dissemination of the research results achieved and the creation of international collaborations.
1. In close collaboration with the Milano-Bicocca research group, we contributed to the operation of the following Research Units:
- Policlinico of Bari
- IRCCS INRCA UOC Geriatrics of Cosenza
- Hospital "Annunziata" of Cosenza - UOC Geriatrics of Cosenza
- Presidio Ospedaliero Pugliese Ciaccio - SOC Geriatrics of Catanzaro
- AOU Careggi - SOD Geriatrics for Complexity Care of Florence
providing them with 50, 37, 13, 13 and 13 Garmin devices, respectively, and actively participating in the development of operational procedures for device management and technical assistance to patients, mainly at the Policlinico of Bari center, during the daily synchronization of Garmin devices with the centralized platform.
Similar contacts were initiated for the transfer of an additional 50 Garmin devices to the center in Naples.
2. We continued research activities on privacy-preserving federated learning by investigating the use of the Substra and Flower experimental platforms in real-world clinical settings. This has enabled the establishment of fruitful research relationships with:
- the Data Science Lab (Prof. Fenglong Ma) of the College of Information Sciences and Technology at Pennsylvania State University, on the topics of eXplainable AI (XAI) applied to federated learning. A joint publication is being prepared on this topic.
- the company Asclepyus s.r.l., with which joint research is underway on aspects of privacy-preserving data discovery, which complement and complete the aspects of federated learning investigated by the research group at the University of Bari.
3. In relation to multicomponent interventions, we continued the survey on Retrieval Augmented Generation (RAG) systems and agents based on Large Multimodal Models as components for the realization of autonomous conversational systems to be paired with pre-frail and frail subjects for cognitive monitoring and stimulation purposes.
- Bochicchio, M. A., Gure, A. T., & Zeleke, S. N. (2025, April). Machine Learning and Artificial Intelligence at the Edge: Federated Learning for Colposcopy Image Analysis. In International Conference on Advanced Information Networking and Applications (pp. 262-272). Cham: Springer Nature Switzerland.
- Nibret Zeleke, S., Fentie Jember, A., & Bochicchio, M. (2025). Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic. arXiv e-prints, arXiv-2501.
- Bochicchio, M. A., Corciulo, S., Symbiosis and Synesthesia in Proactive Conversational Agents for Healthy Ageing, in Proceedings of the 1st International Workshop on Designing and Building Hybrid Human-AI Systems (SYNERGY 2024), Arenzano (Genoa), Italy, June 03, 2024, CEUR Workshop Proceedings. URL: https://ceur-ws.org/Vol-3701/paper10.pdf.
- Bochicchio, M., Zeleke, S.N. (2024). Personalized Federated Learning in Edge-Cloud Continuum for Privacy-Preserving Health Informatics: Opportunities and Challenges. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 203. Springer, Cham. https://doi.org/10.1007/978-3-031-57931-8_36
- Zeleke, S. N., Bochicchio, M. Towards Explainable Federated Learning in Healthcare: A Focus on Heart Arrhythmia Detection. Proceedings of the First Workshop on Explainable Artificial Intelligence for the Medical Domain (EXPLIMED 2024) co-located with 27th European Conference on Artificial Intelligence (ECAI 2024), Santiago de Compostela, Spain, October 20, 2024.
- S. N. Zeleke, A. Fentie Jember and M. Bochicchio, "Encrypted Malicious Network Traffic Detection: Leveraging Attention Mechanism and Markov Chain Sequencing," 2024 International Conference on Information and Communication Technology for Development for Africa (ICT4DA), Bahir Dar, Ethiopia, 2024, pp. 148-153, doi: 10.1109/ICT4DA62874.2024.10777138.
- S. N. Zeleke and M. Bochicchio, "Federated Kolmogorov-Arnold Networks for Health Data Analysis: A Study Using ECG Signal," 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 2024, pp. 8070-8077, doi: 10.1109/BigData62323.2024.10825188.