Leader: Mario Bochicchio (UniBA); Other collaborator(s): Giovanni Paragliola (ICAR-CNR)
In light of the new models adopted for decentralized clinical trials, we will explore new solutions based on privacy-preserving federated learning techniques and Generative Adversarial Networks to acquire patient data and process it on demand without violating the constraints imposed by the GDPR.
Brief description of the activities and of the intermediate results
Practices and tools adopted at the international/European/Italian level for collecting, sharing and processing clinical and wellness monitoring data were explored, with a focus on the new and promising class of privacy-friendly federated learning techniques. The analysis and testing of such techniques at different levels of granularity (individual subject/patient, physician, hospital, etc.) is aimed at the possible definition and validation of innovative and more flexible approaches for predictive health analysis.
A collaborative approach based on private data allows ML models to be trained without transferring the subjects' data to a central site for analysis. The decision model can thus be trained while preserving data privacy.
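The idea can be illustrated with a minimal sketch of federated averaging (FedAvg) in plain Python. It is not the project's implementation and does not use the API of any of the frameworks discussed here: the linear model, the synthetic site data, the learning rate and all function names are assumptions chosen only for illustration. Each simulated site trains on its own records and shares only model parameters, which the coordinator averages in proportion to the site's sample count.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent for y ~ w*x + b on one site's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def fed_avg(updates, sizes):
    """Average the site models, weighting each by its number of local samples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def make_site(n, seed):
    """Synthetic stand-in for one site's private dataset (true relation: y = 2x + 1)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.05)) for x in xs]


# Three hypothetical sites with different amounts of data.
sites = [make_site(40, 1), make_site(25, 2), make_site(60, 3)]

global_model = (0.0, 0.0)
for _ in range(50):
    # Raw records never leave a site; only (w, b) is sent to the coordinator.
    updates = [local_update(global_model, d) for d in sites]
    global_model = fed_avg(updates, [len(d) for d in sites])

print("global model (w, b):", global_model)
```

In this toy setting the global parameters approach the values used to generate the data, although the coordinator never sees any individual record.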
Several available frameworks based on the idea of remote procedure calls were investigated: Substra, Flower, FedML, OpenFL and others. Substra and OpenFL in particular have been used for biomedical applications.
From the initial analysis it emerges that Substra is potentially a good solution:
- Substra has been proven in real production environments (e.g. the MELLODDY and HealthChain projects)
- It supports various FL scenarios, including horizontal FL, vertical FL and transfer FL
- It provides tools for data preprocessing, model evaluation, and secure aggregation of model updates
- It supports both PyTorch and TensorFlow and can be deployed on premises or in the cloud
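As an illustration of what secure aggregation of model updates means in general, the sketch below shows pairwise additive masking, one well-known approach. It is not Substra's mechanism or API. The fixed-point scale, the modulus, the example updates and all function names are assumptions, and a single seeded generator stands in for the pairwise key agreement that a real protocol would need. Each client adds masks that cancel in the sum, so the aggregator recovers the total without seeing any individual update.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value (assumption)
SCALE = 10 ** 4    # fixed-point scale used to encode floats as integers (assumption)


def encode(update):
    """Map floats to integers modulo MODULUS."""
    return [int(round(x * SCALE)) % MODULUS for x in update]


def decode(values):
    """Map integers modulo MODULUS back to signed floats."""
    out = []
    for v in values:
        if v >= MODULUS // 2:
            v -= MODULUS
        out.append(v / SCALE)
    return out


def pairwise_masks(n_clients, dim, seed=0):
    """For each pair (i, j), client i adds a random mask and client j subtracts it.

    A real protocol derives each shared mask from a secret agreed between the
    two clients; the single seeded generator here only simulates that step.
    """
    rng = random.Random(seed)
    masks = [[0] * dim for _ in range(n_clients)]
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            shared = [rng.randrange(MODULUS) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] = (masks[i][k] + shared[k]) % MODULUS
                masks[j][k] = (masks[j][k] - shared[k]) % MODULUS
    return masks


def mask_update(update, mask):
    """What a client actually sends: its encoded update plus its mask."""
    return [(e + m) % MODULUS for e, m in zip(encode(update), mask)]


def aggregate(masked_updates):
    """Sum the masked updates; the pairwise masks cancel out in the total."""
    dim = len(masked_updates[0])
    total = [sum(u[k] for u in masked_updates) % MODULUS for k in range(dim)]
    return decode(total)


# Hypothetical model updates from three clients.
updates = [[0.10, -0.20], [0.30, 0.40], [-0.05, 0.25]]
masks = pairwise_masks(len(updates), dim=2)
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
summed = aggregate(masked)
print("aggregated update:", summed)
```

The sketch omits client dropout handling and key exchange, which any production scheme has to address.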
Main policy, industrial and scientific implications
Federated learning could be implemented for robust privacy-preserving systems in health informatics.
Federated learning has been studied as an attractive solution to enable decentralized nodes to collectively train shared machine learning models without the need to transmit sensitive data to a central database.
In health informatics, the need for robust privacy-preserving mechanisms is critical, and it becomes particularly significant when dealing with predictive diagnosis and analysis in personalized medicine, precision medicine, risk stratification, and longitudinal monitoring. We explore the applications of federated learning frameworks in the context of cloud-edge in healthcare. We identify real-world settings to assess the benefits and challenges of personalized federated learning. These include issues of data imbalance, usability, promoting replicability, improving security, minimizing environmental impact (greenness), and optimizing overall efficiency.
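The data imbalance issue mentioned above is often studied by simulating label-skewed (non-IID) partitions of a dataset across centers. The following is a minimal sketch of one common way to do this, using per-class Dirichlet proportions. The class names, sample counts, number of centers and concentration parameter alpha are assumptions and do not describe the project's data. Smaller alpha values give more uneven class distributions among centers.

```python
import random
from collections import Counter


def dirichlet(alpha, k, rng):
    """Sample k proportions from a symmetric Dirichlet(alpha) via gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def partition_by_label(labels, n_centers, alpha, seed=0):
    """Split sample indices among centers with per-class Dirichlet proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    centers = [[] for _ in range(n_centers)]
    for _, idxs in sorted(by_class.items()):
        rng.shuffle(idxs)
        shares = dirichlet(alpha, n_centers, rng)
        start, cumulative = 0, 0.0
        for c, share in enumerate(shares):
            cumulative += share
            # The last center takes the remainder so that every index is assigned.
            end = len(idxs) if c == n_centers - 1 else int(round(cumulative * len(idxs)))
            centers[c].extend(idxs[start:end])
            start = end
    return centers


# Hypothetical dataset: 300 "normal" and 100 "abnormal" recordings.
labels = ["normal"] * 300 + ["abnormal"] * 100
centers = partition_by_label(labels, n_centers=4, alpha=0.3)

for c, idxs in enumerate(centers):
    print(f"center {c}:", dict(Counter(labels[i] for i in idxs)))
```

Running federated algorithms on partitions generated with different alpha values is one way to check how sensitive they are to uneven case distributions before real multi-center data become available.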
In relation to multi-component interventions, we have initiated an analysis of Retrieval Augmented Generation (RAG) systems as components for the development of autonomous conversational agents to assist pre-frail and frail individuals for monitoring and cognitive stimulation purposes. The first results of the research were published at the international workshop SYNERGY 2024.
- In collaboration with project partners, the study of relevant signal and data analysis techniques has been deepened, focusing in particular on time series analysis using Convolutional Neural Networks. Special attention was given to the application of eXplainable AI (XAI) principles and techniques in the detection of arrhythmias in cardiology patients at high risk of heart failure. This activity is documented in a paper presented at the European Conference on Artificial Intelligence (ECAI), Santiago de Compostela, October 19-24, 2024.
- In close collaboration with the Milano-Bicocca research group, the acquisition process for 250 Garmin Vivosmart 5 wearable devices was completed. These devices will be used to monitor pre-frail and frail patients in accordance with the protocol agreed upon with the clinical study project leaders. The acquisition procedures for other necessary devices and services to carry out the activities are ongoing.
- As a specific contribution to the project, the use of Privacy-Preserving Federated Learning (PPFL) techniques has been explored as a potential enabling element to achieve sufficient data quantities for the training of Machine Learning systems with certified quality, given that the data are related to patients under medical supervision. Specifically, we examined the main issues and challenges of PPFL techniques (heterogeneous use of monitoring techniques and devices, non-uniform distribution of cases among participating centers, computational and communication overhead, convergence problems, etc.). We also analyzed the main frameworks available for research, with the aim of starting an in-depth experimentation and comparison between Substra and Flower. This activity is documented in Bochicchio, M., Zeleke, S. N. (2024, April). Personalized Federated Learning in Edge-Cloud Continuum for Privacy-Preserving Health Informatics: Opportunities and challenges.
- Lastly, we completed the selection process for research fellow Amin Tuni Gure, who began service in October 2024, and, through an international selection process, PhD student Sileshi Nibret Zeleke qualified to participate in a semester-long research internship focused on XAI and PPFL topics at Penn State University, USA, in coordination with Professor Fenglong Ma. This activity is also valuable for the dissemination of the research results achieved and the creation of international collaborations.
1. In close collaboration with the Milano-Bicocca research group, we contributed to the operation of the following Research Units:
- Policlinico of Bari
- IRCCS INRCA UOC Geriatrics of Cosenza
- Hospital "Annunziata" of Cosenza - UOC Geriatrics of Cosenza
- Presidio Ospedaliero Pugliese Ciaccio - SOC Geriatrics of Catanzaro
- AOU Careggi - SOD Geriatrics for Complexity Care of Florence
providing them with 50, 37, 13, 13 and 13 Garmin devices, respectively, and actively participating in the development of operational procedures for the management of devices and technical assistance to patients, mainly at the Policlinico of Bari center, during the daily synchronization of Garmin devices with the centralized platform.
Similar contacts were initiated for the transfer of an additional 50 Garmin devices to the center in Naples.
2. We continued research activities on privacy-preserving federated learning by investigating the use of the Substra and Flower experimental platforms in real-world clinical settings. This has enabled the establishment of fruitful research relationships with:
- the Data Science Lab (Prof. Fenglong Ma) of the College of Information Science and Technology, at Pennsylvania State University, on the topics of eXplainable AI (XAI) applied to federated learning. A joint publication is being prepared on this topic.
- the company Asclepyus s.r.l., with which joint research is underway on aspects of privacy-preserving data discovery, which complement and complete the aspects of federated learning investigated by the research group at the University of Bari.
3. In relation to multicomponent interventions, we continued the survey on Retrieval Augmented Generation (RAG) systems and agents based on Large Multimodal Models as components for the realization of autonomous conversational systems to be paired with pre-frail and frail subjects for cognitive monitoring and stimulation purposes.
The following activities were conducted during April-July 2025:
1. In collaboration with project partners, procedures for data acquisition from wearable devices were refined through daily verification and direct support of patients undergoing monitoring to ensure correct and consistent use of wearable devices and daily synchronization of devices with the centralized collection system throughout the survey period.
2. Research on privacy-preserving federated learning continued by investigating the use of the Flower platform in real clinical settings and evaluating the performance of federated algorithms on anonymized datasets available on the Web. Specifically, pending the availability of the data collected by the Project, the performance of the main federated learning techniques on datasets related to aging-related diseases in cardiology and gynecology was analyzed.
In addition, the collaboration with the company Asclepyus s.r.l. on privacy-preserving data discovery has continued, leading to the definition of a potentially adoptable architectural model to continue the data collection and analysis activities of Age-It even after the end of the project.
3. The constant dialogue between the technology team and the clinical team has highlighted the need to further investigate aspects related to the explainability of the adopted Machine Learning techniques. Therefore, after a state-of-the-art analysis of the main eXplainable AI (XAI) techniques adopted in the clinical setting, experimentation with feature-based XAI techniques was initiated, especially for chronic cardiac diseases for which there are well-defined clinical protocols.
The Research Unit "Department of Computer Science - Bari" (hereinafter UdR-DIB) conducted the following activities during the period August-September 2025:
- In collaboration with project partners, technical support for the management of Garmin wearable devices at clinical research units continued. As regards the clinical research unit in Bari, the research group of the Department of Computer Science provided the technical support necessary to set up and reset the devices for the entire duration of the trial, also providing operational support to patients and caregivers to ensure the correct use of the devices at home (charging and wearing). Device synchronization procedures and patient support continued without problems until the end of August 2025. In early September, operations related to Garmin devices were completed.
- The Research Group continued its experimentation on the Flower platform for Privacy Preserving Federated Learning, analyzing anonymized public datasets relating to real patients. Specifically, electrocardiograms were analyzed using an automatic arrhythmia detection and classification system, and cranial MRI images were analyzed for the detection and classification of brain tumors. A dataset containing approximately 10,000 colposcopic images from various online sources was also created and is currently being labeled for the subsequent training of a machine learning model for the early diagnosis of cervical cancer.
- The collaboration with Asclepyus s.r.l. and the partners of the PRIN REDRAW project continued with the definition of a privacy-preserving data discovery protocol that integrates the standard approach to clinical federated learning and simplifies the conduct of federated learning campaigns between multiple clinical institutions.
- Experiments with explainable AI (XAI) techniques continued through the analysis of ECG tracings, including in non-IID settings, using large multimodal models to identify abnormal characteristics in electrocardiographic tracings and classify them accordingly. The results relating to explainability applied to federated learning are currently being evaluated and analyzed together with cardiologist researchers who are experts in the evaluation of ECG tracings.
- With regard to multi-component interventions, research continued into the use of conversational systems based on Large Multimodal Models for the administration of interventions to pre-frail and frail individuals, with the addition of a robotic component to integrate the capabilities of the systems currently reported in the literature with elements of assistive robotics and not just social robotics. The specifications for a feasibility study on the use of robots (Unitree Go2 and G1 models) as digital companions for pre-frail and frail individuals have been defined. An external consultancy contract worth approximately €100,000 is currently being activated to conduct this feasibility study.
The Research Unit "Department of Computer Science - Bari" (hereinafter UdR-DIB) conducted the following activities during the period October-December 2025:
1. In collaboration with EagleProjects S.p.A., the possibilities of employing next-generation commercial anthropomorphic and quadrupedal robots in multicomponent interventions for active and healthy aging were explored. Specifically, an in-depth technical-scientific analysis was conducted on the Unitree G1 (humanoid robot) and Unitree Go2 (quadrupedal robot) robotic platforms, evaluating their integration into a modular technological ecosystem oriented toward monitoring, cognitive stimulation, and rehabilitation of elderly subjects. The analysis covered architectural aspects, hardware and software specifications, Human-Robot Interaction (HRI) modalities, integration with IoMT (Internet of Medical Things) networks, and the use of advanced simulation environments (Isaac Sim, MuJoCo) for training artificial intelligence models. The sim-to-real development pipeline was also examined in depth, which is essential for transferring policies trained in simulation to physical robots, ensuring safety and operational reliability in real clinical settings.
2. The analysis and extension of the open-source Gadgetbridge software was completed, aimed at enabling data collection from smartwatch devices connected to Android smartphones. The work involved implementing functionalities for on-demand device synchronization, overcoming the limitations imposed by dependence on the Fitrockr platform and the proprietary Garmin SDK. This solution offers greater autonomy in monitoring data management and facilitates integration with decentralized data collection architectures, such as those based on federated learning. The software extension also allows for expanding the range of supported wearable devices, increasing operational flexibility in multi-brand data collection.
3. A prototype version of an iOS application based on Apple HealthKit was developed, capable of collecting and analyzing data available on smartwatches connected to Apple smartphones. The application enables the acquisition of vital parameters, physical activity data, and wellness metrics, with local pre-processing and analysis functionalities. The prototype constitutes a complementary element to the Android solution based on Gadgetbridge, ensuring coverage across both major mobile ecosystems and increasing the capacity for recruitment and monitoring of participants in the project's clinical studies.
- S. N. Zeleke, M. Bochicchio, "FedPerAda: Personalized Federated Learning via Local Adapters and Similarity-Aware Aggregation", 2025 IEEE International Conference on Big Data (BigData), Macau, SAR, China, 2025.
- A. T. Gure and M. A. Bochicchio, "Privacy-Preserving Federated Learning for Brain MRI Analysis: A Multi-Modal Approach", 2025 IEEE International Conference on Big Data (BigData), Macau, SAR, China, 2025.
- Bochicchio, M. A., Gure, A. T., & Zeleke, S. N. (2025, April). Machine Learning and Artificial Intelligence at the Edge: Federated Learning for Colposcopy Image Analysis. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2025. Lecture Notes on Data Engineering and Communications Technologies, vol 250. Springer, Cham. https://doi.org/10.1007/978-3-031-87778-0_26.
- A. T. Gure and M. A. Bochicchio, "Privacy-Preserving Cervical Cancer Classification via Federated Learning on Medical Images," 2025 International Conference on Information and Communication Technology for Development for Africa (ICT4DA), Bahir Dar, Ethiopia, 2025, pp. 178-183, doi: 10.1109/ICT4DA67218.2025.11282703.
- Sileshi Nibret Zeleke, Mario Bochicchio, "Enabling Visual and Textual Explanation in Diagnostics: A Federated Learning Approach with Medical Vision-Language Models", XAI-Healthcare workshop co-located with AIME 2025, Pavia, Italy.
- Sileshi Nibret Zeleke, Mario Bochicchio, "Exploring the Potential of Medical Vision Language Models for ECG Interpretation", EXPLIMED (co-located with ECAI 2024), Bologna, Italy.
- Sileshi Nibret Zeleke, Mario Bochicchio, "CALM-ECG: Toward Accurate and Explainable ECG Analysis through Deep Learning and Vision-Language Model Integration" (under preparation, to be submitted).
- Nibret Zeleke, S., Fentie Jember, A., & Bochicchio, M. (2025). Integrating Explainable AI for Effective Malware Detection in Encrypted Network Traffic. arXiv e-prints, arXiv-2501.
- Bochicchio, M. A., Corciulo, S., Symbiosis and Synesthesia in Proactive Conversational Agents for Healthy Ageing, in Proceedings of the 1st International Workshop on Designing and Building Hybrid Human-AI Systems (SYNERGY 2024), Arenzano (Genoa), Italy, June 03, 2024, CEUR Workshop Proceedings. URL: https://ceur-ws.org/Vol-3701/paper10.pdf.
- Bochicchio, M., Zeleke, S.N. (2024). Personalized Federated Learning in Edge-Cloud Continuum for Privacy-Preserving Health Informatics: Opportunities and Challenges. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 203. Springer, Cham. https://doi.org/10.1007/978-3-031-57931-8_36
- Zeleke, S. N., Bochicchio, M. Towards Explainable Federated Learning in Healthcare: A Focus on Heart Arrhythmia Detection. Proceedings of the First Workshop on Explainable Artificial Intelligence for the Medical Domain (EXPLIMED 2024) co-located with 27th European Conference on Artificial Intelligence (ECAI 2024), Santiago de Compostela, Spain, October 20, 2024.
- S. N. Zeleke, A. Fentie Jember and M. Bochicchio, "Encrypted Malicious Network Traffic Detection: Leveraging Attention Mechanism and Markov Chain Sequencing," 2024 International Conference on Information and Communication Technology for Development for Africa (ICT4DA), Bahir Dar, Ethiopia, 2024, pp. 148-153, doi: 10.1109/ICT4DA62874.2024.10777138.
- S. N. Zeleke and M. Bochicchio, "Federated Kolmogorov-Arnold Networks for Health Data Analysis: A Study Using ECG Signal," 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 2024, pp. 8070-8077, doi: 10.1109/BigData62323.2024.10825188.