Leader: Mario Mezzanzanica (Università Milano Bicocca); Other collaborator(s): Fabio Mercorio (Università Milano Bicocca)
This WP builds on the web sources identified in WP 5.2 to implement the architecture for collecting big data from those sources and transforming it into smart statistics. To guarantee the reliability of the statistics over time (e.g., stability of the collected time series, quality of the web sources, and their popularity and usage by people), this WP will:
- provide a quality ranking of the identified web sources,
- implement the data collection pipeline,
- perform analytics on the collected web data, and
- assess whether, and to what extent, smart statistics add value over official statistics in observing phenomena.
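As an illustration of the first objective, the sketch below ranks web sources by a weighted score over the three reliability criteria mentioned above (time-series stability, source quality, popularity). It is a minimal sketch under stated assumptions, not the WP's actual method: the `WebSource` fields, the 0-to-1 scales, the weights, and the example sources are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebSource:
    """Hypothetical description of a web source; all scores are assumed to lie in [0, 1]."""
    name: str
    stability: float   # assumed: stability of the collected time series
    quality: float     # assumed: quality of the source content
    popularity: float  # assumed: popularity and usage by people


# Hypothetical weights; the report does not specify how criteria are combined.
WEIGHTS = {"stability": 0.4, "quality": 0.4, "popularity": 0.2}


def score(source, weights=WEIGHTS):
    """Weighted sum of the criteria named in `weights`."""
    return sum(w * getattr(source, criterion) for criterion, w in weights.items())


def rank(sources, weights=WEIGHTS):
    """Sort sources from highest to lowest score, breaking ties by name."""
    return sorted(sources, key=lambda s: (-score(s, weights), s.name))


# Illustrative, made-up sources (not real measurements).
sources = [
    WebSource("source_c", stability=0.3, quality=0.4, popularity=1.0),
    WebSource("source_a", stability=0.9, quality=0.8, popularity=0.5),
    WebSource("source_b", stability=0.5, quality=0.7, popularity=0.9),
]

ranking = [s.name for s in rank(sources)]
print(ranking)
```

A weighted sum is only one possible choice; it is used here because it keeps the contribution of each criterion easy to inspect and adjust.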
Brief description of the activities and of the intermediate results
From November 2023 to March 2024, we focused on two closely related research lines. First, we explored the design and definition of a data model for navigating, querying, and connecting the indicators and data to be integrated into the Age-It data lakehouse architecture. This work primarily analyzed data on aging indicators gathered from a variety of sources, including reports, surveys, and studies from international organizations, national institutes, research centers, non-profit organizations, and the academic community. We used semantic modeling techniques to represent these indicators as nodes in a Knowledge Graph, capturing their interrelationships and domain-specific characteristics. Second, we completed and released a data survey to collect and analyze the data features to be included in the data lakehouse.
Main policy, industrial and scientific implications
By adopting a graph-based approach, we have developed a Knowledge Graph for aging population indicators, providing both a visual and an analytical tool to explore how demographic trends, health outcomes, socioeconomic factors, and other aging-related dimensions are interconnected. This graph-based representation makes it easier to identify key indicators and how they bear on the complexities of aging. By examining the connections between indicators, we can uncover interdependencies that are crucial for understanding aging dynamics. The method also enables us to assess how multiple disciplines are integrated in the study of aging trends.
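The idea of indicators as nodes and relationships as edges can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's data model: the `IndicatorGraph` class, the indicator names, the domain labels, and the generic `related_to` relation are all hypothetical placeholders.

```python
from collections import defaultdict


class IndicatorGraph:
    """Tiny undirected graph: nodes are indicators, edges carry a relation label."""

    def __init__(self):
        self.domain = {}                # indicator name -> domain label
        self.edges = defaultdict(dict)  # indicator -> {neighbour: relation label}

    def add_indicator(self, name, domain):
        self.domain[name] = domain

    def relate(self, a, b, relation="related_to"):
        if a not in self.domain or b not in self.domain:
            raise KeyError("both indicators must be added before relating them")
        self.edges[a][b] = relation
        self.edges[b][a] = relation

    def neighbours(self, name):
        return sorted(self.edges.get(name, {}))

    def cross_domain_links(self):
        """Edges whose endpoints belong to different domains."""
        links = set()
        for a, nbrs in self.edges.items():
            for b in nbrs:
                if self.domain[a] != self.domain[b]:
                    links.add(tuple(sorted((a, b))))
        return sorted(links)

    def most_connected(self):
        """Indicator with the highest number of direct connections."""
        return max(self.domain, key=lambda n: (len(self.edges.get(n, {})), n))


# Hypothetical indicators and links, for illustration only.
g = IndicatorGraph()
g.add_indicator("life_expectancy", "health")
g.add_indicator("chronic_disease_prevalence", "health")
g.add_indicator("old_age_dependency_ratio", "demographic")
g.add_indicator("pension_expenditure", "socioeconomic")

g.relate("life_expectancy", "chronic_disease_prevalence")
g.relate("life_expectancy", "old_age_dependency_ratio")
g.relate("old_age_dependency_ratio", "chronic_disease_prevalence")
g.relate("old_age_dependency_ratio", "pension_expenditure")

print(g.most_connected())
print(g.cross_domain_links())
```

In this toy setting, counting direct connections stands in for "identifying key indicators", and listing cross-domain edges stands in for assessing how disciplines are linked; a production Knowledge Graph would typically use a dedicated graph store and richer relation types.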
- Tool Development on “Synthetic Data Generation Methods” (target: experts in various fields): Work has begun on a tool to help experts across different domains select the most appropriate synthetic data generation method.
- Interviews and Data Collection: Inter-spoke interviews have been conducted with WP leaders and task leaders about the primary indicators used within their projects. The results will be used to develop a dashboard aimed at a wide audience and to build a data lakehouse.
- Graph Expansion: The spoke-indicators graph has been expanded to include both scientific publications and institutional connections and interactions.
- Policy Brief: A policy brief related to synthetic data generation methodologies and their selection process will be developed based on ongoing work.
- Dissemination Activities: Dissemination efforts are ongoing as the project continues to engage with stakeholders and the wider scientific community.
This quarter, key milestones were achieved across several initiatives:
- Presented at The Demography of Ageing: State-of-the-Art and Challenges workshop in Rome, contributing to discussions on advancements in the field.
- Initiated collaboration with ISTAT on synthetic data generation, focusing on developing robust methodologies for statistical applications.
- Finalized the PrivGen methodology, a human-in-the-loop method for generating private synthetic data that balances data utility and privacy.
- Conducted multiple interviews with Spoke 1 to document the data in use, metrics generated, and their implications. These interviews also served as a foundation for developing a proof of concept (PoC) for a dashboard aimed at disseminating project knowledge to the general public.
- Analyzed the scientific outputs submitted for the Age-It General Meeting in Venice in order to design a database and develop a web-based tool that summarizes the project's scientific contributions, enabling advanced analytics to enhance the visibility of contributors, thematic focus, and core ideas related to aging.
- Progressed toward finalizing a platform for internal collaboration, reporting, and metric analysis, providing an integrated tool for research coordination and evaluation.