The IKR3 laboratory is involved in numerous projects that address issues related to Information Retrieval, Natural Language Processing, Artificial Intelligence, and Social Computing.
AMAR: Adaptive Models for context-Aware Representation and understanding of multimedia content
The AMAR project is funded by the European Union – Next Generation EU within the project NRPP M4C2, Investment 1.3, DD. 341 – 15 March 2022 – FAIR – Future Artificial Intelligence Research – Spoke 4 – PE00000013 – D53C22002380006.
The main purpose of the project is to study the theoretical foundations of multimedia content adaptivity to context via approaches based on neurosymbolic methods, having computer vision and natural language processing as privileged application domains. New approaches will be studied to combine multiple representations of multimedia information and to understand and model user interactions through multimodal channels like language, audio, and vision in various contexts and application domains. For this purpose, we will design methodologies capable of injecting contextual information and adapting to diverse representations and modalities. Knowledge injection into neural models will be implemented either as an implicit manipulation of the latent space or with a post-hoc update of the output. It will help to account for the context under consideration and mitigate the risk of generating inadequate output while allowing users to scrutinize it. Detecting relevant use patterns, context awareness and adaptivity, output interpretability and explainability are the main pillars of our approach. In particular, solutions to guarantee adaptive explanations targeted toward the intended recipient will be explored. Furthermore, we aim to reduce the amount of resources (time, memory, and energy) used in training, fine-tuning, and deploying the predictive and generative models. All the results, including data sets and experiments generated, will be made publicly available.
MoT – The Measure of Truth: An Evaluation-Centered Machine-Human Hybrid Framework for Assessing Information Truthfulness
The MoT project is funded by the Italian Ministry of Research under the PRIN 2022 call with two research units involved: the University of Udine (Prof. Stefano Mizzaro) and the University of Milano-Bicocca (Prof. Gabriella Pasi).
In recent years, the proliferation of distinct forms of false information online has raised the challenge of verifying the truthfulness of such content. To this end, the scientific community has developed computational approaches (based on machine and deep learning) as well as human-in-the-loop solutions (based on crowdsourcing). The MoT project aims at defining:
- A novel evaluation framework to assess the effectiveness of approaches aimed at detecting information truthfulness;
- Novel hybrid solutions that combine state-of-the-art automatic and human-in-the-loop approaches.
KURAMi: Knowledge-based, explainable User empowerment in Releasing private data and Assessing Misinformation in online environments
The KURAMi project is funded by the Italian Ministry of Research under the PRIN 2022 call with three research units involved: the University of Turin (Prof. Luigi Di Caro), the University of Milan (La Statale) (Prof. Giovanni Livraga) and the University of Milano-Bicocca (Prof. Marco Viviani).
The KURAMi Project stems from recent developments in how EU regulations and guidelines address the protection of users’ data and privacy and users’ access to potential misinformation. A key aspect is the balance between the rights to confidentiality, autonomous decision-making, free access to information, and freedom of expression. To this end, the Project aims to:
- Define explainable knowledge-based solutions enabling users to: (i) specify and have enforced protection requirements over their personal/sensitive data and (ii) assess the risk of personal/sensitive information disclosure associated with the released data;
- Define knowledge-based and explainable solutions enabling users to: (i) access genuine content in an environment of free access to information and freedom of expression, and (ii) be informed of how the process of assessing the genuineness of information has unfolded;
- Investigate the relationship between the release of personal/sensitive data and access to potential misinformation, both before and after the adoption of the proposed solutions.
https://kurami.disco.unimib.it/
LeMuR: Learning with Multiple Representations
LeMuR is an MSCA (Marie Skłodowska-Curie Actions) Doctoral Network (DN) 2021 on Learning with Multiple Representations. The goal of LeMuR is to develop the theoretical foundations and a first set of algorithms for the new “Learning with Multiple Representations” (LMR) paradigm. Moreover, corresponding applications will be developed to demonstrate the usefulness of the new family of approaches. Specifically, LMR algorithms will allow flexible representations (e.g., suitable for explainability, fairness, …) with diverse target functions (e.g., incorporating environmental or even social impact) so as to make the induced models abide by the Green Charter and trustworthy AI criteria by design. The project will focus on learning with weak supervision because it addresses one of the major flaws of modern ML approaches, i.e., their data hunger, by means of weaker sources of labeling for training data. The outcome of the DN will be a set of 10 experts trained to implement the third and subsequent waves of AI in Europe. The highly interdisciplinary and intersectoral context in which they will be trained will provide them with research-related and transferable competencies relevant to successful careers in central AI areas.
PINPOINT: exPlaInable kNowledge-aware PrOcess INTelligence
The PINPOINT Project, funded by the Italian Ministry of Research under the PRIN 2021 call, aims to develop methods for including background knowledge and general intelligence in business process models. To achieve this goal, the project will develop new representation languages based on temporal specifications, and develop neural models for predicting future behavior, especially in the presence of uncertainty and adaptive behavior by users. A challenge for the development of these methods is to keep them interpretable and to provide meaningful explanations for the decisions and choices made by a system to any interested party. The project results are expected to be applied in two settings within the medical and telecommunications domains.
The Project involves the Free University of Bozen-Bolzano, the Sapienza University of Rome, the University of Calabria, the University of Milano-Bicocca (Prof. Rafael Peñaloza), and the Italian National Research Council.
DoSSIER: Domain-Specific Systems for Information Extraction and Retrieval
DoSSIER is an EU Horizon 2020 ITN/ETN on Domain-Specific Systems for Information Extraction and Retrieval. DoSSIER will elucidate, model, and address the different information needs of professional users. It mobilizes an excellent and highly synergistic team of world-leading Information Retrieval (IR) experts from 5 EU States who, together with 3 academic partners (universities in the US, Japan, and Australia) and 11 industrial partners (dynamic SMEs and large corporations), will produce fundamental insights into how users comprehend, formulate, and access information in professional environments.
PerLIR: Personal Linguistic Resources in Information Retrieval
The PerLIR project is funded by the Italian Ministry of Research under the PRIN 2019 call with two research units involved: the University of Milano-Bicocca (Prof. Gabriella Pasi) and the Sapienza University of Rome (Prof. Roberto Navigli, head of the Linguistic Computing Laboratory, http://lcl.uniroma1.it, Department of Computer Science).
The aim of the project is to provide groundbreaking techniques able to bridge the gap between Information Retrieval and multilingual Natural Language Processing to innovate:
- The way a user model is created, thanks to the automatic creation of language-independent personal linguistic resources.
- The exploitation of personal linguistic resources in Information Retrieval, to show the benefit of a language-independent, scalable user representation (customized to both the user's preferences and their language usage) in retrieving the most relevant results for that user’s queries.