Paolo Rota
Temporal Grounding for Video Understanding
Friday, 18 October 2024
This thesis explores temporal grounding for video understanding, with a focus on accurately identifying and localizing specific actions or events within untrimmed video streams. The research aims to develop methods that align semantic concepts with precise temporal segments, addressing challenges such as temporal ambiguity, overlapping actions, and real-world variability. Potential applications include video summarization, activity recognition, and event detection in domains such as sports analysis, surveillance, and human-computer interaction. The project will build on state-of-the-art deep learning architectures and experiment with temporal attention mechanisms, sequence models, and multimodal approaches.
Bias analysis and mitigation in VLMs (case study: Dementia)
Thursday, 26 September 2024
Visual Language Models (VLMs) are essential in the creation of digital visual content, but biases in their training data can compromise their effectiveness and reinforce stigma around topics such as dementia. Our study examines how dementia-related biases in VLMs are perceived. It aims to promote more inclusive VLMs by analyzing and quantifying these biases and by proposing techniques to mitigate their effects.