- Automatic Sub-Task Focus: LifeInsight’s Contribution
  In this paper, we introduce the automation approach of LifeInsight, a retrieval system designed explicitly for the NTCIR-17 Lifelog-5 Automatic Task, facilitating a seamless search experience and efficient data mining. Our method entails a two-fold process, where we first enrich the metadata from the raw query, followed by the composition of the retrieval method from input entities.
  NTCIR-17, 2023
- Lifelog Discovery Assistant: Suggesting Prompts and Indexing Event Sequences for FIRST
  In this paper, we present the fourth iteration of our participating system FIRST. For this year, we adopt generative models to equip the system with predictive ability rather than entirely relying on the user to input the query. We also index a sequence of images as an event for improved search speed. Finally, we demonstrate how the additional features can assist users in searching.
  Proceedings of the 6th Annual ACM Lifelog Search Challenge, 2023
- ANIMAR: Text/Sketch-based 3D Animal Fine-Grained Retrieval
  This paper presents a novel SHREC challenge track focusing on text-based fine-grained retrieval of 3D animal models. Unlike previous SHREC challenge tracks, the proposed task is considerably more challenging, requiring participants to develop innovative approaches to tackle the problem of text-based retrieval.
  Computers & Graphics, 2023
- V-FIRST 2.0: Video Event Retrieval with Flexible Textual-Visual Intermediary for VBS 2023
  We present a new version of our interactive video retrieval system V-FIRST. Besides the existing features of querying by textual descriptions and visual examples, we propose the usage of an image generator that can generate images from a text prompt as a means to bridge the domain gap. We also include a novel referring expression segmentation module to highlight the objects in an image. This is the first step towards providing adequate explainability to retrieval results, ensuring that the system can be trusted and used in domain-specific and critical scenarios.
  MultiMedia Modeling: 29th International Conference, MMM 2023, Bergen, Norway, 2023
- Semi-supervised Organ Segmentation with Mask Propagation Refinement and Uncertainty Estimation for Data Generation
  We present a novel two-staged method that employs various 2D-based techniques to deal with the 3D segmentation task. In most of the previous challenges, it is unlikely for 2D CNNs to be comparable with other 3D CNNs since 2D models can hardly capture temporal information. In light of that, we propose using the recent state-of-the-art technique in video object segmentation, combining it with other semi-supervised training techniques to leverage the extensive unlabeled data...
  MICCAI 2022 Challenge, FLARE, 2022
- Text query based traffic video event retrieval with global-local fusion embedding
  Retrieving event videos based on textual description is a promising research topic in the fast-growing data field. However, traffic data grows every day, so intelligent traffic management systems that work in conjunction with humans are essential to speed up the search. We propose a multi-module system that delivers accurate results while meeting the objectives of explainability and scalability at the same time...
  IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
- Pothole and crack detection in the road pavement using images and RGB-D data
  This paper describes the methods submitted for evaluation to the SHREC 2022 track on pothole and crack detection in the road pavement. A total of 7 different runs for the semantic segmentation of the road surface are compared, 6 from the participants plus a baseline method. All methods exploit Deep Learning techniques and their performance is tested using the same environment...
  Computers & Graphics, 2022
- Facial Data De-identification with Adversarial Generation and Perturbation Methods
  The 2021 MediaEval Multimedia Evaluation introduces a new data de-identification task, whose goal is to explore methods for obscuring driver identity in driver-facing video recordings while maintaining visible human behavioral information. Interested in the challenge, our HCMUS team participates by searching for different ideas to tackle the problem...
  MediaEval, 2021
- Retrieval of cultural heritage objects
  This paper presents the methods and results of the SHREC’21 track on a dataset of cultural heritage (CH) objects. We present a dataset of 938 scanned models that have varied geometry and artistic styles. For the competition, we propose two challenges: the retrieval-by-shape challenge and the retrieval-by-culture challenge. The former aims at evaluating the ability of retrieval methods to discriminate cultural heritage objects by overall shape...
  Computers & Graphics, 2021
- Efficient methods of Metadata Embedding and Augmentation for Visual Sentiment Analysis
  The Visual Sentiment Analysis Task, a new task in the Multimedia Evaluation 2021 Challenge, concentrates on recognizing emotional responses to natural disaster images. Our team applies different approaches based on multiple pretrained models and several techniques to deal with the 3 subtasks, each of which has a different set of labels...
  MediaEval, 2021
- Image-Text Fusion for Automatic News-Images Re-Matching
  Matching text and images based on their semantics plays an important role in cross-media retrieval. In news in particular, the connection between text and images is highly ambiguous. In the context of the MediaEval 2020 Challenge, we propose three multi-modal methods for mapping text and images of news articles to a shared space in order to perform efficient cross-retrieval...
  MediaEval, 2020
- Extended Monocular Image Based 3D Model Retrieval
  Monocular image based 3D object retrieval has attracted increasing attention in the field of 3D object retrieval. However, research on 3D object retrieval based on 2D images is still challenging, mainly because of the gap between data from different modalities. To further support this research, we extend the previous track SHREC19'MI3DOR to organize this track, and we construct the expanded monocular image based 3D object retrieval benchmark...
  The Eurographics Association, 2020
- Scene Category Protection with Back Propagation and Image Enhancement
  Personal privacy is one of the essential problems in modern society. In some cases, people may not want smart computing systems to automatically identify and reveal their personal information, such as places or habits. This motivates our proposal to protect scene category recognition from photos by back-propagation...
  MediaEval, 2019
- Multimodal Personal Health Lifelog Data Analysis with Inference from Multiple Sources and Attributes
  When collecting and processing data recorded by sensors for any application, noisy and missing data is an important problem that needs to be addressed. This paper presents two approaches we use to predict missing air quality data...
  MediaEval, 2019