7 results for au:Costa_W in:cs

Indoor scene recognition from images under visual corruptions
Willams de Lima Costa, Raul Ismayilov, Nicola Strisciuglio, Estefania Talavera Martinez
Aug 26 2024 cs.CV arXiv:2408.13029v1

@misc{2408.13029, author = {Willams de Lima Costa and Raul Ismayilov and Nicola Strisciuglio and Estefania Talavera Martinez}, title = {{I}ndoor scene recognition from images under visual corruptions}, year = {2024}, eprint = {2408.13029}, note = {arXiv:2408.13029v1} }
The classification of indoor scenes is a critical component in various applications, such as intelligent robotics for assistive living. While deep learning has significantly advanced this field, models often suffer from reduced performance due to image corruption. This paper presents an innovative approach to indoor scene recognition that leverages multimodal data fusion, integrating caption-based semantic features with visual data to enhance both accuracy and robustness against corruption. We examine two multimodal networks that synergize visual features from CNN models with semantic captions via a Graph Convolutional Network (GCN). Our study shows that this fusion markedly improves model performance, with notable gains in Top-1 accuracy when evaluated against a corrupted subset of the Places365 dataset. Moreover, while standalone visual models displayed high accuracy on uncorrupted images, their performance deteriorated significantly with increased corruption severity. Conversely, the multimodal models demonstrated improved accuracy in clean conditions and substantial robustness to a range of image corruptions. These results highlight the efficacy of incorporating high-level contextual information through captions, suggesting a promising direction for enhancing the resilience of classification systems.
BoFire: Bayesian Optimization Framework Intended for Real Experiments
Johannes P. Dürholt, Thomas S. Asche, Johanna Kleinekorte, Gabriel Mancino-Ball, Benjamin Schiller, Simon Sung, Julian Keupp, Aaron Osburg, Toby Boyne, Ruth Misener, Rosona Eldred, Wagner Steuer Costa, Chrysoula Kappatou, Robert M. Lee, Dominik Linzner, David Walz, Niklas Wulkow, Behrang Shafei
Aug 12 2024 cs.LG math.OC stat.ML arXiv:2408.05040v1

@misc{2408.05040, author = {Johannes P.~Dürholt and Thomas S.~Asche and Johanna Kleinekorte and Gabriel Mancino-Ball and Benjamin Schiller and Simon Sung and Julian Keupp and Aaron Osburg and Toby Boyne and Ruth Misener and Rosona Eldred and Wagner Steuer Costa and Chrysoula Kappatou and Robert M.~Lee and Dominik Linzner and David Walz and Niklas Wulkow and Behrang Shafei}, title = {{B}o{F}ire: {B}ayesian {O}ptimization {F}ramework {I}ntended for {R}eal {E}xperiments}, year = {2024}, eprint = {2408.05040}, note = {arXiv:2408.05040v1} }
Our open-source Python package BoFire combines Bayesian Optimization (BO) with other design of experiments (DoE) strategies, focusing on developing and optimizing new chemistry. Previous BO implementations, whether in the literature or in software, require substantial adaptation for effective real-world deployment in the chemical industry. BoFire provides a rich feature set with extensive configurability and realizes our vision of fast-tracking research contributions into industrial use via maintainable open-source software. Owing to quality-of-life features like JSON-serializability of problem formulations, BoFire enables seamless integration of BO into RESTful APIs, a common architecture component for both self-driving laboratories and human-in-the-loop setups. This paper discusses the differences between BoFire and other BO implementations and outlines ways that BO research needs to be adapted for real-world use in a chemistry setting.
VWise: A novel benchmark for evaluating scene classification for vehicular applications
Pedro Azevedo, Emanuella Araújo, Gabriel Pierre, Willams de Lima Costa, João Marcelo Teixeira, Valter Ferreira, Roberto Jones, Veronica Teichrieb
Jun 06 2024 cs.CV arXiv:2406.03273v1

@misc{2406.03273, author = {Pedro Azevedo and Emanuella Araújo and Gabriel Pierre and Willams de Lima Costa and João Marcelo Teixeira and Valter Ferreira and Roberto Jones and Veronica Teichrieb}, title = {{VW}ise: {A} novel benchmark for evaluating scene classification for vehicular applications}, year = {2024}, eprint = {2406.03273}, note = {arXiv:2406.03273v1} }
Current datasets for vehicular applications are mostly collected in North America or Europe. Models trained or evaluated on these datasets might suffer from geographical bias when deployed in other regions. Specifically, for scene classification, a highway in a Latin American country differs drastically from an Autobahn, for example, both in design and maintenance levels. We propose VWise, a novel benchmark for road-type classification and scene classification tasks, in addition to tasks focused on external contexts related to vehicular applications in LatAm. We collected over 520 video clips covering diverse urban and rural environments across Latin American countries, annotated with six classes of road types. We also evaluated several state-of-the-art classification models in baseline experiments, obtaining over 84% accuracy. With this dataset, we aim to enhance research on vehicular tasks in Latin America.
ST-Gait++: Leveraging spatio-temporal convolutions for gait-based emotion recognition on videos
Maria Luísa Lima, Willams de Lima Costa, Estefania Talavera Martinez, Veronica Teichrieb
May 24 2024 cs.CV arXiv:2405.13903v1

@misc{2405.13903, author = {Maria Luísa Lima and Willams de Lima Costa and Estefania Talavera Martinez and Veronica Teichrieb}, title = {{ST}-{G}ait++: {L}everaging spatio-temporal convolutions for gait-based emotion recognition on videos}, year = {2024}, eprint = {2405.13903}, note = {arXiv:2405.13903v1} }
Emotion recognition is relevant for human behaviour understanding, where facial expression and speech recognition have been widely explored by the computer vision community. Literature in the field of behavioural psychology indicates that gait, described as the way a person walks, is an additional indicator of emotions. In this work, we propose a deep framework for emotion recognition through the analysis of gait. More specifically, our model is composed of a sequence of spatial-temporal Graph Convolutional Networks that produce a robust skeleton-based representation for the task of emotion classification. We evaluate our proposed framework on the E-Gait dataset, composed of a total of 2177 samples. The results obtained represent an improvement of approximately 5% in accuracy compared to the state of the art. In addition, during training we observed a faster convergence of our model compared to the state-of-the-art methodologies.
Leveraging Previous Facial Action Units Knowledge for Emotion Recognition on Faces
Pietro B. S. Masur, Willams Costa, Lucas S. Figueredo, Veronica Teichrieb
Nov 21 2023 cs.CV cs.AI cs.LG arXiv:2311.11980v1

@misc{2311.11980, author = {Pietro B.~S.~Masur and Willams Costa and Lucas S.~Figueredo and Veronica Teichrieb}, title = {{L}everaging {P}revious {F}acial {A}ction {U}nits {K}nowledge for {E}motion {R}ecognition on {F}aces}, year = {2023}, eprint = {2311.11980}, note = {arXiv:2311.11980v1} }
People naturally understand emotions, thus permitting a machine to do the same could open new paths for human-computer interaction. Facial expressions can be very useful for emotion recognition techniques, as these are the biggest transmitters of non-verbal cues capable of being correlated with emotions. Several techniques are based on Convolutional Neural Networks (CNNs) to extract information in a machine learning process. However, simple CNNs are not always sufficient to locate points of interest on the face that can be correlated with emotions. In this work, we intend to expand the capacity of emotion recognition techniques by proposing the usage of Facial Action Units (AUs) recognition techniques to recognize emotions. This recognition will be based on the Facial Action Coding System (FACS) and computed by a machine learning system. In particular, our method expands over EmotiRAM, an approach for multi-cue emotion recognition, in which we improve over their facial encoding module.
High-Level Context Representation for Emotion Recognition in Images
Willams de Lima Costa, Estefania Talavera Martinez, Lucas Silva Figueiredo, Veronica Teichrieb
May 08 2023 cs.CV cs.HC arXiv:2305.03500v1

@misc{2305.03500, author = {Willams de Lima Costa and Estefania Talavera Martinez and Lucas Silva Figueiredo and Veronica Teichrieb}, title = {{H}igh-{L}evel {C}ontext {R}epresentation for {E}motion {R}ecognition in {I}mages}, year = {2023}, eprint = {2305.03500}, note = {arXiv:2305.03500v1} }
Emotion recognition is the task of classifying perceived emotions in people. Previous works have utilized various nonverbal cues to extract features from images and correlate them to emotions. Of these cues, situational context is particularly crucial in emotion perception since it can directly influence the emotion of a person. In this paper, we propose an approach for high-level context representation extraction from images. The model relies on a single cue and a single encoding stream to correlate this representation with emotions. Our model competes with the state-of-the-art, achieving an mAP of 0.3002 on the EMOTIC dataset while also being capable of execution on consumer-grade hardware at approximately 90 frames per second. Overall, our approach is more efficient than previous models and can be easily deployed to address real-world problems related to emotion recognition.
Multi-Cue Adaptive Emotion Recognition Network
Willams Costa, David Macêdo, Cleber Zanchettin, Lucas S. Figueiredo, Veronica Teichrieb
Nov 04 2021 cs.CV cs.HC cs.MM arXiv:2111.02273v2

@misc{2111.02273, author = {Willams Costa and David Macêdo and Cleber Zanchettin and Lucas S.~Figueiredo and Veronica Teichrieb}, title = {{M}ulti-{C}ue {A}daptive {E}motion {R}ecognition {N}etwork}, year = {2021}, eprint = {2111.02273}, note = {arXiv:2111.02273v2} }
Expressing and identifying emotions through facial and physical expressions is a significant part of social interaction. Emotion recognition is an essential task in computer vision due to its various applications, mainly because it allows a more natural interaction between humans and machines. The common approaches for emotion recognition focus on analyzing facial expressions and require the automatic localization of the face in the image. Although these methods can correctly classify emotion in controlled scenarios, such techniques are limited when dealing with unconstrained daily interactions. We propose a new deep learning approach for emotion recognition based on adaptive multi-cues that extract information from context and body poses, which humans commonly use in social interaction and communication. We compare the proposed approach with the state-of-the-art approaches on the CAER-S dataset, evaluating different components in a pipeline that reached an accuracy of 89.30%.