Natural Language Processing
Automated medical report generation for X-ray exams, red-flag detection in public bidding documents, hate speech and sexism detection, predictive analytics for femicide case sentencing, and seismic trace language modeling applied to natural gas segmentation.
The Natural Language Processing group develops systems that understand, interpret, and generate human language across medical, legal, and social domains.
Medical Report Generation
XRaySwinGen is an automatic medical reporting system for chest X-ray exams, built on a multimodal model that combines vision transformers with language generation. The work was published in Heliyon (2024). A follow-up paper extending the approach with relational memory multimodal models was accepted to HCist 2025.
Legal and Public Administration NLP
Detection of red-flag clauses in public bidding documents, supporting government auditors in identifying potentially fraudulent or non-compliant procurement processes.
Predictive analytics for femicide case outcomes and sentencing guidelines, assisting the justice system with evidence-based decision support.
Social Media Analysis
Hate speech and sexism detection in Brazilian Portuguese social media content, contributing to safer online environments and supporting policy research.
Seismic Trace Modeling
The Tracepiece system applies language modeling to seismic trace sequences. Raw seismic amplitude values are quantized, assembled into “T-sentences,” and tokenized using a subword algorithm analogous to SentencePiece. Transformer-based architectures then consume the resulting embeddings for natural gas detection and segmentation in seismic reflection images. The work was published in Neural Computing & Applications (2024).
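The quantize, assemble, and tokenize steps can be illustrated with a minimal Python sketch. This is not the Tracepiece implementation. The uniform binning, the one-letter-per-bin encoding of a T-sentence, the BPE-style merge procedure standing in for the subword algorithm, and all function names and parameters (`quantize`, `to_t_sentence`, `learn_merges`, `tokenize`, `n_bins`, `lo`, `hi`) are assumptions made for illustration only.

```python
from collections import Counter


def quantize(trace, n_bins=8, lo=-1.0, hi=1.0):
    """Map each amplitude to a bin index in [0, n_bins - 1] (uniform bins, assumed)."""
    width = (hi - lo) / n_bins
    bins = []
    for amplitude in trace:
        clipped = min(max(amplitude, lo), hi)  # clip out-of-range samples
        bins.append(min(int((clipped - lo) / width), n_bins - 1))
    return bins


def to_t_sentence(bins):
    """Encode bin indices as letters, one per sample (hypothetical encoding, <= 26 bins)."""
    return "".join(chr(ord("a") + b) for b in bins)


def apply_merge(seq, pair):
    """Replace every adjacent occurrence of `pair` in `seq` with the merged symbol."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_merges(sentences, n_merges):
    """Greedily learn BPE-style merges: repeatedly fuse the most frequent adjacent pair."""
    seqs = [list(s) for s in sentences]
    merges = []
    for _ in range(n_merges):
        pairs = Counter()
        for seq in seqs:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        # Break frequency ties by the pair itself so the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        seqs = [apply_merge(seq, best) for seq in seqs]
    return merges


def tokenize(sentence, merges):
    """Split a T-sentence into subword tokens by replaying the learned merges in order."""
    seq = list(sentence)
    for pair in merges:
        seq = apply_merge(seq, pair)
    return seq


if __name__ == "__main__":
    traces = [[-0.9, 0.1, -0.9, 0.1, 0.8], [-0.9, 0.1, 0.5]]
    sentences = [to_t_sentence(quantize(t)) for t in traces]
    merges = learn_merges(sentences, n_merges=3)
    for s in sentences:
        print(s, tokenize(s, merges))
```

In a full pipeline, each token would be mapped to an integer id and then to an embedding vector before being passed to the transformer. That stage is omitted here because the source does not describe it in detail.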