Accepted Papers

We are very happy to announce this year’s main conference papers, including our IJDAR track and competition papers!

Journal Track Papers

ID Authors Title
2025-565 Laziz Hamdi
Amine Tamasna
Pascal Boisson
Thierry Paquet
TableSeq: Unified Generation of Structure, Content, and Layout
2025-561 Laziz Hamdi
Amine Tamasna
Pascal Boisson
Thierry Paquet
PILOT: A Promptable Interleaved Layout-aware OCR Transformer
2025-560 Anik De
Abhirama Subramanyam Penamakuru
Rajeev Yadav
Aditya Rathore
Harshiv Shah
Devesh Sharma
Sagar Agarwal
Pravin Kumar
Anand Mishra
Bharat Scene Text: A Novel Comprehensive Dataset and Benchmark for Indian Language Scene Text Understanding
2025-548 Tunga Tessema Chamisso
Blessed Guda
Bereket Retta Adego
Carmel Prosper Sagbo
Gabrial Zencha Ashungafac
Assane Gueye
Fidel: A Large-Scale Sentence Level Amharic OCR Dataset
2025-539 Lukas Arzoumanidis
Julius Knechtel
Jan-Henrik Haunert
Youness Dehbi
Automatic Uncertainty-Aware Synthetic Data Bootstrapping for Historical Map Segmentation
2025-524 Ségolène Albouy
Somkeo Norindr
Paul Kervegan
Fouad Aouniti
Rémy Delanaux
Clara Grometto
Robin Champenois
Stavros Lazaris
Alexandre Guilbaud
Matthieu Husson
Mathieu Aubry
AIKON: A Modular Computer Vision Platform for Historical Corpora
2025-520 Gianluca Dalmasso
Patric Reineri
Mathieu Pscherer Noel
Ninon Achard
Beatrice Caseau
Laurence Likforman Sulem
Davide Cavagnino
Maurizio Lucenteforte
Attilio Fiandrotti
Victoria Eyharabide
Reviving Medieval Byzantine Seals: A Synthetic-to-Real Approach to Character Recognition
2025-504 Rui-Yang Ju
Kohei Yamashita
Hirotaka Kameko
Shinsuke Mori
DKDS: A Benchmark Dataset of Degraded Kuzushiji Documents with Seals for Detection and Binarization
2025-482 Florent Imbert
Simon Corbillé
Hui Han
Elisa H. Barney Smith
A Novel Domain Adaptation Based Pipeline for Character Classification and Handwritten Recognition
2025-479 Phillipe R. Sampaio
Helene Maxicci
Unsupervised Document and Template Clustering using Multimodal Embeddings
2025-468 Anna Zhu
Wei Pan
Guan Li
Hongyi Cai
Kenji Brian
HQ-Font: Few-shot Font Generation via Transferring Hierarchical Quantization Styles
2025-403 Enrique Vidal
Alejandro H. Toselli
Predicting Text Recognition Word Error Rate of Image Documents Without Ground Truth Transcripts

Competition Papers

ID Authors Title
C1 Fahad Ahmed
Jennifer D’Souza
Sören Auer
ICDAR 2026 Competition on Information Extraction from Atomic Layer Deposition/Etching (ALD/E) Scientific Figures
C2 Artemis Llabrés
Marc Serra Ortega
Tomás Ockier
Samuel Ortega Cuadra
Amritpal Singh
Christos Georgakilas
Andrey Barsky
Ernest Valveny
Dimosthenis Karatzas
ICDAR 2026 Competition on Multimodal Reasoning over Documents in Multiple Domains
C3 Dominique Stutzmann
Riham Aida Mokrani
Franco Tomasi
Elena Pierazzo
ICDAR 2026 FalsID Competition on Falsification and Imitation Detection
C4 Adrià Molina Rodríguez
Carles Boned Riera
Pau Torras
Oriol Ramos Terrades
Josep Lladós
ICDAR 2026 Competition on Long-Term Handwriting Author Identification
C5 Benjamin Kiessling
Agnès Boutreux
Bram Cars
Matthias Gille Levenson
Mike Kestemont
Anna Michalcová
Ariane Pinche
Caroline Vandyck
Malamatenia Vlachou Efstathiou
Thibault Clérice
ICDAR 2026 Competition on Multilingual Medieval Handwriting Recognition
C6 Maud Ehrmann
Emanuela Boros
Juri Opitz
Andrianos Michail
Florian Wagner
Simon Clematide
ICDAR 2026 HIPE-OCRepair Competition on LLM-Assisted OCR Post-Correction for Historical Documents
C7 Nicholas R. Howe
Aaron Hershkowitz
ICDAR 2026 Competition in Text Recognition on Greek Squeezes
C8 Thomas Gorges
Janne van der Loop
Lukas Hüttner
Linda-Sophie Schneider
Fei Wu
Mathias Seuret
Vincent Christlein
ICDAR 2026 Competition on Writer Identification and Pen Classification from Hand-Drawn Circles

Conference Papers

ID Authors Title
10 Sebastià Nicolau Orell
Adrià Molina Rodríguez
Oriol Ramos Terrades
Josep Lladós Canet
Robust Interpretation of Historical Documents in Knowledge Graphs Through Query Inference and Execution
13 Dhruv Kudale
Udhay Brahmi
Ganesh Ramakrishnan
EMBLEM: Enhancing Multi-script Table Detection through Masking
16 Melissa Cote
Alexandra Branzan Albu
An Exploratory Study of Text-to-Image Generation for Query-by-Example Retrieval of Historical Document Images
17 Paula Font Solà
Adrià Molina Rodríguez
Josep Lladós Canet
Conversational Retrieval and On-the-Fly Knowledge Modeling from Historical Documents
21 Francesc Net Barnes
Adrià Molina Rodríguez
Sofia Llacer-Caro
Lluis Gomez Bigorda
Revisiting how we access to historical archives: Auditing Gender Stereotypes and the Division of Labour in the Analysis of Historical Photography Collections
23 Kumari Priya
Bibek Das
Chandranath Adak
Soumi Chattopadhyay
Master Forgers, Fragile Detectors? A Forensic Study of Vision-Language Models for Signature Verification
27 Ankit Sinha
Atanu Saha
Chiranjoy Chattopadhyay
Rahul Kumar Ray
Multi-Modal OMR for Heterogeneous Notations: A Collaborative Framework for Real-Time Symbolic-to-Immersive Mapping
28 Jindong Li
Dario Zanca
Vincent Christlein
Tim Hamann
Jens Barth
Peter Kämpf
Björn Eskofier
Enhancing IMU-Based Online Handwriting Recognition via Contrastive Learning with Zero Inference Overhead
30 Michael Zhang
Elise Wang
Charlotte Whatley
Seth Strickland
Dylan Bannon
Democratizing the medieval English legal tradition
31 Man Qin
Tim French
Wei Liu
LMS-Retrieval: Layout-Aware, Modality-Aware, Structure-Aware Document Retrieval
34 Nicolas Angleraud
Antonia Karamolegkou
Benoit Sagot
Thibault Clérice
Structure-Aware Text Recognition for Ancient Greek Critical Editions
38 Tianjiao Cao
Jiahao Lyu
Dongbao Yang
Weimin Mu
Zhou Yu
Towards Breaking the Visual Perception Bottleneck for Geometry Problem Solving
41 Zezhong Guo
Yongjian Zhang
LLMSFL: LLM-Driven Smart Feedback Loop System for Target Document Generation
43 Xuan Li
Mengfei Li
Jingtian Wei
Jialiang Dong
Raymond Wong
Meaning Lies in Structure: Fine-Grained Table-Centric Document Semantic Parsing
55 Yejing XIE
Ze Qian
Yunfan LI
David Rabouin
Harold Mouchère
HME-Leibniz: A Multi-level Mathematical Expression Dataset from Leibniz’s Manuscripts
57 Rina Buoy
Dylan Berkamp Fouepe Dongmo
Vesal Khean
Simone Marinai
Koichi Kise
Towards Non-Latin Character and Layout Personalization for Enhanced Readability
62 Taylor Archibald
Tony Martinez
Improving MLLM Historical Record Extraction with Test-Time Image Augmentation
64 Adrian ISTE
Kazuki Nishizawa
Chisa Tanaka
Andrew Vargo
Anna Scius-Bertrand
Andreas Fischer
Koichi Kise
Prediction of Grade, Gender, and Academic Performance of Children and Teenagers from Handwriting Using the Sigma-Lognormal Model
65 Tim Raven
Tim Hallyburton
Gernot A. Fink
Writer Retrieval at Scale
66 Arnav Sharma
Pratyush Jena
Amal Joseph
Ravi Kiran Sarvadevabhatla
EpiSAM: Character Segmentation in Challenging Stone Inscriptions
72 Ruichang Zhu
Hongxi Wei
Bo Sun
Heng Wang
A MambaVision-Based Cross-Modal Feature Enhancement Network for Scene Text Super-Resolution
73 Marco Pintore
Maura Pintor
Battista Biggio
Dimosthenis Karatzas
Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering
74 Hao Wang
Aouaidjia Kamel
Konstantinos Kotropoulos
Chongsheng Zhang
Parameter-Efficient and Adaptive Fine-Tuning for Long-Tailed Ancient Characters Recognition
81 Tomás Osório
Henrique Lopes Cardoso
RoWeR: RoBERTa Word error Rate estimator for OCRed texts
82 Matthieu PELINGRE
Salvatore TABBONE
Evaluating Vision-Language Models on Historical Postcards
83 Tathagata Ghosh
Sai Madhusudan Gunda
Simran Singh Sandral
Ravi Kiran Sarvadevabhatla
UniLipi: A Unified Multi-Script OCR for Historical Indic Manuscripts
84 Anirudh Srinivasan
Pratyush Jena
Arya Topale
Venkat Kesav
Ravi Kiran Sarvadevabhatla
Patram-Bench: A Comprehensive Multi-task, Multi-domain and Multilingual Benchmark for Indian Document Understanding
85 Sayantan Basu
GRACE: Gradient-Regulated Approach for Consistent Explanations
88 Raghuveer R
Anirudh Srinivasan
Venkat Kesav Venna
Shanmukha Sreevatsa Tallapragada
Aryan Jain
Sahithi Kukkala
Ravi Kiran Sarvadevabhatla
REPLICA: An Agentic Framework for Visually Faithful Document Reconstruction
89 Abdullah Ibne Hanif Arean
Niamul Hassan Samin
Md Arifur Rahman
Renu Akter Sweety
Juena Ahmed Noshin
Md Ashikur Rahman
Stroke-Level Connectivity Verification: Grounding Vision-Language Models Against Topology Hallucination in Diagram Understanding
91 Ruiling Li
Danyu Yang
Online Signature Verification Using Augmented Path Signature and T-Mamba
92 Abdurrahman Said Gürbüz
Ahmed Nassar
Christoph Auer
Maksym Lysak
Lucas Morin
Matteo Omenetti
Tim Strohmeyer
Panagiotis Vagenas
Nikolaos Livathinos
Michele Dolfi
Peter Staar
Identify, Locate, Link: End-to-End Key-Value Extraction from Document Images
93 Florent Meyer
Laurent Guichard
Yann Soullard
Denis Coquenet
Guillaume Gravier
Bertrand Coüasnon
n-gram injection into transformers for dynamic language model adaptation in handwritten text recognition
97 Martin Kostelník
Michal Hradiš
Martin Dočekal
CzechTopic: A Benchmark for Zero-Shot Topic Localization in Historical Czech Documents
98 Yuanhui Lin
Hetao Wu
Qingju Jiao
Yongge Liu
Da-Han Wang
Decipherment of Oracle Bone Inscription via Component Deconstruction and Alignment
100 Fabio Quattrini
Carmine Zaccagnino
Costanza Bianchi
Silvia Cascianelli
Rita Cucchiara
A Text Recognition Dataset from Sahidic Coptic Ancient Manuscripts
102 Said Yasin
Torsten Zesch
Ad-hoc Personalization of Offline Handwriting Recognition Using Style Transfer
103 Tobias Lengfeld
Jakob Seitz
Radu Timofte
Evaluating Feedback by Iterative Repair of Multi-Step Solution Documents
113 Koki Maeda
Naoaki Okazaki
JaWildText: A Benchmark for Vision-Language Models on Japanese Scene Text Understanding
114 Baharan Pourahmadi
Panagiotis Leontaridis
Paolo Scattolin
Mads Toudal Frandsen
Blind Image Decomposition for Recovering Overlapping Text Layers on Palimpsests
117 Abantika Bose
Thomas Gorges
Lukas Hüttner
Linda-Sophie Schneider
Mathias Seuret
Fei Wu
Vincent Christlein
An Analysis of Lightweight Models for Document Image Machine Translation
120 John Pavlopoulos
Spyros Barbakos
Lavinia Ferretti
Dionysis Voulgarakis
Asimina Paparrigopoulou
Maria Konstantinidou
Giuseppe De Gregorio
Isabelle Marthot-Santaniello
Paraskevi Platanou
Holger Essler
Learning Diachronic Representations of Ancient Greek Letterforms
122 Heng Wang
Yiming Wang
Hongxi Wei
Preserving High-Fidelity Character Structure in Handwritten Text Generation via Multimodal Guidance
123 Heng Wang
Yiming Wang
Hongxi Wei
P-HTG: One-Shot Handwritten Text Generation via Prototype-Guided Adaptive Gated Fusion
125 Shuai Li
Xiao-Hui Li
Haijie Yuan
Fei Yin
Lin-Lin Huang
GraphVLM: Combining VLMs with GraphMLLM for Document Understanding
126 Ziming Li
Jie Zhang
Xingxiang Zhou
Minzhi Zhang
Zhi Chen
Guanglai Gao
Xiangdong Su
GSMP: Geometry-Structured Masked Pretraining with Multi-Granularity Masking and Curriculum Learning for Geometric Problem Solving
129 Ayman Hanafy
Farhan Khawar
Adaptive Hybrid Machine Translation for E-commerce: A Reinforcement Learning Approach to Arabic Localization
136 Minzhi Zhang
Xingxiang Zhou
Ziming Li
Jie Zhang
Zhi Chen
Xiangdong Su
G2I: A Progressive Structure-to-Detail Curriculum Training Strategy for Handwritten Mathematical Expression Recognition
137 Tom Simon
Pierrick Tranouez
Stephane Nicolas
Clement Chatelain
Thierry Paquet
Few-Shot Writer Adaptation via Multimodal In-Context Learning
138 Merveilles AGBETI-MESSAN
Thierry Paquet
Pierrick Tranouez
Clement Chatelain
Stephane Nicolas
A Benchmark of State-Space Models vs. Transformers and BiLSTM-based Models for Historical Newspaper OCR
139 Xunhui Qin
Desheng Wang
Kunpeng Gui
Fang Shi
Zhonghao Shen
Du Zhou
Ke Liu
Peirong Zhang
Yang Xue
Lianwen Jin
AOSSig4000: A Real-World Chinese Handwritten Signature Dataset with Diverse Background Noise and Pixel-Level Annotations
141 Mélodie Boillet
Solène Tarride
Christopher Kermorvant
METATR: A Multilingual, Evolving Benchmark for Automatic Text Recognition
142 Shinnosuke Matsuo
Active Reference Acquisition in Few-shot Font Generation
144 Yataro Tamura
Brian Kenji Iwana
Jiseok Lee
Adversarial Attacks on Online Handwriting using Salience-based Temporal Editing
145 Gabriel Frossard
Franck Gechter
Specialized HTR vs Vision-Language Models: Evaluating DANIEL and Fine-Tuned Qwen on Historical Documents
146 Haotian Chen
Hetao Wu
Qingju Jiao
Yongge Liu
Da-Han Wang
Evolution-Guided Diffusion for Oracle Bone Script Decipherment
147 Yugo Kubota
Kaito Shiku
Seiichi Uchida
Hierarchical Co-Embedding of Font Shapes and Impression Tags
151 Aram Karimi
Jonathan Westine
Gunnar Almevik
Multi-Modal Deep Learning for Medieval Inscription Recognition: A Study of Saint Sophia Cathedral Graffiti
153 Uddipan Basu Bir
Vincent Christlein
Andreas Maier
Mathias Zinnen
From Pixels to Structure: Lightweight Vision-Language Models for Document OCR and Structured JSON Extraction
154 Alexander Epple
Poonam Poonam
Timo Ropinski
Bar-JEPA: Extracting Values from Bar Chart with Joint-Embedding Predictive Architecture
155 Sudev Padhi
Archana Tiwari
Umesh Kashyap
Sk. Subidh Ali
Doc-Protector: A Self-Healing Approach for Digital Documents
156 Qing Lin
Xiaohui Li
Heng Zhang
Fei Yin
Chenglin Liu
SCALES: Scalable Context-Aware Learning with Expert Specialization for Incremental Multilingual Text Recognition
157 Wissam AlKendi
Franck Gechter
Laurent Heyberger
Christophe Guyeux
Automatic Layout Detection in Historical Civil Records Using Deep Object Detection
158 Radoslav Koynov
Triet Ho Anh Doan
Philipp Wieder
Vision Language Models as OCR Correctors for Historical Texts
159 Yangyang Liu
Heng Zhang
Fei Yin
Cheng-Lin Liu
Character Template Representation for Confidence Learning in Handwritten Text Recognition
161 Jiří Mayer
Martina Dvořáková
Vojtěch Dvořák
Markéta Herzánová Vlková
Filip Jebavý
Pavel Pecina
Samuel Šomorjai
Petr Žabička
Jan Hajič, jr.
Optical Music Recognition for Real-World Manuscripts with Synthetic Data
167 Xiaoge Chen
Shilin Li
Leilei Yao
Anna Zhu
Arbitrary Glyph and Multi-Resolution Font Generation with Mixed Content Representations
169 Yingxin Guan
Jian Xing
Zhaohua Zheng
Zhaofu Zeng
Bai Lei
Fanchen Meng
Haitao Guo
HKGC: A Hierarchical Knowledge Graph Construction Framework for Structure-Aware RAG
170 Chen-Yu Xie
Xiao-Hui Li
Fei Yin
Cheng-Lin Liu
DeChart: A Benchmark and Text-Enhanced Chart-to-Table Conversion Method with Multimodal LLMs
173 Eliott THOMAS
Mickael COUSTATY
Aurélie Joseph
Gaspar DELOIN
Vincent Poulain d’Andecy
Jean-Marc OGIER
Active Learning for Cascaded Object Detection: Balancing Coverage and Uncertainty in Table Extraction Pipelines
174 Chen-Yu Xie
Xiao-Hui Li
Boran Wang
Fei Yin
Cheng-Lin Liu
CPAgent: A Tool-Augmented Agentic Framework for Chart Parsing
175 Erik Lenas
Viktoria Lofgren
Olof Karsvall
Quality Prediction for Large Scale HTR – Confidence Is All You Need
183 Sharva Gogawale
Iddo Hakim
Gal Grudka
Mohammad Suliman
Omer Ventura
Daria Shapira
Berat Barakat
Nachum Dershowitz
Complex Layout Classification in the Wild: A Low-Resource Approach with Layout-Preserving Augmentations
192 Eliott THOMAS
Tri-Cong PHAM
Mickael COUSTATY
Aurélie JOSEPH
Gaspar DELOIN
Vincent Poulain d’Andecy
Jean-Marc OGIER
Antoine DOUCET
ConRTF: Edge-Constrained Boundary Distribution Refinement for Realtime TransFormer Table Structure Recognition
197 Anmol Gulati
Sahil Sen
Waqar Sarguroh
Kevin Paul
Beyond Rows to Reasoning: Agentic Retrieval for Multimodal Spreadsheet Understanding
200 Marco Peer
Anna Scius-Bertrand
Patricia Scheurer
Andreas Fischer
BullingerDB: A Dataset for Handwritten Text Recognition and Writer Retrieval
203 Maxim Novopoltsev
Ruslan Murtazin
Andrey Sakhovskiy
Emilia Bojarskaja
Vladimir Kokh
Ivan Ulitin
Botirjon Abdullayev
Khamidulla Aminov
Masudkhon Ismoilov
Semen Budennyy
A Millennium of Arabic Manuscripts in Three Styles: A Line-Level OCR Benchmark for Naskh, Taliq, and Nastaliq
204 Jie Zhang
Xiangren Wang
Ziming Li
Minzhi Zhang
Xingxiang Zhou
Zhi Chen
Guanglai Gao
Xiangdong Su
EAGLE: Explicit Anchoring and Graph Reasoning with Diagram Structure Priors for Multimodal Geometry Problem Solving
206 Ryo Ishiyama
Takaya Kawakatsu
Ambiguity-Controlled Handwritten Mathematical Expression Generation via Harmonized Dual-Conditional Guidance
208 Cuong Nguyen
Khoa Nguyen Tran
Ngoc Tuan Nguyen
Hung Tuan Nguyen
Nam Tuan Ly
Masaki Nakagawa
Automated Character-Level Annotation for Historical Nom Documents via an Iterative Self-Updating Radical-Aware Recognizer
209 Yu Tang
Hongwei Li
Yixuan Cao
Ping Luo
Beyond the Page Break: An LLM-based Solution for Cross-Page Table Reconstruction
211 Bingke Li
Jinghan Li
Jinhao Chen
Wu Zhuang
Yuxiang Zhang
TKPE: Topic-based Evaluation for Keyphrase Prediction
213 Wei Wei
Xinrui Liu
Jianxin Zhang
Xiaodong Duan
MaPE-Former: A Mask-Aware Position Encoding Network for Chinese Character Image Restoration
216 Thanh-Nghia Truong
Hung Tuan Nguyen
Nam Tuan Ly
Yoichi Tsuchida
Hiroshi Miyazawa
Tomo Asakura
Masamitsu Ito
Toshihiko Horie
Fumiko Yasuno
Masaki Nakagawa
Hierarchical Stroke-Level Clustering and Step-Level Segmentation for Automatic Scoring of Geometric Construction Answers with an Electronic Drawing Compass
225 Marry Kong
Rina Buoy
Sovisal Chenda
Nguonly Taing
Masakazu Iwamura
Koichi Kise
Towards Universal Khmer Text Recognition
227 Marry Kong
Rina Buoy
Sovisal Chenda
Nguonly Taing
Masakazu Iwamura
Koichi Kise
Towards Khmer Scene Document Layout Detection
228 Jan Philipp Bullenkamp
Florian Linsel
Lisa Wilhelmi
Hubert Mara
Synthetic Training Data Generation for 3D Cuneiform Sign Recognition
230 Stephan Unter
Elena Hertel
DDD – A Diagnostic Dataset for Character Recognition and Detection on Ancient Egyptian Hieratic Characters and Words
231 Shree Mitra
Ajoy Mondal
C. V. Jawahar
Can VLMs Understand Handwritten Mathematical Documents?
232 Min Song
Kenny Davila
Synthetic Data from Simulated Lecture Environments for Handwritten Content Extraction
234 Sanket Deshmukh
Apurva Gala
David Blom
Detlef Hohl
Towards Scalable Knowledge Graph Extraction from Piping and Instrumentation Diagrams
239 Koki Fujita
Hideaki Yajima
Chee Siang Leow
Hiromitsu Nishizaki
Reference-Free Handwritten Japanese Character Generation via CLIP-Conditioned Diffusion Models
242 Laziz Hamdi
Amine Tamasna
Pascal Boisson
Thierry Paquet
FastTab: A Fast Table Recognizer with a Tiny Recursive Module and 1D Transformers
247 Mengyuan Zhao
Kun Xu
Xin Cheng
Ting Li
Qiuman Tan
Xinyao Zhang
DocCenter: Center and Corner Aware Representation for Robust Multi-Document Localization
250 Salman K H
Chakravarthy Bhagvati
BinDiffuser: Learning Binary Style Priors to Guide Diffusion Models for Palm-Leaf Document Binarization
251 Stanislas Bagnol
Killian Barrere
Veronique Eglin
Elöd Egyed-Zsigmond
David Pitaval
Jean-Marie Côme
GeoLogVQA: A Borehole Log Documents Dataset for Explicit and Implicit Spatial Reasoning
252 Tobias Steiner
Merlin Streilein
Andreas Fischer
Kaspar Riesen
Benchmarking Information Retrieval for Large Archives of Historical Documents
256 Zeynep Sonat Baltaci
Raphael Baena
Fei Meng
Somkeo Norindr
Florence Somer
Matthieu Husson
Mathieu Aubry
Text region detection in historical astronomical diagrams
258 Nick Jochum
Tobias Alt-Veit
Christian Schön
Alexander Lück
René Schuster
Didier Stricker
Bounding Box Label Propagation for Re-Annotation of Document Layout Analysis Datasets
259 Takaya Kawakatsu
Revisiting Structural Dependency in Autoregressive Multi-Task Table Recognition via Order-Independent Cell-Level Representations
260 Ayush Lodh
Souparni Mazumder
Sanket Biswas
Josep Lladós
Nisha Singh
From Chunks to Graphs: Training-Free Multimodal Late Interaction for Document Understanding
261 Malamatenia Vlachou Efstathiou
Raphaël Baena
Dominique Stutzmann
Mathieu Aubry
Leveraging Morphology for Historical Script Metrological Analysis
270 Yngve Mardal Moe
Marie Roald
Stringalign: Moving beyond summary statistics with a transparent Unicode-aware tool for evaluating automatic transcription models
280 Merlin Streilein
Tobias Steiner
Andreas Fischer
Kaspar Riesen
Token Selection Strategies for Automatic Summarization of Historical Documents
282 Jihad Al Akl
Chady Abou Jaoude
Zahi Al Chami
Marianne Abi Kanaan
Abdallah Makhoul
HIDRA: Hierarchical Ink-aware Dual-granularity Retrieval Architecture for Historical Fragments
286 Debayan Das Gupta
Shivakumara Palaiahnakote
Palash Ghosh
Umapada Pal
Cheng-Lin Liu
Diffusion-Based Multi-View Reasoning for Scene Text Detection
289 Axel De Nardin
Silvia Zottin
Claudio Piciarelli
Gian Luca Foresti
GRaF-Net: a Multi-Branch Gated Residual Architecture for Floor Plan Semantic Segmentation
298 Stephan Unter
Chang Liu
Elisa Barney Smith
Generalized Open-set Single-shot Character Recognition on Ancient Egyptian Hieratic Characters
300 Diego Belzarena
Seginus Mowlavi
Paula Casariego Castiñeira
Alejandra Ulla Lorenzo
Gregory Randall
Jean-Michel Morel
Theatre Chapbooks At Scale: A Statistical Comparative Analysis of Typography
301 Nam Nguyen
Emanuela Boros
Adam Jatowt
Ahmed Hamdi
Mickael Coustaty
Antoine Doucet
One Model, Many Guidelines: Instruction Fine-Tuning for Historical Named Entity Recognition
303 Dipendra Sharma Kafle
Esma Talhi
Mickael Coustaty
Antoine Doucet
RAGXDoc: Structured Knowledge-guided Retrieval and Explainable Re-ranking for Academic Documents
304 Glen Pouliquen
Joseph Chazalon
Guillaume Chiron
Oriol Ramos Terrades
Thierry Geraud
Ahmad Montaser Awal
Temporal Modeling of Optically Variable Devices in Identity Documents
306 François Wieckowiak
Véronique Eglin
Tony Bonnet
Stéphane Bres
Laëtitia Rousseau
PatentME: A Dataset and Reference-Free Post-OCR Verification Task for Printed Mathematical Expression Recognition
309 Achyuth P
Kahaan Shah
Chetan Arora
What Can Languages of the Global South Teach Each Other?
311 Robin Armingaud
Romaric Besançon
GLiDRE: Generalist Lightweight Model for Document-level Relation Extraction
312 Silvia Zottin
Axel De Nardin
Valentina Mignosa
Maddalena Zunino
Gian Luca Foresti
Bridging the Gaps: Learning to Estimate Missing Text in Fragmentary Greek Inscriptions
313 Ari Vesalainen
Eetu Mäkelä
Laura Ruotsalainen
Mikko Tolonen
Error Patterns in Historical OCR: A Comparative Analysis of TrOCR and a Vision–Language Model
314 Amritansh Maurya
Navjot Singh
Mohammed Javed
Omar Moured
Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting
316 Arthur Matei
Tim Hallyburton
Lukas Hennies
Christoph Rass
Gernot A. Fink
Recent Advances in Information Extraction from Historical Archival Records
318 Tayyab Raza
Syed Muhammad Taha Imam
Adrian Ulges
Ulrich Schwanecke
Momina Moetesum
Faisal Shafait
LiteDoc: Distilling Large Document Models into Efficient Task-Specific Encoders
320 Nimol Thuon
Jun Du
Ranysakol Thuon
Panhapin Theang
Angkorian-KSI: A Multi-Task Benchmark for Khmer Stone Inscription Analysis
321 Kylian Ronfleux Corail
Nicolas Sidere
Guillaume Bernard
Mickael Coustaty
Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables
327 Ibtissem HAJ ALI
Harold Mouchère
Spatially-Grounded Gaussian-Prior Attention for Handwritten Mathematical Expression Recognition
330 Saima Kausar
Ayesha Amjad
Ahmad Sarmad Ali
Momina Moetesum
Adnan ul Hasan
Faisal Shafait
DiffusionRec: Recognition-Guided Diffusion for Content-Aware Urdu Handwriting Generation
339 Swagata Mukherjee
Samar Kumar Srivastava
Sriparna Saha
TimeAgent: From Matches to Memories — Timeline Summarization for Sports Analytics
341 Daichi Haraguchi
Structural Analysis of Character Identity at OCR Decision Boundaries in Visually Similar Pairs
342 Yiming Xu
Eric López
Artemis Llabrés
Maximiliano Hormazábal
Ernest Valveny
Dimosthenis Karatzas
AdaNav: Query-Adaptive Multi-Granularity Navigation for Long Document Understanding
347 Jakob Seitz
Tobias Lengfeld
Radu Timofte
InkTree: A Unified Representation of Structured Online Ink
349 Bernhard Ortbauer
Tobias Doppler
Pauline Schmidt
Lukas Schilcher
Wolfgang Göderle
Malte Rehbein
Alexander Werth
Roman Kern
ADV-FORMS: A Dataset of Form-Based Historical Documents With Benchmarks for Layout Analysis, HTR and OCR
351 Fahad Alotaibi
Daulet Toibazar
Renad Almusaad
Ranya Alkahtani
Haneen Alhomoud
Asma Ibrahim
Yazeed Alharbi
Murtadha Aljubran
Pedro Moreno
Doc2Doc: Structure-Aware Generative Rendering for Bi-Directional Document Translation
355 Ali Hussain
Rafay Ahmad
Momina Moetesum
Adnan Ul-Hasan
Faisal Shafait
Online Urdu Text-Line Recognition by Bridging Stroke Dynamics and Offline Representations
356 Karen Lee
Dhanashree Balaram
Seojun Shon
Umair Rasheed
MIDAS: Multi-LLM Iterative Data-Adaptive Summarization
370 Keito Sasagawa
Shuhei Kurita
Daisuke Kawahara
Synth-JDoc: Synthesizing a Japanese Document Image Dataset for OCR with Diverse Layouts and Embedded Images
379 Hiroki Nagamatsu
Shoji Toyota
Seiichi Uchida
Handwriting Trajectory Recovery with Diffusion Models
380 Nour Atamni
Boraq Madi
Islam Amar
Raid Saabni
Jihad El-Sana
Beyond Labels: Visual Invariance in Self-Supervised Learning for Aramaic Incantation Bowls
384 Hira Masood
Momina Moetesum
Muhammad Imran Malik
Faisal Shafait
Hassan Aqeel Khan
Agentic Document Reasoning for Evidence-Grounded Clinical Report Generation
385 Rajat Verma
Vriti Sharma
Manikandan Ravikiran
Rohit Saluja
MultiFOLD: A Multimodal Framework to correct OCR Lapses in cluttered Documents
387 Jawad Ibn Ahad
Mritunjoy Chakraborty
Fuad Rahman
Sifat Momen
Shafin Rahman
Nabeel Mohammed
Figures as Evidence: Multi-Image Scientific Generation
391 Yifan Huang
Liangrui Peng
Tianqi Zhao
Di Wu
Kemeng Zhao
Shuo Li
Zhiyu Li
Yuyang Li
Vision-Language Model based Transfer Learning for Historical Document Recognition
393 Nam Tuan Ly
Atsuhiro Takasu
Masaki Nakagawa
Multilingual Table Recognition: A Benchmark Dataset and A Local–Global Hybrid Model