We are very happy to announce this year’s main conference papers, including our IJDAR track and competition papers!
Journal Track Papers
| ID | Authors | Title |
|---|---|---|
| 2025-565 | Laziz Hamdi Amine Tamasna Pascal Boisson Thierry Paquet | TableSeq: Unified Generation of Structure, Content, and Layout |
| 2025-561 | Laziz Hamdi Amine Tamasna Pascal Boisson Thierry Paquet | PILOT: A Promptable Interleaved Layout-aware OCR Transformer |
| 2025-560 | Anik De Abhirama Subramanyam Penamakuru Rajeev Yadav Aditya Rathore Harshiv Shah Devesh Sharma Sagar Agarwal Pravin Kumar Anand Mishra | Bharat Scene Text: A Novel Comprehensive Dataset and Benchmark for Indian Language Scene Text Understanding |
| 2025-548 | Tunga Tessema Chamisso Blessed Guda Bereket Retta Adego Carmel Prosper Sagbo Gabrial Zencha Ashungafac Assane Gueye | Fidel: A Large-Scale Sentence Level Amharic OCR Dataset |
| 2025-539 | Lukas Arzoumanidis Julius Knechtel Jan-Henrik Haunert Youness Dehbi | Automatic Uncertainty-Aware Synthetic Data Bootstrapping for Historical Map Segmentation |
| 2025-524 | Ségolène Albouy Somkeo Norindr Paul Kervegan Fouad Aouniti Rémy Delanaux Clara Grometto Robin Champenois Stavros Lazaris Alexandre Guilbaud Matthieu Husson Mathieu Aubry | AIKON: A Modular Computer Vision Platform for Historical Corpora |
| 2025-520 | Gianluca Dalmasso Patric Reineri Mathieu Pscherer Noel Ninon Achard Beatrice Caseau Laurence Likforman Sulem Davide Cavagnino Maurizio Lucenteforte Attilio Fiandrotti Victoria Eyharabide | Reviving Medieval Byzantine Seals: A Synthetic-to-Real Approach to Character Recognition |
| 2025-504 | Rui-Yang Ju Kohei Yamashita Hirotaka Kameko Shinsuke Mori | DKDS: A Benchmark Dataset of Degraded Kuzushiji Documents with Seals for Detection and Binarization |
| 2025-482 | Florent Imbert Simon Corbillé Hui Han Elisa H. Barney Smith | A Novel Domain Adaptation Based Pipeline for Character Classification and Handwritten Recognition |
| 2025-479 | Phillipe R. Sampaio Helene Maxicci | Unsupervised Document and Template Clustering using Multimodal Embeddings |
| 2025-468 | Anna Zhu Wei Pan Guan Li Hongyi Cai Kenji Brian | HQ-Font: Few-shot Font Generation via Transferring Hierarchical Quantization Styles |
| 2025-403 | Enrique Vidal Alejandro H. Toselli | Predicting Text Recognition Word Error Rate of Image Documents Without Ground Truth Transcripts |
Competition Papers
| ID | Authors | Title |
|---|---|---|
| C1 | Fahad Ahmed Jennifer D’Souza Sören Auer | ICDAR 2026 Competition on Information Extraction from Atomic Layer Deposition/Etching (ALD/E) Scientific Figures |
| C2 | Artemis Llabrés Marc Serra Ortega Tomás Ockier Samuel Ortega Cuadra Amritpal Singh Christos Georgakilas Andrey Barsky Ernest Valveny Dimosthenis Karatzas | ICDAR 2026 Competition on Multimodal Reasoning over Documents in Multiple Domains |
| C3 | Dominique Stutzmann Riham Aida Mokrani Franco Tomasi Elena Pierazzo | ICDAR 2026 FalsID Competition on Falsification and Imitation Detection |
| C4 | Adrià Molina Rodríguez Carles Boned Riera Pau Torras Oriol Ramos Terrades Josep Lladós | ICDAR 2026 Competition on Long-Term Handwriting Author Identification |
| C5 | Benjamin Kiessling Agnès Boutreux Bram Cars Matthias Gille Levenson Mike Kestemont Anna Michalcová Ariane Pinche Caroline Vandyck Malamatenia Vlachou Efstathiou Thibault Clérice | ICDAR 2026 Competition on Multilingual Medieval Handwriting Recognition |
| C6 | Maud Ehrmann Emanuela Boros Juri Opitz Andrianos Michail Florian Wagner Simon Clematide | ICDAR 2026 HIPE-OCRepair Competition on LLM-Assisted OCR Post-Correction for Historical Documents |
| C7 | Nicholas R. Howe Aaron Hershkowitz | ICDAR 2026 Competition in Text Recognition on Greek Squeezes |
| C8 | Thomas Gorges Janne van der Loop Lukas Hüttner Linda-Sophie Schneider Fei Wu Mathias Seuret Vincent Christlein | ICDAR 2026 Competition on Writer Identification and Pen Classification from Hand-Drawn Circles |
Conference Papers
| ID | Authors | Title |
|---|---|---|
| 10 | Sebastià Nicolau Orell Adrià Molina Rodríguez Oriol Ramos Terrades Josep Lladós Canet | Robust Interpretation of Historical Documents in Knowledge Graphs Through Query Inference and Execution |
| 13 | Dhruv Kudale Udhay Brahmi Ganesh Ramakrishnan | EMBLEM: Enhancing Multi-script Table Detection through Masking |
| 16 | Melissa Cote Alexandra Branzan Albu | An Exploratory Study of Text-to-Image Generation for Query-by-Example Retrieval of Historical Document Images |
| 17 | Paula Font Solà Adrià Molina Rodríguez Josep Lladós Canet | Conversational Retrieval and On-the-Fly Knowledge Modeling from Historical Documents |
| 21 | Francesc Net Barnes Adrià Molina Rodríguez Sofia Llacer-Caro Lluis Gomez Bigorda | Revisiting how we access to historical archives: Auditing Gender Stereotypes and the Division of Labour in the Analysis of Historical Photography Collections |
| 23 | Kumari Priya Bibek Das Chandranath Adak Soumi Chattopadhyay | Master Forgers, Fragile Detectors? A Forensic Study of Vision-Language Models for Signature Verification |
| 27 | Ankit Sinha Atanu Saha Chiranjoy Chattopadhyay Rahul Kumar Ray | Multi-Modal OMR for Heterogeneous Notations: A Collaborative Framework for Real-Time Symbolic-to-Immersive Mapping |
| 28 | Jindong Li Dario Zanca Vincent Christlein Tim Hamann Jens Barth Peter Kämpf Björn Eskofier | Enhancing IMU-Based Online Handwriting Recognition via Contrastive Learning with Zero Inference Overhead |
| 30 | Michael Zhang Elise Wang Charlotte Whatley Seth Strickland Dylan Bannon | Democratizing the medieval English legal tradition |
| 31 | Man Qin Tim French Wei Liu | LMS-Retrieval: Layout-Aware, Modality-Aware, Structure-Aware Document Retrieval |
| 34 | Nicolas Angleraud Antonia Karamolegkou Benoit Sagot Thibault Clérice | Structure-Aware Text Recognition for Ancient Greek Critical Editions |
| 38 | Tianjiao Cao Jiahao Lyu Dongbao Yang Weimin Mu Zhou Yu | Towards Breaking the Visual Perception Bottleneck for Geometry Problem Solving |
| 41 | Zezhong Guo Yongjian Zhang | LLMSFL: LLM-Driven Smart Feedback Loop System for Target Document Generation |
| 43 | Xuan Li Mengfei Li Jingtian Wei Jialiang Dong Raymond Wong | Meaning Lies in Structure: Fine-Grained Table-Centric Document Semantic Parsing |
| 55 | Yejing XIE Ze Qian Yunfan LI David Rabouin Harold Mouchère | HME-Leibniz: A Multi-level Mathematical Expression Dataset from Leibniz’s Manuscripts |
| 57 | Rina Buoy Dylan Berkamp Fouepe Dongmo Vesal Khean Simone Marinai Koichi Kise | Towards Non-Latin Character and Layout Personalization for Enhanced Readability |
| 62 | Taylor Archibald Tony Martinez | Improving MLLM Historical Record Extraction with Test-Time Image Augmentation |
| 64 | Adrian ISTE Kazuki Nishizawa Chisa Tanaka Andrew Vargo Anna Scius-Bertrand Andreas Fischer Koichi Kise | Prediction of Grade, Gender, and Academic Performance of Children and Teenagers from Handwriting Using the Sigma-Lognormal Model |
| 65 | Tim Raven Tim Hallyburton Gernot A. Fink | Writer Retrieval at Scale |
| 66 | Arnav Sharma Pratyush Jena Amal Joseph Ravi Kiran Sarvadevabhatla | EpiSAM: Character Segmentation in Challenging Stone Inscriptions |
| 72 | Ruichang Zhu Hongxi Wei Bo Sun Heng Wang | A MambaVision-Based Cross-Modal Feature Enhancement Network for Scene Text Super-Resolution |
| 73 | Marco Pintore Maura Pintor Battista Biggio Dimosthenis Karatzas | Counterfeit Answers: Adversarial Forgery against OCR-Free Document Visual Question Answering |
| 74 | Hao Wang Aouaidjia Kamel Konstantinos Kotropoulos Chongsheng Zhang | Parameter-Efficient and Adaptive Fine-Tuning for Long-Tailed Ancient Characters Recognition |
| 81 | Tomás Osório Henrique Lopes Cardoso | RoWeR: RoBERTa Word error Rate estimator for OCRed texts |
| 82 | Matthieu PELINGRE Salvatore TABBONE | Evaluating Vision-Language Models on Historical Postcards |
| 83 | Tathagata Ghosh Sai Madhusudan Gunda Simran Singh Sandral Ravi Kiran Sarvadevabhatla | UniLipi: A Unified Multi-Script OCR for Historical Indic Manuscripts |
| 84 | Anirudh Srinivasan Pratyush Jena Arya Topale Venkat Kesav Ravi Kiran Sarvadevabhatla | Patram-Bench: A Comprehensive Multi-task, Multi-domain and Multilingual Benchmark for Indian Document Understanding |
| 85 | Sayantan Basu | GRACE: Gradient-Regulated Approach for Consistent Explanations |
| 88 | Raghuveer R Anirudh Srinivasan Venkat Kesav Venna Shanmukha Sreevatsa Tallapragada Aryan Jain Sahithi Kukkala Ravi Kiran Sarvadevabhatla | REPLICA: An Agentic Framework for Visually Faithful Document Reconstruction |
| 89 | Abdullah Ibne Hanif Arean Niamul Hassan Samin Md Arifur Rahman Renu Akter Sweety Juena Ahmed Noshin Md Ashikur Rahman | Stroke-Level Connectivity Verification: Grounding Vision-Language Models Against Topology Hallucination in Diagram Understanding |
| 91 | Ruiling Li Danyu Yang | Online Signature Verification Using Augmented Path Signature and T-Mamba |
| 92 | Abdurrahman Said Gürbüz Ahmed Nassar Christoph Auer Maksym Lysak Lucas Morin Matteo Omenetti Tim Strohmeyer Panagiotis Vagenas Nikolaos Livathinos Michele Dolfi Peter Staar | Identify, Locate, Link: End-to-End Key-Value Extraction from Document Images |
| 93 | Florent Meyer Laurent Guichard Yann Soullard Denis Coquenet Guillaume Gravier Bertrand Coüasnon | n-gram injection into transformers for dynamic language model adaptation in handwritten text recognition |
| 97 | Martin Kostelník Michal Hradiš Martin Dočekal | CzechTopic: A Benchmark for Zero-Shot Topic Localization in Historical Czech Documents |
| 98 | Yuanhui Lin Hetao Wu Qingju Jiao Yongge Liu Da-Han Wang | Decipherment of Oracle Bone Inscription via Component Deconstruction and Alignment |
| 100 | Fabio Quattrini Carmine Zaccagnino Costanza Bianchi Silvia Cascianelli Rita Cucchiara | A Text Recognition Dataset from Sahidic Coptic Ancient Manuscripts |
| 102 | Said Yasin Torsten Zesch | Ad-hoc Personalization of Offline Handwriting Recognition Using Style Transfer |
| 103 | Tobias Lengfeld Jakob Seitz Radu Timofte | Evaluating Feedback by Iterative Repair of Multi-Step Solution Documents |
| 113 | Koki Maeda Naoaki Okazaki | JaWildText: A Benchmark for Vision-Language Models on Japanese Scene Text Understanding |
| 114 | Baharan Pourahmadi Panagiotis Leontaridis Paolo Scattolin Mads Toudal Frandsen | Blind Image Decomposition for Recovering Overlapping Text Layers on Palimpsests |
| 117 | Abantika Bose Thomas Gorges Lukas Hüttner Linda-Sophie Schneider Mathias Seuret Fei Wu Vincent Christlein | An Analysis of Lightweight Models for Document Image Machine Translation |
| 120 | John Pavlopoulos Spyros Barbakos Lavinia Ferretti Dionysis Voulgarakis Asimina Paparrigopoulou Maria Konstantinidou Giuseppe De Gregorio Isabelle Marthot-Santaniello Paraskevi Platanou Holger Essler | Learning Diachronic Representations of Ancient Greek Letterforms |
| 122 | Heng Wang Yiming Wang Hongxi Wei | Preserving High-Fidelity Character Structure in Handwritten Text Generation via Multimodal Guidance |
| 123 | Heng Wang Yiming Wang Hongxi Wei | P-HTG: One-Shot Handwritten Text Generation via Prototype-Guided Adaptive Gated Fusion |
| 125 | Shuai Li Xiao-Hui Li Haijie Yuan Fei Yin Lin-Lin Huang | GraphVLM: Combining VLMs with GraphMLLM for Document Understanding |
| 126 | Ziming Li Jie Zhang Xingxiang Zhou Minzhi Zhang Zhi Chen Guanglai Gao Xiangdong Su | GSMP: Geometry-Structured Masked Pretraining with Multi-Granularity Masking and Curriculum Learning for Geometric Problem Solving |
| 129 | Ayman Hanafy Farhan Khawar | Adaptive Hybrid Machine Translation for E-commerce: A Reinforcement Learning Approach to Arabic Localization |
| 136 | Minzhi Zhang Xingxiang Zhou Ziming Li Jie Zhang Zhi Chen Xiangdong Su | G2I: A Progressive Structure-to-Detail Curriculum Training Strategy for Handwritten Mathematical Expression Recognition |
| 137 | Tom Simon Pierrick Tranouez Stephane Nicolas Clement Chatelain Thierry Paquet | Few-Shot Writer Adaptation via Multimodal In-Context Learning |
| 138 | Merveilles AGBETI-MESSAN Thierry Paquet Pierrick Tranouez Clement Chatelain Stephane Nicolas | A Benchmark of State-Space Models vs. Transformers and BiLSTM-based Models for Historical Newspaper OCR |
| 139 | Xunhui Qin Desheng Wang Kunpeng Gui Fang Shi Zhonghao Shen Du Zhou Ke Liu Peirong Zhang Yang Xue Lianwen Jin | AOSSig4000: A Real-World Chinese Handwritten Signature Dataset with Diverse Background Noise and Pixel-Level Annotations |
| 141 | Mélodie Boillet Solène Tarride Christopher Kermorvant | METATR: A Multilingual, Evolving Benchmark for Automatic Text Recognition |
| 142 | Shinnosuke Matsuo | Active Reference Acquisition in Few-shot Font Generation |
| 144 | Yataro Tamura Brian Kenji Iwana Jiseok Lee | Adversarial Attacks on Online Handwriting using Salience-based Temporal Editing |
| 145 | Gabriel Frossard Franck Gechter | Specialized HTR vs Vision-Language Models: Evaluating DANIEL and Fine-Tuned Qwen on Historical Documents |
| 146 | Haotian Chen Hetao Wu Qingju Jiao Yongge Liu Da-Han Wang | Evolution-Guided Diffusion for Oracle Bone Script Decipherment |
| 147 | Yugo Kubota Kaito Shiku Seiichi Uchida | Hierarchical Co-Embedding of Font Shapes and Impression Tags |
| 151 | Aram Karimi Jonathan Westine Gunnar Almevik | Multi-Modal Deep Learning for Medieval Inscription Recognition: A Study of Saint Sophia Cathedral Graffiti |
| 153 | Uddipan Basu Bir Vincent Christlein Andreas Maier Mathias Zinnen | From Pixels to Structure: Lightweight Vision-Language Models for Document OCR and Structured JSON Extraction |
| 154 | Alexander Epple Poonam Poonam Timo Ropinski | Bar-JEPA: Extracting Values from Bar Chart with Joint-Embedding Predictive Architecture |
| 155 | Sudev Padhi Archana Tiwari Umesh Kashyap Sk. Subidh Ali | Doc-Protector: A Self-Healing Approach for Digital Documents |
| 156 | Qing Lin Xiaohui Li Heng Zhang Fei Yin Chenglin Liu | SCALES: Scalable Context-Aware Learning with Expert Specialization for Incremental Multilingual Text Recognition |
| 157 | Wissam AlKendi Franck Gechter Laurent Heyberger Christophe Guyeux | Automatic Layout Detection in Historical Civil Records Using Deep Object Detection |
| 158 | Radoslav Koynov Triet Ho Anh Doan Philipp Wieder | Vision Language Models as OCR Correctors for Historical Texts |
| 159 | Yangyang Liu Heng Zhang Fei Yin Cheng-Lin Liu | Character Template Representation for Confidence Learning in Handwritten Text Recognition |
| 161 | Jiří Mayer Martina Dvořáková Vojtěch Dvořák Markéta Herzánová Vlková Filip Jebavý Pavel Pecina Samuel Šomorjai Petr Žabička Jan Hajič, jr. | Optical Music Recognition for Real-World Manuscripts with Synthetic Data |
| 167 | Xiaoge Chen Shilin Li Leilei Yao Anna Zhu | Arbitrary Glyph and Multi-Resolution Font Generation with Mixed Content Representations |
| 169 | Yingxin Guan Jian Xing Zhaohua Zheng Zhaofu Zeng Bai Lei Fanchen Meng Haitao Guo | HKGC: A Hierarchical Knowledge Graph Construction Framework for Structure-Aware RAG |
| 170 | Chen-Yu Xie Xiao-Hui Li Fei Yin Cheng-Lin Liu | DeChart: A Benchmark and Text-Enhanced Chart-to-Table Conversion Method with Multimodal LLMs |
| 173 | Eliott THOMAS Mickael COUSTATY Aurélie Joseph Gaspar DELOIN Vincent Poulain d’Andecy Jean-Marc OGIER | Active Learning for Cascaded Object Detection: Balancing Coverage and Uncertainty in Table Extraction Pipelines |
| 174 | Chen-Yu Xie Xiao-Hui Li Boran Wang Fei Yin Cheng-Lin Liu | CPAgent: A Tool-Augmented Agentic Framework for Chart Parsing |
| 175 | Erik Lenas Viktoria Lofgren Olof Karsvall | Quality Prediction for Large Scale HTR – Confidence Is All You Need |
| 183 | Sharva Gogawale Iddo Hakim Gal Grudka Mohammad Suliman Omer Ventura Daria Shapira Berat Barakat Nachum Dershowitz | Complex Layout Classification in the Wild: A Low-Resource Approach with Layout-Preserving Augmentations |
| 192 | Eliott THOMAS Tri-Cong PHAM Mickael COUSTATY Aurélie JOSEPH Gaspar DELOIN Vincent Poulain d’Andecy Jean-Marc OGIER Antoine DOUCET | ConRTF: Edge-Constrained Boundary Distribution Refinement for Realtime TransFormer Table Structure Recognition |
| 197 | Anmol Gulati Sahil Sen Waqar Sarguroh Kevin Paul | Beyond Rows to Reasoning: Agentic Retrieval for Multimodal Spreadsheet Understanding |
| 200 | Marco Peer Anna Scius-Bertrand Patricia Scheurer Andreas Fischer | BullingerDB: A Dataset for Handwritten Text Recognition and Writer Retrieval |
| 203 | Maxim Novopoltsev Ruslan Murtazin Andrey Sakhovskiy Emilia Bojarskaja Vladimir Kokh Ivan Ulitin Botirjon Abdullayev Khamidulla Aminov Masudkhon Ismoilov Semen Budennyy | A Millennium of Arabic Manuscripts in Three Styles: A Line-Level OCR Benchmark for Naskh, Taliq, and Nastaliq |
| 204 | Jie Zhang Xiangren Wang Ziming Li Minzhi Zhang Xingxiang Zhou Zhi Chen Guanglai Gao Xiangdong Su | EAGLE: Explicit Anchoring and Graph Reasoning with Diagram Structure Priors for Multimodal Geometry Problem Solving |
| 206 | Ryo Ishiyama Takaya Kawakatsu | Ambiguity-Controlled Handwritten Mathematical Expression Generation via Harmonized Dual-Conditional Guidance |
| 208 | Cuong Nguyen Khoa Nguyen Tran Ngoc Tuan Nguyen Hung Tuan Nguyen Nam Tuan Ly Masaki Nakagawa | Automated Character-Level Annotation for Historical Nom Documents via an Iterative Self-Updating Radical-Aware Recognizer |
| 209 | Yu Tang Hongwei Li Yixuan Cao Ping Luo | Beyond the Page Break: An LLM-based Solution for Cross-Page Table Reconstruction |
| 211 | Bingke Li Jinghan Li Jinhao Chen Wu Zhuang Yuxiang Zhang | TKPE: Topic-based Evaluation for Keyphrase Prediction |
| 213 | Wei Wei Xinrui Liu Jianxin Zhang Xiaodong Duan | MaPE-Former: A Mask-Aware Position Encoding Network for Chinese Character Image Restoration |
| 216 | Thanh-Nghia Truong Hung Tuan Nguyen Nam Tuan Ly Yoichi Tsuchida Hiroshi Miyazawa Tomo Asakura Masamitsu Ito Toshihiko Horie Fumiko Yasuno Masaki Nakagawa | Hierarchical Stroke-Level Clustering and Step-Level Segmentation for Automatic Scoring of Geometric Construction Answers with an Electronic Drawing Compass |
| 225 | Marry Kong Rina Buoy Sovisal Chenda Nguonly Taing Masakazu Iwamura Koichi Kise | Towards Universal Khmer Text Recognition |
| 227 | Marry Kong Rina Buoy Sovisal Chenda Nguonly Taing Masakazu Iwamura Koichi Kise | Towards Khmer Scene Document Layout Detection |
| 228 | Jan Philipp Bullenkamp Florian Linsel Lisa Wilhelmi Hubert Mara | Synthetic Training Data Generation for 3D Cuneiform Sign Recognition |
| 230 | Stephan Unter Elena Hertel | DDD – A Diagnostic Dataset for Character Recognition and Detection on Ancient Egyptian Hieratic Characters and Words |
| 231 | Shree Mitra Ajoy Mondal C. V. Jawahar | Can VLMs Understand Handwritten Mathematical Documents? |
| 232 | Min Song Kenny Davila | Synthetic Data from Simulated Lecture Environments for Handwritten Content Extraction |
| 234 | Sanket Deshmukh Apurva Gala David Blom Detlef Hohl | Towards Scalable Knowledge Graph Extraction from Piping and Instrumentation Diagrams |
| 239 | Koki Fujita Hideaki Yajima Chee Siang Leow Hiromitsu Nishizaki | Reference-Free Handwritten Japanese Character Generation via CLIP-Conditioned Diffusion Models |
| 242 | Laziz Hamdi Amine Tamasna Pascal Boisson Thierry Paquet | FastTab: A Fast Table Recognizer with a Tiny Recursive Module and 1D Transformers |
| 247 | Mengyuan Zhao Kun Xu Xin Cheng Ting Li Qiuman Tan Xinyao Zhang | DocCenter: Center and Corner Aware Representation for Robust Multi-Document Localization |
| 250 | Salman K H Chakravarthy Bhagvati | BinDiffuser: Learning Binary Style Priors to Guide Diffusion Models for Palm-Leaf Document Binarization |
| 251 | Stanislas Bagnol Killian Barrere Veronique Eglin Elöd Egyed-Zsigmond David Pitaval Jean-Marie Côme | GeoLogVQA: A Borehole Log Documents Dataset for Explicit and Implicit Spatial Reasoning |
| 252 | Tobias Steiner Merlin Streilein Andreas Fischer Kaspar Riesen | Benchmarking Information Retrieval for Large Archives of Historical Documents |
| 256 | Zeynep Sonat Baltaci Raphael Baena Fei Meng Somkeo Norindr Florence Somer Matthieu Husson Mathieu Aubry | Text region detection in historical astronomical diagrams |
| 258 | Nick Jochum Tobias Alt-Veit Christian Schön Alexander Lück René Schuster Didier Stricker | Bounding Box Label Propagation for Re-Annotation of Document Layout Analysis Datasets |
| 259 | Takaya Kawakatsu | Revisiting Structural Dependency in Autoregressive Multi-Task Table Recognition via Order-Independent Cell-Level Representations |
| 260 | Ayush Lodh Souparni Mazumder Sanket Biswas Josep Llados Nisha Singh | From Chunks to Graphs: Training-Free Multimodal Late Interaction for Document Understanding |
| 261 | Malamatenia Vlachou Efstathiou Raphaël Baena Dominique Stutzmann Mathieu Aubry | Leveraging Morphology for Historical Script Metrological Analysis |
| 270 | Yngve Mardal Moe Marie Roald | Stringalign: Moving beyond summary statistics with a transparent Unicode-aware tool for evaluating automatic transcription models |
| 280 | Merlin Streilein Tobias Steiner Andreas Fischer Kaspar Riesen | Token Selection Strategies for Automatic Summarization of Historical Documents |
| 282 | Jihad Al Akl Chady Abou Jaoude Zahi Al Chami Marianne Abi Kanaan Abdallah Makhoul | HIDRA: Hierarchical Ink-aware Dual-granularity Retrieval Architecture for Historical Fragments |
| 286 | Debayan Das Gupta Shivakumara Palaiahnakote Palash Ghosh Umapada Pal Cheng-Lin Liu | Diffusion-Based Multi-View Reasoning for Scene Text Detection |
| 289 | Axel De Nardin Silvia Zottin Claudio Piciarelli Gian Luca Foresti | GRaF-Net: a Multi-Branch Gated Residual Architecture for Floor Plan Semantic Segmentation |
| 298 | Stephan Unter Chang Liu Elisa Barney Smith | Generalized Open-set Single-shot Character Recognition on Ancient Egyptian Hieratic Characters |
| 300 | Diego Belzarena Seginus Mowlavi Paula Casariego Castiñeira Alejandra Ulla Lorenzo Gregory Randall Jean-Michel Morel | Theatre Chapbooks At Scale: A Statistical Comparative Analysis of Typography |
| 301 | Nam Nguyen Emanuela Boros Adam Jatowt Ahmed Hamdi Mickael Coustaty Antoine Doucet | One Model, Many Guidelines: Instruction Fine-Tuning for Historical Named Entity Recognition |
| 303 | Dipendra Sharma Kafle Esma Talhi Mickael Coustaty Antoine Doucet | RAGXDoc: Structured Knowledge-guided Retrieval and Explainable Re-ranking for Academic Documents |
| 304 | Glen Pouliquen Joseph Chazalon Guillaume Chiron Oriol Ramos Terrades Thierry Geraud Ahmad Montaser Awal | Temporal Modeling of Optically Variable Devices in Identity Documents |
| 306 | François Wieckowiak Véronique Eglin Tony Bonnet Stéphane Bres Laëtitia Rousseau | PatentME: A Dataset and Reference-Free Post-OCR Verification Task for Printed Mathematical Expression Recognition |
| 309 | Achyuth P Kahaan Shah Chetan Arora | What Can Languages of the Global South Teach Each Other? |
| 311 | Robin Armingaud Romaric Besançon | GLiDRE: Generalist Lightweight Model for Document-level Relation Extraction |
| 312 | Silvia Zottin Axel De Nardin Valentina Mignosa Maddalena Zunino Gian Luca Foresti | Bridging the Gaps: Learning to Estimate Missing Text in Fragmentary Greek Inscriptions |
| 313 | Ari Vesalainen Eetu Mäkelä Laura Ruotsalainen Mikko Tolonen | Error Patterns in Historical OCR: A Comparative Analysis of TrOCR and a Vision–Language Model |
| 314 | Amritansh Maurya Navjot Singh Mohammed Javed Omar Moured | Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting |
| 316 | Arthur Matei Tim Hallyburton Lukas Hennies Christoph Rass Gernot A. Fink | Recent Advances in Information Extraction from Historical Archival Records |
| 318 | Tayyab Raza Syed Muhammad Taha Imam Adrian Ulges Ulrich Schwanecke Momina Moetesum Faisal Shafait | LiteDoc: Distilling Large Document Models into Efficient Task-Specific Encoders |
| 320 | Nimol Thuon Jun Du Ranysakol Thuon Panhapin Theang | Angkorian-KSI: A Multi-Task Benchmark for Khmer Stone Inscription Analysis |
| 321 | Kylian Ronfleux Corail Nicolas Sidere Guillaume Bernard Mickael Coustaty | Improving Document Forgery Localization Robustness via Diverse JPEG Quantization Tables |
| 327 | Ibtissem HAJ ALI, Harold Mouchère | Spatially-Grounded Gaussian-Prior Attention for Handwritten Mathematical Expression Recognition |
| 330 | Saima Kausar Ayesha Amjad Ahmad Sarmad Ali Momina Moetesum Adnan ul Hasan Faisal Shafait | DiffusionRec: Recognition-Guided Diffusion for Content-Aware Urdu Handwriting Generation |
| 339 | Swagata Mukherjee Samar Kumar Srivastava Sriparna Saha | TimeAgent: From Matches to Memories — Timeline Summarization for Sports Analytics |
| 341 | Daichi Haraguchi | Structural Analysis of Character Identity at OCR Decision Boundaries in Visually Similar Pairs |
| 342 | Yiming Xu Eric López Artemis Llabrés Maximiliano Hormazábal Ernest Valveny Dimosthenis Karatzas | AdaNav: Query-Adaptive Multi-Granularity Navigation for Long Document Understanding |
| 347 | Jakob Seitz Tobias Lengfeld Radu Timofte | InkTree: A Unified Representation of Structured Online Ink |
| 349 | Bernhard Ortbauer Tobias Doppler Pauline Schmidt Lukas Schilcher Wolfgang Göderle Malte Rehbein Alexander Werth Roman Kern | ADV-FORMS: A Dataset of Form-Based Historical Documents With Benchmarks for Layout Analysis, HTR and OCR |
| 351 | Fahad Alotaibi Daulet Toibazar Renad Almusaad Ranya Alkahtani Haneen Alhomoud Asma Ibrahim Yazeed Alharbi Murtadha Aljubran Pedro Moreno | Doc2Doc: Structure-Aware Generative Rendering for Bi-Directional Document Translation |
| 355 | Ali Hussain Rafay Ahmad Momina Moetesum Adnan Ul-Hasan Faisal Shafait | Online Urdu Text-Line Recognition by Bridging Stroke Dynamics and Offline Representations |
| 356 | Karen Lee Dhanashree Balaram Seojun Shon Umair Rasheed | MIDAS: Multi-LLM Iterative Data-Adaptive Summarization |
| 370 | Keito Sasagawa Shuhei Kurita Daisuke Kawahara | Synth-JDoc: Synthesizing a Japanese Document Image Dataset for OCR with Diverse Layouts and Embedded Images |
| 379 | Hiroki Nagamatsu Shoji Toyota Seiichi Uchida | Handwriting Trajectory Recovery with Diffusion Models |
| 380 | Nour Atamni Boraq Madi Islam Amar Raid Saabni Jihad El-Sana | Beyond Labels: Visual Invariance in Self-Supervised Learning for Aramaic Incantation Bowls |
| 384 | Hira Masood Momina Moetesum Muhammad Imran Malik Faisal Shafait Hassan Aqeel Khan | Agentic Document Reasoning for Evidence-Grounded Clinical Report Generation |
| 385 | Rajat Verma Vriti Sharma Manikandan Ravikiran Rohit Saluja | MultiFOLD: A Multimodal Framework to correct OCR Lapses in cluttered Documents |
| 387 | Jawad Ibn Ahad Mritunjoy Chakraborty Fuad Rahman Sifat Momen Shafin Rahman Nabeel Mohammed | Figures as Evidence: Multi-Image Scientific Generation |
| 391 | Yifan Huang Liangrui Peng Tianqi Zhao Di Wu Kemeng Zhao Shuo Li Zhiyu Li Yuyang Li | Vision-Language Model based Transfer Learning for Historical Document Recognition |
| 393 | Nam Tuan Ly Atsuhiro Takasu Masaki Nakagawa | Multilingual Table Recognition: A Benchmark Dataset and A Local–Global Hybrid Model |