AlphaPen
Optical character recognition (OCR) system
The project at a glance
AlphaPen and CATIE worked together to develop an intelligent system capable of automatically transcribing handwritten text.
This system, based on advanced artificial intelligence techniques, had to meet a dual challenge: ensuring high accuracy in handwriting recognition, while remaining lightweight enough to run locally on low-power devices such as smartphones or laptops.
The goal was to make this technology accessible to a broad audience, without requiring server resources or an internet connection.
Achievements
- First OCR model for handwritten French, a notable breakthrough in a field that remains underdeveloped for this language.
- Deployment of a functional prototype of the complete pipeline, combining OCR and automatic text correction.
- Significant improvement in the readability and accuracy of the system's final text output.
- Development of a lightweight, locally deployable version, suitable for offline use on mobile devices or computers.
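The accuracy gain mentioned above can be made measurable. The project summary does not say which metric was used; the sketch below assumes character error rate (CER), a common choice for OCR, computed as the edit distance between the reference text and the transcription divided by the reference length. The function names and the sample strings are illustrative, not taken from the project.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a 16-character reference.
reference = "bonjour le monde"
raw_ocr = "bonjour le mende"
print(character_error_rate(reference, raw_ocr))  # 0.0625
```

Comparing the CER of the raw OCR output with the CER after automatic correction, on the same reference texts, is one way to quantify how much the correction step helps.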
CATIE contribution
- Development of a generative handwritten image model, used to augment training datasets and improve the robustness of the recognition system.
- Creation and specialization of an OCR (Optical Character Recognition) model, specifically trained for the automatic extraction of handwritten text in the French language.
- Design of a linguistic post-processing module capable of correcting detected errors in the extracted text (spelling mistakes, missing punctuation, syntactic inconsistencies, etc.).
- Optimization of the developed models by reducing their size and accelerating their execution, to ensure smooth integration on mobile or embedded devices without compromising transcription quality.
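The summary does not state how the models were made smaller. As an illustration only, the sketch below assumes one widely used technique, symmetric 8-bit weight quantization, and shows why it shrinks storage roughly fourfold compared with 32-bit floats while keeping the reconstruction error bounded. All names and values here are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]


def size_in_bytes(n_params: int, bits: int) -> int:
    """Storage needed for n_params weights at the given bit width."""
    return n_params * bits // 8


# Hypothetical weights, not taken from the project's models.
weights = [0.5, -1.0, 0.25, 0.031]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Each restored weight differs from the original by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(max_error <= scale / 2 + 1e-12)

# A one-million-parameter model: 32-bit floats versus 8-bit integers.
print(size_in_bytes(1_000_000, 32), size_in_bytes(1_000_000, 8))
```

In practice, whether such a reduction preserves transcription quality has to be checked empirically on held-out handwriting samples.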