AlphaPen

Optical character recognition (OCR) system

Type of project

Technology transfer

Duration

Official website

The project at a glance

AlphaPen and CATIE worked together to develop an intelligent system capable of automatically transcribing handwritten text.

This system, based on advanced artificial intelligence techniques, had to meet a dual challenge: ensuring high accuracy in handwriting recognition, while remaining lightweight enough to run locally on low-power devices such as smartphones or laptops.

The goal was to make this technology accessible to a broad audience, without requiring server resources or an internet connection.

Achievements

  • First OCR model for handwritten French, a notable breakthrough in a field still underdeveloped for this language.
  • Deployment of a functional prototype of the complete pipeline, combining OCR and automatic text correction.
  • Significant improvement in the readability and accuracy of the system's final text output.
  • Development of a lightweight, locally deployable version, suitable for offline use on mobile devices or computers.
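
To make the idea of pairing OCR with automatic text correction concrete, here is a minimal sketch of a correction step applied to raw OCR output. It is purely illustrative and is not the method used in the project: the tiny `LEXICON`, the `correct_token` and `correct_text` helpers, and the similarity `cutoff` are all assumptions, and a production system would more likely rely on a full vocabulary or a language model.

```python
import difflib

# Hypothetical mini-lexicon, for illustration only.
LEXICON = ["bonjour", "merci", "demain", "réunion", "à", "la", "pour"]


def correct_token(token, lexicon=LEXICON, cutoff=0.75):
    """Return the closest lexicon entry for a token, or the token unchanged."""
    lowered = token.lower()
    if lowered in lexicon:
        return token
    # get_close_matches ranks candidates by difflib's similarity ratio.
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    # Note: the replacement is lowercase; case restoration is left out here.
    return matches[0] if matches else token


def correct_text(raw_text):
    """Correct a whitespace-separated OCR transcription token by token."""
    return " ".join(correct_token(t) for t in raw_text.split())


if __name__ == "__main__":
    print(correct_text("bonjowr merci"))
```

Tokens with no sufficiently close lexicon entry are left untouched, which avoids "correcting" proper nouns or rare words into something unrelated.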

CATIE contribution

  • Development of a generative model of handwriting images, used to augment training datasets and improve the robustness of the recognition system.

  • Creation and specialization of an OCR model, trained specifically to extract handwritten French text automatically.

  • Design of a linguistic post-processing module capable of correcting detected errors in the extracted text (spelling mistakes, missing punctuation, syntactic inconsistencies, etc.).

  • Optimization of the developed models by reducing their size and accelerating their execution, to ensure smooth integration on mobile or embedded devices without compromising transcription quality.
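
One common way to reduce model size, shown here only as an illustration, is to store weights as 8-bit integers instead of 32-bit floats. The sketch below assumes symmetric per-tensor quantization; the source does not state which optimization techniques were actually applied, and the function names and sample weights are hypothetical.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 values plus one scale."""
    # Fall back to 1.0 so an all-zero tensor does not cause a division by zero.
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0
    # Round to the nearest step and clamp to the signed 8-bit range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

if __name__ == "__main__":
    print(f"{q.itemsize} byte per weight, scale = {scale:.6f}")
    print(restored)
```

Each weight then takes 1 byte rather than the 4 bytes of a float32, at the cost of a rounding error of at most about half a quantization step per weight.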

Expertise used