OCR SDK Knowledge Base

Article ID: 1157 | Category: Programming Aspects | Type: How To | Last Modified: 2/20/2013

Set the Alphabet

Description

How can I recognize text that contains only numbers?

Solution

There are two ways to do it:

A. Use a predefined text language

Use the Digits predefined text language.

Note that the Digits language is not limited to the digits 0-9: the recognized text can also contain a few other characters, such as #, %, &, and $.

Sample code in C#:

//load image
FREngine.FRDocument document = engineLoader.Engine.CreateFRDocumentFromImage(@"D:\Demo.tif");
 
//process the document using the predefined Digits language
FREngine.PageProcessingParams p = engineLoader.Engine.CreatePageProcessingParams();
p.RecognizerParams.SetPredefinedTextLanguage("Digits");
document.Process(p);
 
//export the document
document.Export(@"D:\export.rtf", FREngine.FileExportFormatEnum.FEF_RTF, null); 
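
Because the Digits language can still return symbols such as #, %, & or $, it may be useful to post-filter the recognized text. The following is a minimal sketch and not part of the SDK: DigitFilter and KeepDigits are hypothetical names, and the code assumes you already have the recognized text as a plain string (how you obtain it is outside this example).

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the OCR SDK.
public static class DigitFilter
{
    // Keeps only the ASCII digits 0-9 and drops every other character.
    public static string KeepDigits(string recognizedText)
    {
        if (recognizedText == null) return string.Empty;
        return new string(recognizedText.Where(c => c >= '0' && c <= '9').ToArray());
    }
}
```

For example, DigitFilter.KeepDigits("#12%34$") returns "1234". Keep in mind that dropping symbols can change the meaning of a value (for instance, "$5" becomes "5"), so this only suits fields where the symbols are known to be noise.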

 

B. Specify the alphabet directly

  1. Create a BaseLanguage object and set its alphabet
  2. Create the corresponding TextLanguage object
  3. Process the document using the created language

Sample code in C#:

//create BaseLanguage object and set its alphabet
FREngine.BaseLanguage baseLang = engineLoader.Engine.CreateBaseLanguage();
baseLang.set_LetterSet(FREngine.BaseLanguageLetterSetEnum.BLLS_Alphabet, "0123456789");
 
//create corresponding TextLanguage object
FREngine.TextLanguage textLang = engineLoader.Engine.CreateTextLanguage();
textLang.BaseLanguages.Add(baseLang);
 
//load image
FREngine.FRDocument document = engineLoader.Engine.CreateFRDocumentFromImage(@"D:\Demo.tif");
 
//process the document using the created language
FREngine.PageProcessingParams p = engineLoader.Engine.CreatePageProcessingParams();
p.RecognizerParams.TextLanguage = textLang;
document.Process(p);
 
//export the document
document.Export(@"D:\export.rtf", FREngine.FileExportFormatEnum.FEF_RTF, null); 
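
If the numbers in your documents include separators or signs, the alphabet string passed to set_LetterSet could be extended accordingly. The sketch below only builds such a string; AlphabetBuilder and DigitsPlus are hypothetical names, not SDK members, and whether the extra characters are recognized as expected with a custom alphabet is an assumption you should verify on your own images.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the OCR SDK.
public static class AlphabetBuilder
{
    public const string Digits = "0123456789";

    // Returns the ten digits followed by the extra characters,
    // skipping any character that is already present.
    public static string DigitsPlus(string extraChars)
    {
        string combined = Digits + (extraChars ?? string.Empty);
        return new string(combined.Distinct().ToArray());
    }
}
```

For example, AlphabetBuilder.DigitsPlus(".-") returns "0123456789.-", which could then be passed in place of the "0123456789" literal in the sample above.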