OCR SDK Knowledge Base

Article ID: 1160 | Category: Export | Type: How To | Last Modified: 6/19/2014

XML schema


How do I set up an XML schema?


When you export to XML, you can either use the default XML schema or create a custom one.

Default XML Schema

See the description of the default XML schema in Help > Index > XML Scheme (this path is for FR Engine 10's help).

The schema itself can be found in the Inc folder (Start > Programs > ABBYY FineReader Engine 10 > Installation Folders > Include Files Folder).

Modifying the Default XML Schema

You can use the following properties of the XMLExportParams object to add or remove some elements from the default XML schema:

  • WriteCharacterRecognitionVariants
  • WriteCharAttributes
  • WriteCharFormatting
  • WriteNondeskewedCoordinates
  • WriteWordRecognitionVariants

You can find a description of this process in Help > Parameter Objects > Export Parameters > XML Export Params (this path is for FR Engine 10's help).
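
As a rough sketch of how these properties might be used, the routine below creates an XMLExportParams object, sets a few of the listed properties, and exports the document to XML. The properties are assumed here to be Boolean, and the CreateXMLExportParams method, the Export call, and the FEF_XML constant are assumptions about the Engine API that should be checked against the help topics mentioned above. The routine name ExportWithDefaultSchema is hypothetical, and the code needs a reference to the FineReader Engine library to compile.

```vbnet
' Sketch only: assumes a loaded Engine object and a recognized FRDocument.
Private Sub ExportWithDefaultSchema(ByVal Engine As FREngine.IEngine, _
                                    ByVal FRDocument As FREngine.FRDocument, _
                                    ByVal filePath As String)
    ' Assumption: the Engine object exposes CreateXMLExportParams
    Dim xmlParams As FREngine.XMLExportParams = Engine.CreateXMLExportParams()

    ' Assumption: these properties are Boolean switches
    xmlParams.WriteCharFormatting = True
    xmlParams.WriteNondeskewedCoordinates = False
    xmlParams.WriteWordRecognitionVariants = False

    ' Assumption: Export takes a file name, a format constant and a parameters object
    FRDocument.Export(filePath, FREngine.FileExportFormatEnum.FEF_XML, xmlParams)
End Sub
```

Properties that are not set keep their default values, so only the elements you want to add or remove need to be touched.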

Custom XML Schema

It is also possible to use a custom schema. You can create one by writing the XML markup yourself, for example with the .NET StreamWriter class, and saving the output as an XML file.

This is illustrated by the code sample below.

The sample code does the following:

  • saves pictures from a recognized document separately;
  • writes an XML file that contains one line of the form <imagesPath> for each saved picture.

Visual Basic Sample Code

Imports System.IO

Private Sub Export(ByVal FRDocument As FREngine.FRDocument, ByVal filePath As String)
    ' Create (or overwrite) the output XML file and open it for writing
    Dim fs As New FileStream(filePath, FileMode.Create, FileAccess.Write)
    ' Wrap the file stream in a StreamWriter so that text can be written to it
    Dim s As New StreamWriter(fs)
    ' Write the XML declaration
    s.WriteLine("<?xml version='1.0' encoding='UTF-8'?>")
    Dim imagesFolderName As String
    imagesFolderName = …
    Dim imagesPath As String
    Dim Blocks As FREngine.LayoutBlocks
    For PagesIndex As Integer = 0 To FRDocument.Pages.Count - 1
        Blocks = FRDocument.Pages(PagesIndex).Layout.Blocks
        For BlocksIndex As Integer = 0 To Blocks.Count - 1
            ' Process picture blocks only
            If Blocks(BlocksIndex).Type = FREngine.BlockTypeEnum.BT_RasterPicture Then
                Dim ImageModification As FREngine.ImageModification
                ImageModification = Engine.CreateImageModification
                imagesPath = …
                ' Record the path of the picture in the XML file
                s.WriteLine("<" & imagesPath & ">")
            End If
        Next BlocksIndex
    Next PagesIndex
    ' Close the file (this also flushes the writer and closes fs)
    s.Close()
End Sub
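
The sample above builds the markup by hand with StreamWriter. As an alternative sketch, the .NET XmlWriter class can produce the same kind of custom file while taking care of escaping and closing tags. The element and attribute names (document, picture, path), the output file name, and the picture paths below are hypothetical and only illustrate the idea; they are not part of the Engine API or of the original sample.

```vbnet
Imports System.Collections.Generic
Imports System.Text
Imports System.Xml

Module CustomXmlSketch
    ' Writes one <picture> element per saved image path.
    ' "document", "picture" and "path" are hypothetical names;
    ' substitute whatever your custom schema requires.
    Sub WritePictureList(ByVal filePath As String, ByVal picturePaths As IEnumerable(Of String))
        Dim settings As New XmlWriterSettings()
        settings.Indent = True
        settings.Encoding = New UTF8Encoding(False) ' UTF-8 without a byte order mark

        ' Using disposes the writer, which flushes and closes the file
        Using writer As XmlWriter = XmlWriter.Create(filePath, settings)
            writer.WriteStartDocument()
            writer.WriteStartElement("document")
            For Each picturePath As String In picturePaths
                writer.WriteStartElement("picture")
                ' Attribute values are escaped automatically
                writer.WriteAttributeString("path", picturePath)
                writer.WriteEndElement()
            Next
            writer.WriteEndElement()
            writer.WriteEndDocument()
        End Using
    End Sub

    Sub Main()
        ' Hypothetical picture paths, standing in for the files saved during export
        Dim paths As String() = {"images\page0_block3.png", "images\page1_block0.png"}
        WritePictureList("pictures.xml", paths)
    End Sub
End Module
```

In the Export routine, the picture paths would be collected inside the block loop and passed to WritePictureList once the loop has finished.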