«ABBYY FineReader OCR» module

The «ABBYY FineReader OCR» module works with PDF files. It is more accurate and faster than the previous module.

The recognition result is stored in an object variable, in which you can address a specific line, table, or word.

Module interface

This module has two fields: «Path to pdf file» and «Variable».

![Screenshot](img/FROCR_1.png)

The «Path to pdf file» field. Specify the path to the picture/file you want to recognize here.

The «Variable» field. Enter the name of the variable that will store the recognition result.

A PDF file may contain several pages, so the variable holds a list of pages. For example, @text[0] is the first recognized page of the file.

All recognized text is divided into two parts: the main text and the tabular part.

## Main text

To access the text, refer to the rows_word[0] field. Here [0] means that the first text part is accessed; other versions of ABBYY may split the text into more parts. Next come the line number and, separated by a dot, the number of the word in the line, each as a separate field: @text[0].rows_word[0].2.4. Once the word is selected, you can get its content through the value field: @text[0].rows_word[0].2.4.value returns the value of the word that is on the first page of the recognized document, in the second line, fourth from the left.

## The tabular part

To access a table, refer to the tables[0] field ([0] is required for the robot to work correctly; its meaning is defined inside the ABBYY product). Next, the number of the table is given as a separate field, because one page can contain several tables: @text[0].tables[0].0. After that, the number of the cell in the table is given as a single number. If you need to find a cell by its row and column, you can read the position of each cell in the form "row, column" from its index field: @text[0].tables[0].0.5.index (5 is the fifth cell in the array of recognized cells). To get the content of a cell, refer to its value field and then either assemble the entire string written in the cell or address a specific word by its ordinal number: @text[0].tables[0].0.5.value.1.value is the value of the second word in the fifth cell of the first table on the first page of the recognized text.
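The two addressing schemes, for words of the main text and for table cells, can be tried out on a plain Python mock. This is only a sketch: the field names rows_word, tables, index and value are taken from the text, while the nesting of lists and dicts, the sample data, the helper names and the zero-based numbering of every segment are assumptions and may differ from what the module actually produces.

```python
# Hypothetical stand-in for one page of the recognition result (@text[0]).
# Field names rows_word, tables, index and value come from the text above;
# the nesting, the sample data and zero-based numbering are assumptions.
page = {
    "rows_word": [
        [  # text part 0: a list of lines, each line a list of words
            [{"value": "Invoice"}, {"value": "No."}, {"value": "17"}],
            [{"value": "Date:"}, {"value": "2020-01-31"}],
        ]
    ],
    "tables": [
        [  # tables[0]: the tables found on the page
            [  # table 0: a flat array of recognized cells
                {"index": "0, 0", "value": [{"value": "Item"}]},
                {"index": "0, 1", "value": [{"value": "Quantity"}]},
                {"index": "1, 0", "value": [{"value": "Steel"}, {"value": "bolt"}]},
                {"index": "1, 1", "value": [{"value": "12"}, {"value": "pcs"}]},
            ]
        ]
    ],
}


def word_value(page, part, line, word):
    """Mimic rows_word[part].line.word.value on the mock page."""
    return page["rows_word"][part][line][word]["value"]


def cell_text(cell):
    """Join the words of a cell into the whole string written in it."""
    return " ".join(word["value"] for word in cell["value"])


def find_cell(table, row, column):
    """Return the position of the cell whose index field is 'row, column'."""
    for position, cell in enumerate(table):
        cell_row, cell_column = (int(part) for part in cell["index"].split(","))
        if (cell_row, cell_column) == (row, column):
            return position
    return None  # no cell with that row and column


table = page["tables"][0][0]
position = find_cell(table, 1, 1)
print(word_value(page, 0, 1, 1))   # a single word from the main text
print(cell_text(table[position]))  # whole content of the cell that was found
```

Scanning the index fields is needed because cells are stored as a flat array, so the position of a cell cannot be derived from its row and column without knowing the table width.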

To get the number of cells in a table, use the construction @text[0].tables[0].0[%]: append the characters «[%]» to the field whose number of elements you need to find.

To get the number of words in a cell, use the construction @text[0].tables[0].0.5[%].
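The two counts are typically used as loop bounds when every cell and every word has to be visited. The sketch below imitates this on a mock table in Python; the data layout, the helper names cell_count and word_count, and the zero-based numbering are assumptions made for illustration, not part of the module.

```python
# Hypothetical mock of one recognized table: a flat list of cells, where
# each cell keeps its words in the value field (layout is an assumption).
table = [
    {"index": "0, 0", "value": [{"value": "Item"}]},
    {"index": "0, 1", "value": [{"value": "Quantity"}]},
    {"index": "1, 0", "value": [{"value": "Steel"}, {"value": "bolt"}]},
]


def cell_count(table):
    """Play the role of @text[0].tables[0].0[%]: number of cells in a table."""
    return len(table)


def word_count(table, cell_number):
    """Play the role of @text[0].tables[0].0.N[%]: number of words in cell N."""
    return len(table[cell_number]["value"])


# Visit every word of every cell, using the two counts as loop bounds.
words = []
for cell_number in range(cell_count(table)):
    for word_number in range(word_count(table, cell_number)):
        words.append(table[cell_number]["value"][word_number]["value"])

print(cell_count(table), words)
```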