OCR for Documents (image and PDF)
by Therp BV, Odoo Community Association (OCA), ThinkOpen Solutions Brasil
Version: 11.0
OCR for documents
This module was written to make uploaded documents, for example scans, searchable by running OCR on them.
It supports PDFs and every image format that Pillow can read.
Installation
To install this module, you need to:
- install tesseract and the language(s) your documents use
- if you want to support OCR on PDFs, install imagemagick
- install the module itself
- on Debian or Ubuntu, the system packages can be installed with: $ sudo apt-get install tesseract-ocr imagemagick
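Before installing the module itself, it can help to confirm that the external tools are reachable from the account that runs Odoo. The following is a minimal sketch, not part of the module: the helper name `missing_tools` is hypothetical, and the executable names `tesseract` and `convert` are assumptions (ImageMagick 7 may ship `magick` instead of `convert`).

```python
import shutil

# Assumed executable names: "tesseract" for the OCR engine and "convert"
# for ImageMagick (only needed when PDFs should be processed).
OCR_TOOLS = ("tesseract",)
PDF_TOOLS = ("convert",)


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in tools if which(name) is None]
```

Calling `missing_tools(OCR_TOOLS + PDF_TOOLS)` returns an empty list when both tools are installed.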
Configuration
To configure this module, go to:
- Settings/Technical/Parameters/System parameters and review the parameters with names document_ocr.*
Usage
By default, character recognition is done asynchronously by a cron job at night. Recognition takes a while, and running it in the background means users do not have to wait for indexing to finish. The interval of the cron job can be adjusted to your needs in the Scheduled Actions menu, under Settings. To force OCR to run immediately, set the configuration parameter document_ocr.synchronous to True.
By default, the recognition language is English. To use a different default, set the configuration parameter document_ocr.language to the corresponding language code, e.g. por for Portuguese.
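As a rough illustration, both parameters could also be set from an Odoo shell instead of the menu. The helper name `set_ocr_parameters` and its defaults are hypothetical. Writing the flag as the text "True" follows the instruction above; how the module interprets any other value is an assumption.

```python
def set_ocr_parameters(env, synchronous=True, language="por"):
    """Store the two OCR settings as Odoo system parameters.

    `env` is an Odoo environment, such as the one exposed by `odoo shell`.
    The function name and defaults are illustrative, not part of the module.
    """
    params = env["ir.config_parameter"].sudo()
    # System parameter values are stored as text, so the flag is written as
    # "True" or "False". Treating "False" as "disabled" is an assumption.
    params.set_param(
        "document_ocr.synchronous", "True" if synchronous else "False"
    )
    # Tesseract language codes are three letters, e.g. "eng" or "por".
    params.set_param("document_ocr.language", language)
```

In an Odoo shell this would be called as `set_ocr_parameters(env)`, followed by `env.cr.commit()` to persist the change.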
For PDFs, OCR runs after the file has been converted to an image. Note that OCR is applied to every PDF.
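The convert-then-recognize flow can be sketched with the two command-line tools named in the installation steps. This is an illustration under stated assumptions, not the module's actual implementation: the function names are hypothetical, the default `dpi` and `quality` values are placeholders, and stacking all pages into one image with `-append` is one possible way to handle multi-page PDFs.

```python
import subprocess
import tempfile
from pathlib import Path


def convert_command(pdf_path, image_path, dpi=300, quality=100):
    """Build an ImageMagick command that renders a PDF to one image.

    -density sets the rendering resolution, -append stacks all pages
    vertically, -quality sets the output compression level.
    """
    return [
        "convert", "-density", str(dpi), str(pdf_path),
        "-append", "-quality", str(quality), str(image_path),
    ]


def tesseract_command(image_path, language="eng"):
    """Build a Tesseract command that prints recognized text to stdout."""
    return ["tesseract", str(image_path), "stdout", "-l", language]


def ocr_pdf(pdf_path, language="eng", dpi=300, quality=100):
    """Convert the PDF to an image, then run OCR on that image."""
    with tempfile.TemporaryDirectory() as tmp:
        image_path = Path(tmp) / "pages.png"
        subprocess.run(
            convert_command(pdf_path, image_path, dpi, quality), check=True
        )
        result = subprocess.run(
            tesseract_command(image_path, language),
            check=True, stdout=subprocess.PIPE, universal_newlines=True,
        )
        return result.stdout
```

Keeping the command construction separate from the execution makes it easy to inspect the exact arguments before running anything.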
System parameters used:
- document_ocr.synchronous: bool
- document_ocr.language: string
- document_ocr.dpi: integer
- document_ocr.quality: integer
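Odoo stores system parameter values as text, so code that reads them has to convert each value to the type listed above. A minimal sketch of such a conversion follows. The helper name `coerce_param` is hypothetical, and the rule for which strings count as true is an assumption rather than the module's documented behavior.

```python
# Expected type of each parameter, as listed above.
PARAM_TYPES = {
    "document_ocr.synchronous": bool,
    "document_ocr.language": str,
    "document_ocr.dpi": int,
    "document_ocr.quality": int,
}


def coerce_param(name, raw):
    """Convert the stored text `raw` to the type expected for `name`."""
    kind = PARAM_TYPES[name]
    if kind is bool:
        # Assumption: only "true" and "1" (case-insensitive) mean True.
        return raw.strip().lower() in ("true", "1")
    if kind is int:
        return int(raw)
    return raw.strip()
```

For example, `coerce_param("document_ocr.dpi", "300")` yields the integer 300, while an unparsable value raises `ValueError`.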
Contributors
- Holger Brunn email@example.com
- Carlos Almeida firstname.lastname@example.org
Maintainer
This module is maintained by ThinkOpen Solutions Brasil.