Document Indexing using external tools

by
Odoo

106.32

v 11.0 v 12.0 Third Party 1
Availability
Odoo Online
Odoo.sh
On Premise
Lines of code 74
Technical Name document-indexing
License See License tab
Versions 12.0 11.0

Better document indexing using external tools

This module replaces the default attachment indexing logic with calls to external tools. This lets Odoo index many more file types for full-text search and makes the indexing of .pdf files far more reliable.

To use this module, you first have to install libreoffice and poppler-utils on your operating system.

Make sure you can run "soffice" and "pdftotext" from the command line.
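
The check above can also be scripted. The following is a minimal sketch, not part of the module: it only tests whether the two commands can be found on PATH for the user running it. The helper name find_missing_tools is hypothetical.

```python
import shutil

# Tool names come from the module description; the helper name is hypothetical.
REQUIRED_TOOLS = ("soffice", "pdftotext")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("soffice and pdftotext are both available.")
```

Run it as the same operating system user that runs Odoo, since PATH may differ between users.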

The module indexes these document types:

  • .doc,
  • .docx,
  • .pdf,
  • .xls,
  • .xlsx,
  • .odp,
  • .ods,
  • .odt,
  • .wps (MS Works),
  • .rtf,
  • .ppt,
  • .pptx

Other types are passed to the default processing of the ir.attachment model.
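
The routing described above could look roughly like the sketch below. It assumes the decision is made on the file name extension and that matching is case-insensitive. The module may decide differently (for example by MIME type), and the function name is hypothetical.

```python
import os

# Extensions listed in the module description.
EXTERNAL_EXTENSIONS = {
    ".doc", ".docx", ".pdf", ".xls", ".xlsx", ".odp",
    ".ods", ".odt", ".wps", ".rtf", ".ppt", ".pptx",
}


def uses_external_tools(filename):
    """Return True if the file type is in the list handled by external tools."""
    # Assumption: compare the lower-cased extension of the file name.
    extension = os.path.splitext(filename)[1].lower()
    return extension in EXTERNAL_EXTENSIONS
```

Anything for which this returns False would fall through to the default ir.attachment indexing.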

The indexing works in two steps. First, the document is converted to .pdf with the "soffice --headless" command. Then the text is extracted from the .pdf with "pdftotext" from poppler-utils.
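
As an illustration of the two steps, here is a minimal sketch that calls both tools through subprocess. It is not the module's code. The function names, the timeout, the use of a temporary working directory and the shortcut for files that are already .pdf are all assumptions. The flags used ("--convert-to", "--outdir", and "-" as the pdftotext output file) are standard options of the two tools.

```python
import os
import subprocess
import tempfile


def build_commands(source_path, work_dir):
    """Return the expected PDF path plus the two command lines."""
    base = os.path.splitext(os.path.basename(source_path))[0]
    # soffice names the converted file after the source file, inside --outdir.
    pdf_path = os.path.join(work_dir, base + ".pdf")
    convert_cmd = ["soffice", "--headless", "--convert-to", "pdf",
                   "--outdir", work_dir, source_path]
    # "-" tells pdftotext to write the extracted text to standard output.
    extract_cmd = ["pdftotext", pdf_path, "-"]
    return pdf_path, convert_cmd, extract_cmd


def extract_text(source_path, timeout=120):
    """Convert a document to PDF, then return its text content."""
    with tempfile.TemporaryDirectory() as work_dir:
        _, convert_cmd, extract_cmd = build_commands(source_path, work_dir)
        if source_path.lower().endswith(".pdf"):
            # Assumption: a file that is already a PDF skips step one.
            extract_cmd[1] = source_path
        else:
            subprocess.run(convert_cmd, check=True, timeout=timeout)
        result = subprocess.run(extract_cmd, check=True, timeout=timeout,
                                stdout=subprocess.PIPE)
        return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("/path/to/file.docx") should return the document text as a string, provided both tools are installed. The temporary directory is removed when the block exits, even if one of the commands fails.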

Please check the /tmp directory from time to time. If the conversion to .pdf crashes for some reason, orphan files can remain there, and you may want to delete them manually.

Installation

How to install in a Docker container (Odoo 11 or 12 image)

  1. download and run the Odoo image
  2. docker exec -it -u root odoo12 /bin/bash (odoo12 is the container name here; use the name of your own container)
  3. apt-get update
  4. apt-get install libreoffice poppler-utils
  5. cd to the Odoo addons path
  6. unpack the module
  7. in Odoo, update the modules list and install the module
  8. done - test it

How to install to an on-premise Odoo installation under Linux

  1. apt-get install libreoffice poppler-utils
  2. cd to Odoo addons path
  3. unpack the module
  4. in Odoo, update modules list and install the module
  5. done - test it

How to install to an on-premise Odoo installation under Windows

This has not been tested. Just make sure that you have everything needed to run the "soffice" and "pdftotext" commands from the command line.

Credits: the icon is named Search and its author is Igé Maulana from the Noun Project

MIT License

Copyright (c) 2018 Jan B. Krejčí

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

by
Johannes Bacher
on 12/16/21, 8:41 AM

Dear Jan, we have Odoo 15 and I wonder if your tool will work there? 

Our standard indexing works, only PDFs are not indexed. We run Odoo on Ubuntu, do you have any hints why PDF indexing fails? How can I check if pdftotext is working?

thank you

Johannes

Re:
by
Jan B. Krejčí
on 12/16/21, 10:14 AM Author

Dear Johannes,

I haven't tried my module on Odoo newer than 12 yet.

What do you mean by "our standard indexing works"? Are you using my module with Odoo 15?

To check whether pdftotext works, I would recommend running it manually from the command line. See https://manpages.debian.org/stretch/poppler-utils/pdftotext.1.en.html

Please feel free to contact me by e-mail janbkrejci (at) gmail (dot) com

Best regards

Jan