WebSetNet

Use gImageReader to Extract Text From Images and PDFs on Linux

March 25, 2021 by bartez64

Brief: gImageReader is a GUI tool that uses the Tesseract OCR engine to extract text from images and PDF files on Linux.

gImageReader is a front-end for the open-source Tesseract OCR engine. Tesseract was originally developed at HP and was open-sourced in 2005.

Basically, an OCR (Optical Character Recognition) engine lets you extract text from a picture or a file such as a PDF. Tesseract can recognize many languages and supports Unicode (UTF-8) text.

However, Tesseract by itself is a command-line tool without any GUI. This is where gImageReader comes to the rescue: it lets any user extract text from images and files without touching the terminal.
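
For context, this is roughly what using Tesseract directly from a terminal looks like. The sketch below wraps the `tesseract` command in a small shell function. The function name `ocr_to_text` and the file name `scan.png` are hypothetical, and the sketch assumes the `tesseract` binary and the requested language data are already installed.

```shell
# Minimal sketch: OCR one image to a text file with the tesseract CLI.
# ocr_to_text is a hypothetical helper name, not part of Tesseract.
ocr_to_text() {
    img="$1"            # input image, e.g. scan.png
    lang="${2:-eng}"    # Tesseract language code, defaults to English
    base="${img%.*}"    # output base name: scan.png -> scan

    # tesseract adds the .txt extension to the output base on its own
    tesseract "$img" "$base" -l "$lang" || return 1

    # Print the name of the text file that was written
    printf '%s\n' "$base.txt"
}

# Usage (assumes scan.png exists in the current directory):
#   ocr_to_text scan.png eng
```

Even for a single image you have to remember the argument order and the language code, which is the gap a GUI front-end fills.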

Let me highlight its main features and share my experience from the time I spent testing it.

gImageReader: A Cross-Platform Front-End to Tesseract OCR

gImageReader makes it easy to extract text from a PDF file or an image that contains any kind of text.

Whether you want to spellcheck the extracted text or translate it, it should be useful to anyone who regularly works with scanned or image-only documents.

To sum up the features in a list, here’s what you can do with it:

  • Add PDF documents and images from disk, scanning devices, clipboard and screenshots
  • Ability to rotate images
  • Common image controls to adjust brightness, contrast, and resolution
  • Scan images directly through the app
  • Ability to process multiple images or files in one go
  • Manual or automatic recognition area definition
  • Recognize to plain text or to hOCR documents
  • Editor to display the recognized text
  • Spellcheck the extracted text
  • Convert/Export to PDF documents from hOCR document
  • Export extracted text as a .txt file
  • Cross-platform (Linux and Windows)
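
For comparison, batch recognition to plain text and hOCR can also be scripted with the Tesseract command line directly. This is only a sketch of an equivalent CLI workflow, not a description of how gImageReader works internally. The function name `ocr_batch` and the file names are hypothetical, and it assumes `tesseract` is installed along with the requested language data.

```shell
# Sketch: OCR several images in one go, writing plain text and hOCR output.
# ocr_batch is a hypothetical helper name.
ocr_batch() {
    lang="$1"; shift          # first argument: language code, e.g. eng
    failed=0
    for img in "$@"; do
        base="${img%.*}"      # page1.png -> page1
        # "txt" and "hocr" are tesseract output configs:
        # for page1.png this writes page1.txt and page1.hocr
        if tesseract "$img" "$base" -l "$lang" txt hocr; then
            echo "ok: $img"
        else
            echo "failed: $img" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]       # non-zero exit status if any image failed
}

# Usage (assumes the files exist):
#   ocr_batch eng page1.png page2.png
```

The loop keeps going after a failed image and reports the failure through the exit status, so one bad scan does not stop the whole batch.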

Installing gImageReader on Linux

Note: To recognize text in a given language, you need to explicitly install the matching Tesseract language pack from your software manager.
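
If you prefer the terminal, the language packs are ordinary distribution packages. As a rough sketch, the function below prints the package name for a given Tesseract language code. The helper name `langpack_pkg` is hypothetical, and the naming patterns are assumptions based on how Debian/Ubuntu, Fedora, and Arch usually name these packages, so check them with your package manager's search before installing.

```shell
# Sketch: map a Tesseract language code to a distro package name.
# langpack_pkg is a hypothetical helper; the naming patterns are assumptions
# that should be verified against your distribution's repositories.
langpack_pkg() {
    family="$1"   # debian | fedora | arch
    code="$2"     # Tesseract language code, e.g. deu, fra, chi_sim
    case "$family" in
        # Debian-style names use hyphens instead of underscores
        debian) printf 'tesseract-ocr-%s\n' "$(printf '%s' "$code" | tr '_' '-')" ;;
        fedora) printf 'tesseract-langpack-%s\n' "$code" ;;
        arch)   printf 'tesseract-data-%s\n' "$code" ;;
        *)      echo "unknown distro family: $family" >&2; return 1 ;;
    esac
}

# Usage on Debian, Ubuntu, or Linux Mint (installs German language data):
#   sudo apt install "$(langpack_pkg debian deu)"
# Afterwards, list the languages Tesseract can see:
#   tesseract --list-langs
```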

You can find gImageReader in the default repositories for some Linux distributions like Fedora and Debian.

For Ubuntu, you need to add a PPA and then install it. To do that, here’s what you need to type in the terminal:

sudo add-apt-repository ppa:sandromani/gimagereader
sudo apt update
sudo apt install gimagereader

openSUSE users can get it from the openSUSE Build Service, and Arch Linux users will find it in the AUR.

All the links to the repositories and packages can be found on the project's GitHub page.

Experience with gImageReader

gImageReader is quite a useful tool for extracting text from images. It works great with PDF files.

When extracting text from a picture shot on a smartphone, the detection was close but a bit inaccurate. Recognition may well be better with a properly scanned document.

So, you’ll have to try it for yourself to see how well it works for your use-case. I tried it on Linux Mint 20.1 (based on Ubuntu 20.04).

My only issue was managing languages from the settings, and I couldn't find a quick solution for it. If you run into the same problem, you may need to troubleshoot it yourself.

Other than that, it worked just fine.

Do give it a try and let me know how it worked for you! If you know of something similar (and better), do let me know about it in the comments below.

Filed Under: Linux
