Questions tagged [python-tesseract]
Python-tesseract is a Python wrapper for Tesseract OCR that reads conventional image files (JPG, GIF, PNG, TIFF, etc.) and returns their text, data about that text, or a PDF conversion.
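The three outputs named in the tag description can be sketched roughly as follows. This is a minimal illustration, not an official recipe: it assumes the `pytesseract` package and the Tesseract binary are installed. The function name `ocr_image` and its `engine` parameter are hypothetical, introduced here only so the sketch can be exercised without a real OCR engine.

```python
from pathlib import Path


def ocr_image(image_path, pdf_path=None, engine=None):
    """Return (text, word_data) for an image; optionally write a PDF.

    `engine` is a hypothetical injection point (not part of pytesseract);
    when omitted, the real pytesseract module is imported lazily.
    """
    if engine is None:
        import pytesseract as engine  # requires the Tesseract binary on PATH

    path = str(image_path)

    # Plain text recognised in the image.
    text = engine.image_to_string(path)

    # Per-word data (boxes, confidences, text) as a dict of lists.
    data = engine.image_to_data(path, output_type=engine.Output.DICT)

    # Optional PDF conversion; the call returns raw bytes.
    if pdf_path is not None:
        pdf_bytes = engine.image_to_pdf_or_hocr(path, extension="pdf")
        Path(pdf_path).write_bytes(pdf_bytes)

    return text, data
```

With a real installation, `ocr_image("scan.png", pdf_path="scan.pdf")` would return the text and the data dictionary and write the PDF next to it; the file names here are placeholders.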
1,714 questions
0 votes, 0 answers, 28 views
PyTesseract not extracting text?
Pytesseract does not extract the text from the image. The terminal stays blank apart from a space, as if it were actually trying to extract the text.
Here is my code and the image:
from PIL import Image
...
0 votes, 0 answers, 19 views
Optimize OCR text detection in an image in Python
I am currently writing a program in Python that takes an image containing a lot of text, extracts the text to a .txt file, then compares the found words with a list of words in another file and creates ...
-1 votes, 0 answers, 22 views
Splicing Relevant Text from a Screenshot using pytesseract and OCR for a Scheduling Script
Hi, I'm currently making a script that takes screenshots of a university class schedule and automatically syncs it to either Google Calendar or Outlook Calendar.
from PIL import Image
import ...
0 votes, 0 answers, 28 views
How can I extract tables from an image into Excel using optical character recognition?
As an example, I have this image and would like to convert it to a modifiable Excel table. I have tried using the 'pytesseract' library, but it doesn't accurately extract the text from the image ...
-1 votes, 0 answers, 17 views
Extracting text from a TMT (treadmill test) report [closed]
I have a task that involves extracting specific values from a TMT PNG image of a table. Depending on the required output, I need to extract either a specific value from a table cell or some text from ...
0 votes, 1 answer, 62 views
How to recognize single characters from an image using Tesseract?
This is the original image:
This is the processed image:
I'm trying to automate a mini-game in which characters appear on the screen. I did some light research and managed to process the image to ...
0 votes, 0 answers, 46 views
OpenCV contours sorting x-axis and y-axis
I am working on a Python program to solve a word search. I am using pytesseract and OpenCV to process an image of the word search, and the solution will be displayed as text. The script processes the ...
0 votes, 0 answers, 3 views
Can't find configuration by homebrew for training tesseract
This code helps me install the libraries for training Tesseract.
export PKG_CONFIG_PATH=\
$(brew --prefix)/lib/pkgconfig:\
$(brew --prefix)/opt/libarchive/lib/pkgconfig:\
$(brew --prefix)/opt/icu4c/lib/...
0 votes, 0 answers, 24 views
How to make tesseract (pytesseract) recognise '±'?
Plus or minus character
I'm trying to detect text (mostly numbers) in an image (a technical diagram). Do I need to train Tesseract and, if so, how? With tesstrain? jTessBoxEditor?
On doing the OCR from a set of ...
0 votes, 1 answer, 59 views
Getting numbers from matrix image using pytesseract
I am trying to retrieve the text from an image of a 4x4 matrix. The text consists of numbers. Although I was expecting the numbers, all I got was: BE, 8, EEE, BE. The image is attached here: image
Anyone ...
1 vote, 1 answer, 43 views
Pytesseract OCR recognizes "o" as "0"
I'm trying to read the text in this image using the pytesseract library.
original-screenshot.png
Here is my code:
path = 'original-screenshot.png'
image = cv2.imread(path)
image = cv2.cvtColor(image, cv2....
1 vote, 0 answers, 35 views
I don't want the boxes to be read as special characters or letters
This is the image:
This is the sample image that I will convert into text.
And here is the output:
***"|
| .**
indicators (Bids:
S.1.4.1. valid Certificate of Registration and **LJ Poy |**
...
-1 votes, 0 answers, 16 views
Pytesseract, macOS, VS Code: Issue with using Tesseract despite it seemingly being downloaded
I'm new to Python and coding, and I am writing a piece of code to read the numbers on a ruler using pytesseract. Here's a short snippet of code including everything relevant to my issue:
import ...
0 votes, 1 answer, 72 views
Extract text from table in an image with
I want to extract data from the table in this image. I use cv2 and pytesseract, but I don't get reliable results. This is my code and my image.
import cv2
import pytesseract
from PIL import Image
...
0 votes, 0 answers, 20 views
How to Improve Extraction of English Questions from Scanned UPSC PDFs with OCR?
I'm working on a project to extract English questions from scanned UPSC exam papers (prelims and mains). These papers include both English and Hindi text. Additionally, the text quality in the PDFs ...