python - Captcha preprocessing and solving with OpenCV and pytesseract


problem

I am trying to write Python code for image preprocessing and recognition using tesseract-ocr. My goal is to solve this form of captcha reliably.

Original captcha and the result of each preprocessing step

Steps as of now

  1. Greyscale and thresholding of the image

  2. Image enhancing with PIL

  3. Convert to TIF and scale to >300px

  4. Feed it to tesseract-ocr (whitelisting all uppercase alphabets)

However, I still get a rather incorrect reading (EPQ M Q). What other preprocessing steps can I take to improve accuracy? My code and an additional captcha of a similar nature are appended below.

Similar captchas I want to solve

code

    import cv2
    import pytesseract
    from PIL import Image, ImageEnhance, ImageFilter


    def binarize_image_using_opencv(captcha_path, binary_image_path='input-black-n-white.jpg'):
        im_gray = cv2.imread(captcha_path, cv2.IMREAD_GRAYSCALE)
        # cv2.threshold returns the threshold it used (85 here) and the binary image
        (thresh, im_bw) = cv2.threshold(im_gray, 85, 255, cv2.THRESH_BINARY)
        # although thresh is reused below, the plan is to pick a more suitable value
        im_bw = cv2.threshold(im_gray, thresh, 255, cv2.THRESH_BINARY)[1]
        cv2.imwrite(binary_image_path, im_bw)

        return binary_image_path


    def preprocess_image_using_opencv(captcha_path):
        bin_image_path = binarize_image_using_opencv(captcha_path)

        im_bin = Image.open(bin_image_path)
        basewidth = 300  # in pixels
        wpercent = basewidth / float(im_bin.size[0])
        hsize = int(float(im_bin.size[1]) * wpercent)
        big = im_bin.resize((basewidth, hsize), Image.NEAREST)

        # save the enlarged image as TIF for tesseract-ocr
        tif_file = "input-nearest.tif"
        big.save(tif_file)

        return tif_file


    def get_captcha_text_from_captcha_image(captcha_path):
        # preprocess the image before OCR
        tif_file = preprocess_image_using_opencv(captcha_path)
        return tif_file


    get_captcha_text_from_captcha_image("path/captcha.png")

    im = Image.open("input-nearest.tif")  # the second one
    im = im.filter(ImageFilter.MedianFilter())
    enhancer = ImageEnhance.Contrast(im)
    im = enhancer.enhance(2)
    im = im.convert('1')
    im.save('captchafinal.tif')
    # Tesseract 4+ spells the page segmentation flag --psm; Tesseract 3.x uses -psm
    text = pytesseract.image_to_string(
        Image.open('captchafinal.tif'),
        config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6")
    print(text)

The major problem comes from the different orientations of the letters, not from the preprocessing stage. You did the common preprocessing, which should work well, but you can replace the thresholding with adaptive thresholding to make your program more general with respect to the brightness of the image.

I met the same problem when working with Tesseract on car license plate recognition. From that experience I realized that Tesseract is very sensitive to the orientation of the text in the image. Tesseract recognizes letters well when the text in the image is horizontal. The more horizontally oriented the text is, the better the result you can get.

So you have to create an algorithm that detects each letter in your captcha image, detects its orientation, rotates it to make it horizontal, then does your preprocessing, processes the rotated horizontal piece of the image with Tesseract, and stores its output in your resulting string. Then go on to detect the next letter, do the same process, and add the Tesseract output to your resulting string. You will need an image transformation function as well, to rotate your letters. And you have to think about finding the corners of your detected letters. A project that rotates text in an image to improve Tesseract's output may also help you.

