python - Captcha preprocessing and solving with OpenCV and pytesseract
Problem
I am trying to write code in Python for image preprocessing and recognition using tesseract-ocr. My goal is to solve this form of captcha reliably.
Original captcha and the result of each preprocessing step
Steps as of now:
Greyscale and thresholding of the image
Image enhancing with PIL
Convert to TIF and scale to >300px
Feed it to tesseract-ocr (whitelisting all uppercase alphabets)
However, I still get a rather incorrect reading (EPQ M Q). What other preprocessing steps can I take to improve accuracy? My code and an additional captcha of similar nature are appended below.
Code
import cv2
import pytesseract
from PIL import Image, ImageEnhance, ImageFilter


def binarize_image_using_opencv(captcha_path, binary_image_path='input-black-n-white.jpg'):
    im_gray = cv2.imread(captcha_path, cv2.IMREAD_GRAYSCALE)
    (thresh, im_bw) = cv2.threshold(im_gray, 85, 255, cv2.THRESH_BINARY)
    # although thresh is used below, gonna pick something suitable
    im_bw = cv2.threshold(im_gray, thresh, 255, cv2.THRESH_BINARY)[1]
    cv2.imwrite(binary_image_path, im_bw)
    return binary_image_path


def preprocess_image_using_opencv(captcha_path):
    bin_image_path = binarize_image_using_opencv(captcha_path)
    im_bin = Image.open(bin_image_path)
    basewidth = 300  # in pixels
    wpercent = basewidth / float(im_bin.size[0])
    hsize = int(float(im_bin.size[1]) * float(wpercent))
    big = im_bin.resize((basewidth, hsize), Image.NEAREST)
    # tesseract-ocr works with TIF, so save the bigger image in that format
    tif_file = "input-nearest.tif"
    big.save(tif_file)
    return tif_file


def get_captcha_text_from_captcha_image(captcha_path):
    # preprocess the image before OCR
    tif_file = preprocess_image_using_opencv(captcha_path)
    return tif_file


get_captcha_text_from_captcha_image("path/captcha.png")

im = Image.open("input-nearest.tif")  # the second one
im = im.filter(ImageFilter.MedianFilter())
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)
im = im.convert('1')
im.save('captchafinal.tif')
# Tesseract 4+ expects "--psm"; Tesseract 3.x used "-psm"
text = pytesseract.image_to_string(
    Image.open('captchafinal.tif'),
    config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ --psm 6")
print(text)
The major problem comes from the different orientations of the letters, not from the preprocessing stage. You did the common preprocessing, which should work well, but you can replace the thresholding with adaptive thresholding to make your program more general with respect to the brightness of the image.
I met the same problem when working with Tesseract on car license plate recognition. From that experience I realized that Tesseract is very sensitive to the orientation of the text in the image. Tesseract recognizes letters well when the text in the image is horizontal. The more horizontally oriented the text is, the better the result you can get.
So you have to create an algorithm that detects each letter in your captcha image, detects its orientation, rotates it to make it horizontal, then does your preprocessing, then processes this rotated horizontal piece of the image with Tesseract and stores the output in the resulting string. Then go on to detect the next letter, do the same process, and add the Tesseract output to the resulting string. You will need an image transformation function as well, to rotate your letters. And you have to think about finding the corners of your detected letters. Maybe this project will help you, because it rotates the text in the image to improve the quality of the Tesseract output.