Text Recognition from a Webcam with Python - 元高専生のロボット作り

This is what is known as OCR (Optical Character Recognition/Reader).

I'm using Anaconda.

First, install the module.

conda install -c brianjmcguirk pyocr

Next, install Tesseract, the OCR tool.

The details are written up here:
github.com

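You need to save the trained data files somewhere Tesseract can find them.
For now, download eng.traineddata for English and jpn.traineddata for Japanese and save them.

Since I was using Anaconda, I saved them in the following directory:
C:\Users\myName\Anaconda3\envs\myEnv\Library\bin\tessdata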
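
Before running OCR, it can help to confirm that the files really are in that directory. Here is a minimal sketch, assuming the directory shown above. `missing_traineddata` is a hypothetical helper I made up for illustration, not part of pyocr or Tesseract.

```python
from pathlib import Path


def missing_traineddata(tessdata_dir, langs=("eng", "jpn")):
    """Return the language codes whose <lang>.traineddata file is absent."""
    base = Path(tessdata_dir)
    return [lang for lang in langs if not (base / f"{lang}.traineddata").is_file()]


if __name__ == "__main__":
    # Assumed location; replace it with your own tessdata directory.
    tessdata = r"C:\Users\myName\Anaconda3\envs\myEnv\Library\bin\tessdata"
    missing = missing_traineddata(tessdata)
    if missing:
        print("Missing trained data for:", ", ".join(missing))
    else:
        print("eng and jpn trained data found")
```

If a language shows up as missing, the file probably went into a different folder than the one Tesseract reads from.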

First, check that text can be recognized from a still image, without the webcam.

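from PIL import Image
import cv2
import sys
import pyocr
import pyocr.builders

# Find an installed OCR tool (Tesseract in this setup)
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)

tool = tools[0]
# Show which languages the tool has trained data for
langs = tool.get_available_languages()
print("Available languages:", ", ".join(langs))

# Load the test image; cv2.imread returns None if the file cannot be read
frame = cv2.imread("OCRtest_imgae.png")
if frame is None:
    print("Could not read OCRtest_imgae.png")
    sys.exit(1)

# Convert to grayscale and shrink to 1/4 size
orgHeight, orgWidth = frame.shape[:2]
size = (int(orgWidth/4), int(orgHeight/4))
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
image = cv2.resize(gray, size)

# tesseract_layout=6 treats the image as a single uniform block of text
txt = tool.image_to_string(
    Image.fromarray(image),
    lang="jpn",
    builder=pyocr.builders.TextBuilder(tesseract_layout=6)
)
if len(txt) != 0:
    print(txt)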

This is the image I used.
[Image: OCR test image]
Please name the image OCRtest_imgae.png.
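
The script asks for Japanese directly, which fails if jpn.traineddata is not installed. As a rough sketch, you could choose the language from whatever `tool.get_available_languages()` reports. `pick_language` is a hypothetical helper, and the preference order is just my assumption.

```python
def pick_language(available, preferred=("jpn", "eng")):
    """Return the first preferred language that is installed, else None."""
    for lang in preferred:
        if lang in available:
            return lang
    return None


# Example with a made-up list; in the script it would come from
# tool.get_available_languages().
print(pick_language(["eng", "osd"]))  # eng
```

The returned value could then be passed as the `lang` argument of `image_to_string`, after handling the `None` case.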

OCR with a webcam

Now that it works with a still image, I replace the image-loading part with the webcam.
The result is shown in the code further down.

As for the webcam, I bought a Logicool c270m from Amazon.
The price has gone up since then. Well, at least it is a safe choice.


Among the other options, this webcam looks cheap and decent at the moment.


Code

from PIL import Image
import cv2
import sys
import pyocr
import pyocr.builders
import time

# Find an installed OCR tool (Tesseract in this setup)
tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
tool = tools[0]

# Open the default camera (device 0)
capture = cv2.VideoCapture(0)
if not capture.isOpened():
    print("Could not open the webcam")
    sys.exit(1)

last_txt = ""
while True:
    ret, frame = capture.read()
    if not ret:
        # No frame was returned (e.g. the camera was unplugged)
        break
    # Convert to grayscale and shrink to 1/4 size
    orgHeight, orgWidth = frame.shape[:2]
    size = (int(orgWidth/4), int(orgHeight/4))
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    image = cv2.resize(gray, size)
    t = time.time()
    txt = tool.image_to_string(
        Image.fromarray(image),
        lang="eng",
        builder=pyocr.builders.TextBuilder(tesseract_layout=6)
    )
    # Uncomment to see how long each OCR pass takes
    #print(time.time() - t)
    # Print only when the recognized text is non-empty and has changed
    if len(txt) != 0 and txt != last_txt:
        last_txt = txt
        print(txt)

    cv2.imshow("Capture", image)

    # Exit when any key is pressed
    if cv2.waitKey(33) >= 0:
        break

capture.release()
cv2.destroyAllWindows()
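
The loop only prints when the text is non-empty and differs from the previous result. With a live camera, the OCR output may differ between frames only in whitespace, so the same text could get printed again. Below is a small sketch of one way to tighten that check. `TextChangeFilter` is a hypothetical class of my own, and collapsing whitespace is an assumption that may not suit every use, since it also joins multiple lines into one.

```python
class TextChangeFilter:
    """Report OCR text only when it is non-empty and differs from the last result."""

    def __init__(self):
        self._last = ""

    def update(self, text):
        # Collapse runs of whitespace (including newlines) into single spaces
        normalized = " ".join(text.split())
        if not normalized or normalized == self._last:
            return None
        self._last = normalized
        return normalized


# Example with made-up OCR results from consecutive frames
text_filter = TextChangeFilter()
for raw in ["hello  world", "hello world\n", "", "bye"]:
    changed = text_filter.update(raw)
    if changed is not None:
        print(changed)
```

In the webcam loop, `text_filter.update(txt)` would take the place of the `last_txt` comparison.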