OpenCV | Document Scanning & Optical Character Recognition (OCR)
Step 1. Import the required packages, along with a helper file named resize.py used throughout the project.
import cv2
import numpy as np
import resize
Step 2. Read in and preprocess the image.
Read in the picture to be detected. If the resolution is good enough, the laptop camera can also be used as the source.
image = cv2.imread('test.jpg')
image = cv2.resize(image, (1500, 1125))
orig = image.copy()
# Create a copy of the original image.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
# Convert the image to grayscale, then apply a Gaussian blur to reduce noise.
edged = cv2.Canny(blurred, 0, 50)
# Use the Canny algorithm for edge detection.
orig_edged = edged.copy()
# Keep a copy of the Canny edge map.
Step 3. Get the approximate contour of the document.
Find the contours in the edge image, keep the largest ones, and look for the document outline.
contours, hierarchy = cv2.findContours(edged, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
# findContours() finds contours in a binary image.
contours = sorted(contours, key=cv2.contourArea, reverse=True)
# Sort the contours by area, largest first.
# Get approximate contours:
for c in contours:
    p = cv2.arcLength(c, True)
    # Calculate the perimeter of the closed contour (or the length of the curve).
    approx = cv2.approxPolyDP(c, 0.02 * p, True)
    # Approximate the contour as a polygon with (0.02 * p) as the precision. Because the contour is a closed curve, the parameter closed is True.
    if len(approx) == 4:
        target = approx
        break
# The first contour that is approximated by four points is the rectangular outline we are looking for.
Step 4. Create a function that puts the four corner points of the target into a consistent order.
ps: The function rectify is stored in resize.py.
def rectify(h):
    h = h.reshape((4, 2))
    hnew = np.zeros((4, 2), dtype=np.float32)
    add = h.sum(1)
    hnew[0] = h[np.argmin(add)]   # top-left: the point with the smallest coordinate sum
    hnew[2] = h[np.argmax(add)]   # bottom-right: the point with the largest coordinate sum
    diff = np.diff(h, axis=1)
    # Calculate the discrete difference (y - x) along axis 1.
    hnew[1] = h[np.argmin(diff)]  # top-right
    hnew[3] = h[np.argmax(diff)]  # bottom-left
    # The four vertices of the detected document are now in a fixed order.
    return hnew
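The corner ordering can be checked on a small hand-made point set (the coordinate values here are made up for illustration, mimicking corners returned in arbitrary order by approxPolyDP):

```python
import numpy as np

def rectify(h):
    # Same logic as resize.rectify: order the corners as
    # [top-left, top-right, bottom-right, bottom-left].
    h = h.reshape((4, 2))
    hnew = np.zeros((4, 2), dtype=np.float32)
    add = h.sum(1)
    hnew[0] = h[np.argmin(add)]   # top-left: smallest x + y
    hnew[2] = h[np.argmax(add)]   # bottom-right: largest x + y
    diff = np.diff(h, axis=1)     # y - x for each point
    hnew[1] = h[np.argmin(diff)]  # top-right: smallest y - x
    hnew[3] = h[np.argmax(diff)]  # bottom-left: largest y - x
    return hnew

pts = np.array([[390, 580], [10, 20], [15, 590], [400, 25]])
print(rectify(pts))
# [[ 10.  20.]   top-left
#  [400.  25.]   top-right
#  [390. 580.]   bottom-right
#  [ 15. 590.]]  bottom-left
```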
approx = resize.rectify(target)
Step 5. Map the target onto a 400 * 600 rectangle with a perspective transformation.
pts2 = np.float32([[0, 0], [400, 0], [400, 600], [0, 600]])
M = cv2.getPerspectiveTransform(approx, pts2)
# Use the getPerspectiveTransform function to obtain the perspective transformation matrix.
# (approx holds the four vertex positions of the quadrilateral in the source image; pts2 holds the four vertex positions in the target image.)
dst = cv2.warpPerspective(orig, M, (400, 600))
# Use the warpPerspective function to apply the perspective transformation to the source image; the output image dst is 400 * 600.
Step 6. Use several different ways to post-process the warped image and obtain the final result.
You can compare the different processing methods below and choose the most suitable one as the final result. The processed images are not shown in this article; if you are interested, try them yourself.
dst = cv2.cvtColor(dst, cv2.COLOR_BGR2GRAY)
# Convert the warped image to grayscale.
cv2.drawContours(image, [target], -1, (0, 255, 0), 2)
# Draw the detected outline on the original image; contourIdx = -1 means draw all contours in the list, the color is green, and the thickness is 2.