Compiling, Installing, and Testing tesseract 4.00.00alpha on Ubuntu 16.04
- Versions 3.05.01 and later have no Linux binary packages, so they must be compiled and installed from source.
    # Install the required components
    sudo apt-get install g++ # or clang++ (presumably)
    sudo apt-get install autoconf automake libtool
    sudo apt-get install autoconf-archive
    sudo apt-get install pkg-config
    sudo apt-get install libpng12-dev
    sudo apt-get install libjpeg8-dev
    sudo apt-get install libtiff5-dev
    sudo apt-get install zlib1g-dev
    sudo apt-get install libicu-dev
    sudo apt-get install libpango1.0-dev
    sudo apt-get install libcairo2-dev
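Before moving on, it can help to confirm that the build tools installed above are actually on the PATH. The following is a minimal sketch and not part of the original steps; the helper name `missing_tools` and the list of commands are assumptions made for illustration (`libtoolize` is the command the `libtool` package is assumed to provide).

```python
import shutil

# Command names expected after the apt-get steps (an assumed list, adjust as needed).
BUILD_TOOLS = ["g++", "autoconf", "automake", "libtoolize", "pkg-config"]


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(BUILD_TOOLS)
    print("missing tools:", ", ".join(missing) if missing else "none")
```

An empty result only says the commands exist; it does not check the `-dev` library packages.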
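- tesseract depends on the Leptonica image library, so build Leptonica before building tesseract. For the version correspondence, see Compiling#linux in the tesseract wiki; 3.05 corresponds to leptonica-1.74.tar.gz.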
    wget http://www.leptonica.org/source/leptonica-1.74.tar.gz
    tar -xvf leptonica-1.74.tar.gz
    cd leptonica-1.74
    ./configure
    make
    sudo make install   # the default prefix /usr/local needs root
    sudo ldconfig       # refresh the shared-library cache
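tesseract's `./configure` is expected to locate Leptonica through pkg-config, so checking right after the install can save a failed build later. This is a minimal sketch, assuming Leptonica registers itself under the pkg-config module name `lept`; the function names are hypothetical. If nothing is reported, `PKG_CONFIG_PATH` may need to include the directory that holds `lept.pc` (probably `/usr/local/lib/pkgconfig` for a default `./configure` prefix).

```python
import subprocess


def parse_version(text):
    """Turn a dotted version string such as '1.74.4' into a tuple of ints.

    Non-numeric parts are skipped.
    """
    return tuple(int(part) for part in text.strip().split(".") if part.isdigit())


def leptonica_version():
    """Ask pkg-config for the Leptonica version; None if it cannot be determined."""
    try:
        result = subprocess.run(
            ["pkg-config", "--modversion", "lept"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:  # pkg-config itself is not installed
        return None
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    found = leptonica_version()
    if found is None:
        print("Leptonica is not visible to pkg-config")
    else:
        status = "ok" if parse_version(found) >= (1, 74) else "older than 1.74"
        print("Leptonica", found, status)
```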
- Compile and install tesseract 4.00.00alpha
    git clone https://github.com/tesseract-ocr/tesseract.git
    cd tesseract
    ./autogen.sh
    ./configure
    make
    sudo make install   # the default prefix /usr/local needs root
    sudo ldconfig       # refresh the shared-library cache
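To confirm that the freshly built binary is the one on the PATH, its version banner can be read back. A minimal sketch, with a hypothetical function name; stderr is merged into stdout on the assumption that some builds print the banner there. If the shared library cannot be loaded, the returned line will be the loader's error message instead of a version.

```python
import subprocess


def tesseract_version_line(command="tesseract"):
    """Return the first line printed by `<command> --version`, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [command, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some builds print the banner to stderr
            universal_newlines=True,
        )
    except OSError:  # the command is not installed or not executable
        return None
    lines = result.stdout.splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print(tesseract_version_line() or "tesseract not found on PATH")
```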
Calling from Python 3
    sudo pip3 install pytesseract
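    from PIL import Image  # Pillow, used here to load the image
    import pytesseract
    im = Image.open('vcode.png')  # placeholder path; the original does not show how im is loaded
    vcode = pytesseract.image_to_string(im, lang='eng', config='--psm 12 --tessdata-dir /tessdata/')  # --psm is the current spelling of -psm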
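The `config` argument is just a string of tesseract command-line options, so it can be assembled from values instead of typed by hand. The following is a minimal sketch; `build_config` is a hypothetical helper, not part of pytesseract. It assumes page segmentation modes run from 0 to 13 and that the tessdata path contains no spaces.

```python
def build_config(psm, tessdata_dir=None):
    """Build a tesseract option string suitable for the `config` argument."""
    if not 0 <= psm <= 13:  # assumed valid range of page segmentation modes
        raise ValueError("psm must be between 0 and 13")
    parts = ["--psm", str(psm)]
    if tessdata_dir is not None:
        parts += ["--tessdata-dir", tessdata_dir]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_config(12, "/tessdata/"))  # prints: --psm 12 --tessdata-dir /tessdata/
```

The result can then be passed as `config=build_config(12, "/tessdata/")` to `pytesseract.image_to_string`.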