Compiling, installing, and testing tesseract 4.00.00alpha on Ubuntu 16.04
 
- Versions 3.05.01 and later ship no Linux binary packages, so they have to be compiled and installed from source.
```bash
# Install the build dependencies
sudo apt-get install g++ # or clang++ (presumably)
sudo apt-get install autoconf automake libtool
sudo apt-get install autoconf-archive
sudo apt-get install pkg-config
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg8-dev
sudo apt-get install libtiff5-dev
sudo apt-get install zlib1g-dev
sudo apt-get install libicu-dev
sudo apt-get install libpango1.0-dev
sudo apt-get install libcairo2-dev
```
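Before starting the build, it can help to confirm that the tools installed above are actually on `PATH`. The following is a minimal sketch, not part of the original steps: `BUILD_TOOLS` and `missing_tools` are hypothetical names, and the list of executables is an assumption about what the packages provide (on Ubuntu the `libtool` package is expected to ship `libtoolize`).

```python
import shutil

# Executables the packages above are expected to provide (assumed names;
# the libtool package is expected to ship `libtoolize`).
BUILD_TOOLS = ["g++", "autoconf", "automake", "libtoolize", "pkg-config"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that `which` cannot find on PATH, in input order."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(BUILD_TOOLS)
    print("missing:", ", ".join(missing) if missing else "none")
```

The `which` parameter is only there so the lookup can be swapped out when trying the function without touching the real system.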
 
- tesseract depends on the image library Leptonica, so build Leptonica before building tesseract. For the version mapping see Compiling#linux; 3.05 corresponds to leptonica-1.74.tar.gz.
```bash
wget http://www.leptonica.org/source/leptonica-1.74.tar.gz
tar -xvf leptonica-1.74.tar.gz
cd leptonica-1.74
./configure 
make
sudo make install # installing under /usr/local normally needs root
```
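To confirm that the Leptonica build is visible before moving on, one option is to ask pkg-config for it. This is a rough sketch under a few assumptions: that Leptonica registers itself with pkg-config under the module name `lept`, and that its install prefix is on pkg-config's search path. `parse_version` and `leptonica_version` are hypothetical helper names, and the 1.74 threshold is simply the version mentioned above.

```python
import shutil
import subprocess


def parse_version(text):
    """Turn a dotted version such as '1.74.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def leptonica_version():
    """Return the version pkg-config reports for 'lept', or None if unavailable."""
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--modversion", "lept"],  # 'lept' is an assumed module name
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = leptonica_version()
    if version is None:
        print("Leptonica not found via pkg-config")
    else:
        print("Leptonica {} (>= 1.74: {})".format(version, parse_version(version) >= (1, 74)))
```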
 
- Compile and install tesseract 4.00.00alpha
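```bash
git clone https://github.com/tesseract-ocr/tesseract.git
cd tesseract
./autogen.sh
./configure 
make
sudo make install # installing under /usr/local normally needs root
sudo ldconfig # refresh the shared-library cache so libtesseract is found
```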
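A quick way to check the result of the build is to run `tesseract --version` and read back the version string. The sketch below is illustrative only: `parse_tesseract_version` and `installed_tesseract_version` are hypothetical names, and the parser assumes the output contains a line of the form `tesseract <version>`. Standard error is merged into standard output because some builds print the version there.

```python
import shutil
import subprocess


def parse_tesseract_version(output):
    """Return the token after 'tesseract' on the first matching line, or None."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "tesseract":
            return parts[1]
    return None


def installed_tesseract_version():
    """Run `tesseract --version` and return the parsed version, or None."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr, where some builds print the version
        universal_newlines=True,
    )
    return parse_tesseract_version(result.stdout)


if __name__ == "__main__":
    print("tesseract version:", installed_tesseract_version())
```

If this prints `None` even though the binary is on `PATH`, the raw output of `tesseract --version` is worth reading directly, since a shared-library loading error would also end up there.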
 
Calling it from Python 3
```bash
sudo pip3 install pytesseract
```
 
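```python
import pytesseract
from PIL import Image
im = Image.open('captcha.png')  # placeholder path; the original does not show how `im` is loaded
vcode = pytesseract.image_to_string(im, lang='eng', config='--psm 12 --tessdata-dir /tessdata/')  # newer builds expect --psm rather than -psm
```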
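The call depends on a language file being present in the directory passed via `--tessdata-dir`, which is easy to get wrong. Below is a small wrapper sketch that checks for `<lang>.traineddata` first and builds the config string in one place. The helper names (`build_config`, `has_traineddata`, `ocr_image`) are hypothetical, the defaults mirror the values used in this post, and the double-dash `--psm` spelling is an assumption about the installed build.

```python
import os


def build_config(psm, tessdata_dir):
    """Build the config string passed to pytesseract (assumes the --psm spelling)."""
    return "--psm {} --tessdata-dir {}".format(psm, tessdata_dir)


def has_traineddata(tessdata_dir, lang="eng"):
    """Check that <lang>.traineddata exists in the tessdata directory."""
    return os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))


def ocr_image(path, lang="eng", psm=12, tessdata_dir="/tessdata/"):
    """Run OCR on one image file, failing early if the language data is missing."""
    # Imported here so the helpers above stay usable without these packages.
    from PIL import Image
    import pytesseract

    if not has_traineddata(tessdata_dir, lang):
        raise FileNotFoundError(
            "{}.traineddata not found in {}".format(lang, tessdata_dir)
        )
    with Image.open(path) as im:
        return pytesseract.image_to_string(
            im, lang=lang, config=build_config(psm, tessdata_dir)
        )
```

A call would then look like `vcode = ocr_image('captcha.png')`, where the file name is again only a placeholder.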