Mezzofanti – Install&Run Tutorial

August 20th, 2009

(Android phone – application, how-to)

This tutorial shows how to build and run the Mezzofanti application from the Android SDK:
1. Download and install the Android SDK.
2. Download and install the Android NDK.
(Optional) Read our post “Interfacing c/c++ libraries via JNI, example: tesseract”, which explains how to interface C/C++ with Java.
3. Download the Mezzofanti source code and patches.

Note: from now on, [ndk_path] denotes the directory where the NDK is installed.

4. Create a directory named “tesseract” in [ndk_path]/sources.
5. Copy the Tesseract 2.03 sources into [ndk_path]/sources/tesseract. Copy the files themselves, not their enclosing folder: for example, “Makefile” should sit directly in [ndk_path]/sources/tesseract.

6. Unzip all the files downloaded in step 3 into [ndk_path]/sources/tesseract.
7. In [ndk_path]/apps, make a symbolic link to “mezzofanti_java_code”.
After this step, the directory [ndk_path]/sources/tesseract should look like this:

-rw-r--r-- 1 User None 170 Jul 8 14:38 AUTHORS
-rwxr-xr-x 1 User None 12916 Jul 8 14:38 Android-multiple-libs.mk
-rwxr-xr-x 1 User None 5997 Jul 9 17:40 Android.mk
-rwxr-xr-x 1 User None 71 Jul 9 15:35 Application.mk
-rw-r--r-- 1 User None 17751 Jul 8 14:38 BUILD
-rw-r--r-- 1 User None 1058 Jul 8 14:38 COPYING
-rw-r--r-- 1 User None 3014 Jul 8 14:38 ChangeLog
-rw-r--r-- 1 User None 9236 Jul 8 14:38 INSTALL
-rw-r--r-- 1 User None 1223 Jul 8 14:38 LICENSE
-rw-r--r-- 1 User None 933 Jul 8 14:38 Makefile.am
-rw-r--r-- 1 User None 45 Jul 8 14:38 NEWS
-rw-r--r-- 1 User None 33 Jul 8 14:38 OWNERS
-rw-r--r-- 1 User None 5014 Jul 8 14:38 README
-rw-r--r-- 1 User None 8913 Jul 8 14:38 ReleaseNotes
-rw-r--r-- 1 User None 286 Jul 8 14:38 StdAfx.cpp
-rw-r--r-- 1 User None 778 Jul 8 14:38 StdAfx.h
-rw-r--r-- 1 User None 7309 Jul 8 14:38 acinclude.m4
-rw-r--r-- 1 User None 33298 Jul 8 14:38 aclocal.m4
drwxr-xr-x+ 2 User None 0 Jul 8 14:38 aspirin
drwxr-xr-x+ 2 User None 0 Aug 31 18:35 ccmain
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 ccstruct
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 ccutil
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 classify
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 config
-rwxr-xr-x 1 User None 10209 Jul 8 14:38 configure.ac
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 cutil
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 dict
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 dlltest
drwxr-xr-x+ 2 User None 0 Jul 8 14:38 doc
drwxr-xr-x+ 2 User None 0 Jul 8 14:38 helium
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 image
drwx------+ 3 User None 0 Aug 28 10:28 java
drwx------+ 9 User None 0 Sep 1 17:22 mezzofanti_java_code
-rwx------+ 1 User None 42136 May 29 21:54 liblog.so
-rwxr-xr-x 1 User None 437 Jul 8 14:38 makemoredists
drwxr-xr-x+ 3 User None 0 Jul 8 14:38 neural_networks
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 pageseg
-rwxr-xr-x 1 User None 2140 Jul 8 14:38 runautoconf
drwxr-xr-x+ 6 User None 0 Aug 31 18:39 tessdata
-rw-r--r-- 1 User None 9200 Jul 8 14:38 tessdll.cpp
-rw-r--r-- 1 User None 38429 Jul 8 14:38 tessdll.dsp
-rw-r--r-- 1 User None 5421 Jul 8 14:38 tessdll.h
-rwxr-xr-x 1 User None 159480 Jul 8 14:38 tessdll.vcproj
-rwxr-xr-x 1 User None 38795 Jul 8 14:38 tesseract.dsp
-rwxr-xr-x 1 User None 2117 Jul 8 14:38 tesseract.dsw
-rwxr-xr-x 1 User None 4709 Jul 8 14:38 tesseract.sln
-rw-r--r-- 1 User None 5905 Jul 8 14:38 tesseract.spec
-rwxr-xr-x 1 User None 125592 Jul 8 14:38 tesseract.vcproj
-rw-r--r-- 1 User None 560 Sep 1 17:36 test
drwxr-xr-x+ 3 User None 0 Aug 28 10:28 testing
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 textord
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 training
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 viewer
drwxr-xr-x+ 2 User None 0 Aug 28 10:28 wordrec

8. Patch the Tesseract files with Tesseract_ccmain_patch.zip (available on the project's Google Code page). Also copy the provided Android.mk file over the original one; without it the code will not compile.

9. In Eclipse, import the Mezzofanti project.
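
The directory layout from steps 4 to 7 can be sketched as shell commands. This is only a rough sketch under a few assumptions: NDK_PATH stands in for [ndk_path] and defaults to a scratch directory so nothing real is touched, the copy and unzip of steps 5 and 6 are left out because the post does not name the archives, and the link target is assumed to be the mezzofanti_java_code directory shown in the listing above.

```shell
#!/bin/sh
set -eu

# NDK_PATH stands in for [ndk_path]. It defaults to a scratch directory
# so the sketch can be tried without touching a real NDK install.
NDK_PATH="${NDK_PATH:-$(mktemp -d)}"
mkdir -p "$NDK_PATH/sources" "$NDK_PATH/apps"   # already present in a real NDK

# Step 4: create the tesseract source directory.
mkdir -p "$NDK_PATH/sources/tesseract"

# Steps 5 and 6 (copying Tesseract 2.03, unzipping the Mezzofanti files)
# are omitted. This mkdir only stands in for the unpacked Java project.
mkdir -p "$NDK_PATH/sources/tesseract/mezzofanti_java_code"

# Step 7: link the Java project into [ndk_path]/apps (assumed link target).
ln -s "$NDK_PATH/sources/tesseract/mezzofanti_java_code" \
      "$NDK_PATH/apps/mezzofanti_java_code"

# Show the resulting link.
ls -l "$NDK_PATH/apps"
```

To run it against a real install, set NDK_PATH first, for example `NDK_PATH=/path/to/ndk sh layout.sh` (the script name is hypothetical).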

enjoy ;)

Interfacing c/c++ libraries via JNI, example: tesseract

April 26th, 2009

(Android phone – how-to/example)

If you want to use C/C++ libraries from your Android Java code, you will need to learn a bit about the Android 1.5 NDK, Release 1.

First, install both the Android SDK and the NDK by following the installation steps on the official android-developers website. Note that the NDK ships only a few of the usual C/C++ libraries:
- standard C library: stdlib, stdio, etc.
- math library
- C++ library: cstddef, new, utility, stl_pair.h
- log library
- zlib compression library
Thus, porting arbitrary C/C++ code to Android may not be such an easy task.

After you install the NDK, you will find two demos in the install directory: hello-jni and two-libs. Make sure you understand these two basic demos before moving on. Note: the documentation in /docs is also a must-read, especially ANDROID-MK.TXT, APPLICATION-MK.TXT and OVERVIEW.TXT.

To create your own code, do the following:
1. Place your C/C++ sources in their own subdirectory of sources/.

2. Write an Android.mk file in that subdirectory to describe your sources to the NDK build system.
Android.mk is a small GNU Makefile fragment, so for developers familiar with the Makefile format this should be a piece of cake.
For all available flags, consult docs/ANDROID-MK.TXT.

3. Write an Application.mk file in a subdirectory of apps/ to describe your application and the native sources it needs to the NDK build system. The purpose of Application.mk is to list the native ‘modules’ (i.e. static/shared libraries) your application needs.
A trivial Application.mk file would be:
APP_MODULES :=
APP_PROJECT_PATH :=

APP_MODULES lists the library (or libraries) the application needs in order to run; APP_PROJECT_PATH tells the build system where to find the Java project.
For all available flags, consult docs/APPLICATION-MK.TXT.

4. Build your native code by running “make APP=” followed by your application's name in the top-level NDK directory. Here ‘make’ refers to GNU Make, and the application name is the name of one of the subdirectories of ‘$NDK/apps/’.

To get more information about the build process, add “V=1”.
To force a rebuild of all your sources, add “-B”.
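
Steps 1 to 4 can be sketched as a small shell script that writes the two build files. This is a minimal sketch, not the files used for Tesseract: the names “mymodule” and “myapp”, the single source file mymodule.c and the “project” subdirectory holding the Java project are all hypothetical, and NDK defaults to a scratch directory so a real install is not modified. The variables inside the generated files are the ones described in docs/ANDROID-MK.TXT and docs/APPLICATION-MK.TXT.

```shell
#!/bin/sh
set -eu

# NDK stands in for the NDK install directory. It defaults to a scratch
# directory so the sketch can be tried without touching a real install.
NDK="${NDK:-$(mktemp -d)}"

# Step 1: "mymodule" and "myapp" are hypothetical names for this sketch.
mkdir -p "$NDK/sources/mymodule" "$NDK/apps/myapp"

# Step 2: describe the native sources (one shared library, one C file).
# The quoted 'EOF' keeps the shell from expanding $(...) in the Makefile text.
cat > "$NDK/sources/mymodule/Android.mk" <<'EOF'
LOCAL_PATH := $(call my-dir)

include $(CLEAR_VARS)
LOCAL_MODULE    := mymodule
LOCAL_SRC_FILES := mymodule.c
include $(BUILD_SHARED_LIBRARY)
EOF

# Step 3: name the Java project directory and the modules it needs.
cat > "$NDK/apps/myapp/Application.mk" <<'EOF'
APP_PROJECT_PATH := $(call my-dir)/project
APP_MODULES      := mymodule
EOF

# Step 4, to be run by hand in a real NDK tree:
#   cd "$NDK" && make APP=myapp V=1

# Print the generated Application.mk for inspection.
cat "$NDK/apps/myapp/Application.mk"
```

With these names, the build would be expected to produce libmymodule.so, which the Java side would load with System.loadLibrary and the module name.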

Observation: here is a useful trick if you do not want to keep your C files separate from your Java files, as the NDK layout prefers: use the “ln” command to make a symbolic link to your directory. Syntax: ln -s, followed by the target directory and then the name of the link.

And now for the example: consider Google’s text-recognition project Tesseract, whose sources you can download from here.
Now compile the code as described in the steps above. The developers have already included an Android.mk, but you will need to rewrite it a bit in order to compile with the NDK. The modified version of the Android.mk is here.

Running Tesseract on the Android’s processor may take some time, so it would be nice to split the OCR process into steps and run only some of them when more speed is needed. Tesseract already provides the means to do this; unfortunately, the methods exist only in the “core” and not in the JNI interface, nor anywhere close to it. To be able to do the OCR step by step, use the updated jni.cpp file in our code repository. For it to work, you will also need to update the baseapi.cpp file.

With the Tesseract modifications in place, you just need to build your native code (see step 4). Then you need to integrate it with your Java code.

The Tesseract Java wrapper is encapsulated in OCR.java. All native method declarations are at the end of the class:

// general initialization/cleanup/setup functions
public native void classInitNative(); // init the lib (1st to be called at startup)
public native void initializeNativeDataNative(); // init allocate lib buffers (2nd to be called)
public native boolean openNative(String sLanguage); // init the api with a language

public native void cleanupNativeDataNative(); // delete the lib buffers
public native void clearResultsNative(); // api clear
public native void closeNative(); // api.end, called by the destructor automatically

public native void setVariableNative( // set a lib variable
String var, String value);

// language functions
public static native String[] getLanguagesNative(); // get the language list
public native int getShardsNative( // get the shard of the language
String lang);
public native boolean isValidWord(String word); // is the word valid according to the installed language

// aux functions before OCR
public native void setImageNative( // copy the image to the internal api buffers
byte[] image, int width, int height, int bpp);
public native void setImageNative( // copy the image to the internal api buffers
int[] image, int width, int height,
boolean bBWFilter, boolean bHorizDisp);
public native void releaseImageNative(); // release the internal api buffers
public native void setRectangleNative( // set the rectangle where OCR will focus
int left, int top, int width, int height);
public native void setPageSegModeNative( // set the page segmentation mode
int mode);

// OCR
public native String recognizeNative( // do OCR over the parameter image
byte[] image, int width, int height, int bpp);
public native String recognizeNative(); // do OCR over the image in the api buffers (all ocr)
public native String recognizeNative(int nopass); // do OCR over the image in the api buffers
// (options: 0=all, 1 or 2 passes)

// aux functions, to be run after OCR
public native int meanConfidenceNative(); // mean confidence (last OCR)
public native int[] wordConfidencesNative(); // confidence for each word (last OCR)
public native String getBoxText(); // get the box for each letter

// debug functions
public native void closeDebug(); // clean close the debug (if any)
public native String libVer(); // get the lib version

public int mNativeData; // storage space for the library’s internal buffers

/* this is used to load the ‘ocr’ library on application
* startup. The library has already been unpacked into
* /data/data/com…/lib/libocr.so at installation time by the package manager.
*/
static
{
System.loadLibrary("ocr");
}

Code example: you may find it useful to see the whole implementation in the Mezzofanti application on Google Code. The code is released under the Apache License, version 2.0, so you can use it free of charge in your own code.
