Interfacing C/C++ libraries via JNI, example: Tesseract
(Android phone – how-to/example)
If you want to use C/C++ libraries from your Android Java code, you will need to learn a bit about the Android 1.5 NDK, Release 1.
First, install both the Android SDK and the NDK by following the install steps described on the official Android developers website. Note that the NDK contains only some of the default C libraries:
- standard C library: stdlib, stdio etc
- math library
- C++ library: cstddef, new, utility, stl_pair.h
- log library
- zlib compression library
Thus, porting “any” C/C++ code to Android may not be such an easy task.
After you install the NDK, you will find two demos in the install directory: hello-jni and two-libs. Make sure you understand these two basic demos before moving forward. Note: the documentation is also a must-read. It resides in /docs; see especially ANDROID-MK.TXT, APPLICATION-MK.TXT and OVERVIEW.TXT.
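The hello-jni demo is a good place to see how a Java native method is matched to a C function: the C symbol name is derived from the package, class and method names. The sketch below reproduces that naming rule from the JNI specification in plain Java, which may help when the loader or the linker complains about a missing symbol. JniNames and its methods are hypothetical helpers written only for this illustration, and com.example.OCR is a placeholder class name, not the real package of any project.

```java
/** Hypothetical helper: computes the C symbol name JNI looks up for a native method. */
final class JniNames {
    private JniNames() {}

    /** Escapes one name component following the JNI name-mangling rules. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '.' || c == '/') {
                out.append('_');                 // package separator
            } else if (c == '_') {
                out.append("_1");                // literal underscore
            } else if (c == ';') {
                out.append("_2");                // ';' in a type descriptor
            } else if (c == '[') {
                out.append("_3");                // '[' in a type descriptor
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')) {
                out.append(c);
            } else {
                // any other character becomes _0xxxx (lower-case hex)
                out.append(String.format("_0%04x", (int) c));
            }
        }
        return out.toString();
    }

    /** Short name: enough when the native method is not overloaded. */
    static String shortName(String className, String method) {
        return "Java_" + escape(className) + "_" + escape(method);
    }

    /**
     * Long name: required when a class overloads a native method.
     * argDescriptor is the JNI descriptor of the arguments only,
     * e.g. "I" for (int) or "[BIII" for (byte[], int, int, int).
     */
    static String longName(String className, String method, String argDescriptor) {
        return shortName(className, method) + "__" + escape(argDescriptor);
    }
}
```

For example, JniNames.shortName("com.example.hellojni.HelloJni", "stringFromJNI") gives Java_com_example_hellojni_HelloJni_stringFromJNI, the function name used in the hello-jni sample. The long form only matters when a class declares several native methods with the same name.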
To create your own code, do the following:
1. Place your c/c++ sources under sources/
2. Write sources/
The Android.mk is <
So for developers familiar with Makefile format, this should be a piece of cake.
For all available flags consult docs/ANDROID-MK.TXT
3. Write apps/
A trivial Application.mk file would be:
APP_MODULES :=
APP_PROJECT_PATH :=
APP_PROJECT_PATH tells the build where to find the Java source files; APP_MODULES lists the library (or libraries) the application needs in order to run.
For all available flags consult docs/APPLICATION-MK.TXT
4. Build your native code by running “make APP=
To get more info about the build process use “V=1”
To force rebuild all your sources use “-B”
Observation: a useful trick to know. If you do not want to keep your C files separate from your Java files, as the NDK layout prefers, you can use the “ln” command to make a symbolic link to your directory. Syntax: ln -s
And now for the example: consider Google’s text-recognition project Tesseract; you can download the sources from here.
Now you have to compile the code (as described in the steps above). The developers have already included an Android.mk, but you will need to rewrite it a bit in order to compile with the NDK. The modified version of the Android.mk is here.
Running Tesseract on the Android’s processor may take some time, so it would be useful to split the OCR process into steps and run only some of them when more speed is needed. Tesseract already provides the means to do this; unfortunately, the methods are present only in the “core” and not in the JNI interface, nor close to it. To be able to do the OCR step by step, consider the updated jni.cpp file in our code repository. For jni.cpp to work, you will also need to update the baseapi.cpp file.
With the Tesseract modifications in place, you just need to build your native code (see step 4). Then you need to integrate it with your Java code. The Tesseract Java wrapper is encapsulated in OCR.java. All native method declarations are at the end of the class:
    // general initialization/cleanup/setup functions
    public native void classInitNative(); // init the lib (1st to be called at startup)
    public native void initializeNativeDataNative(); // init allocate lib buffers (2nd to be called)
    public native void cleanupNativeDataNative(); // delete the lib buffers
    public native boolean openNative(String sLanguage); // init the api with a language
    public native void clearResultsNative(); // api clear
    public native void closeNative(); // api.end, called by the destructor automatically
    public native void setVariableNative( // set a lib variable
        String var, String value);

    // language functions
    public static native String[] getLanguagesNative(); // get the language list
    public native int getShardsNative( // get the shard of the language
        String lang);
    public native boolean isValidWord(String word); // is the word valid according to the installed language

    // aux functions before OCR
    public native void setImageNative( // copy the image to the internal api buffers
        byte[] image, int width, int height, int bpp);
    public native void setImageNative( // copy the image to the internal api buffers
        int[] image, int width, int height,
        boolean bBWFilter, boolean bHorizDisp);
    public native void releaseImageNative(); // release the internal api buffers
    public native void setRectangleNative( // set the rectangle where OCR will focus
        int left, int top, int width, int height);
    public native void setPageSegModeNative( // set the page segmentation mode
        int mode);

    // OCR
    public native String recognizeNative( // do OCR over the parameter image
        byte[] image, int width, int height, int bpp);
    public native String recognizeNative(); // do OCR over the image in the api buffers (all ocr)
    public native String recognizeNative(int nopass); // do OCR over the image in the api buffers
                                                      // (options: 0=all, 1 or 2 passes)

    // aux functions, to be run after OCR
    public native int meanConfidenceNative(); // mean confidence (last OCR)
    public native int[] wordConfidencesNative(); // confidence for each word (last OCR)
    public native String getBoxText(); // get the box for each letter

    // debug functions
    public native void closeDebug(); // clean close the debug (if any)
    public native String libVer(); // get the lib version

    public int mNativeData; // storage space for the library's internal buffers

    /* this is used to load the 'ocr' library on application
     * startup. The library has already been unpacked into
     * /data/data/com…/lib/libocr.so at installation time by the package manager.
     */
    static
    {
        System.loadLibrary("ocr");
    }
Code example: you may find it useful to see the whole implementation of this in the Mezzofanti application on Google Code.
The code is released under the Apache License, version 2.0, so you can use it free of charge in your own code.
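The comments on the native declarations hint at a calling order: class init first, buffer allocation second, then a language, an image, the OCR pass and finally the cleanup. Below is a minimal sketch of that order, not the actual Mezzofanti code. OcrEngine is a hypothetical interface that mirrors a subset of the native methods so the sketch can run without libocr.so, RecordingEngine is a stand-in that only logs calls, and the "eng" language code and the bpp value are placeholders. The exact cleanup order is an assumption, and closeNative is left out because the wrapper is said to call it automatically.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface mirroring a subset of the wrapper's native methods. */
interface OcrEngine {
    void classInitNative();
    void initializeNativeDataNative();
    boolean openNative(String language);
    void setImageNative(byte[] image, int width, int height, int bpp);
    String recognizeNative(int nopass);
    int meanConfidenceNative();
    void releaseImageNative();
    void cleanupNativeDataNative();
}

/** Stand-in engine: records the call order and performs no OCR at all. */
final class RecordingEngine implements OcrEngine {
    final List<String> calls = new ArrayList<>();
    public void classInitNative() { calls.add("classInit"); }
    public void initializeNativeDataNative() { calls.add("initializeNativeData"); }
    public boolean openNative(String language) { calls.add("open:" + language); return true; }
    public void setImageNative(byte[] image, int width, int height, int bpp) {
        calls.add("setImage:" + width + "x" + height);
    }
    public String recognizeNative(int nopass) { calls.add("recognize:" + nopass); return "stub text"; }
    public int meanConfidenceNative() { calls.add("meanConfidence"); return 0; }
    public void releaseImageNative() { calls.add("releaseImage"); }
    public void cleanupNativeDataNative() { calls.add("cleanupNativeData"); }
}

final class OcrSession {
    /** Recognized text plus the mean confidence of the last OCR run. */
    static final class Result {
        final String text;
        final int meanConfidence;
        Result(String text, int meanConfidence) {
            this.text = text;
            this.meanConfidence = meanConfidence;
        }
    }

    private OcrSession() {}

    /** Runs one OCR pass; returns null if the language could not be opened. */
    static Result recognize(OcrEngine ocr, String language, byte[] image,
                            int width, int height, int bpp, int nopass) {
        ocr.classInitNative();              // 1st call, per the source comments
        ocr.initializeNativeDataNative();   // 2nd call, allocates the lib buffers
        try {
            if (!ocr.openNative(language)) {
                return null;
            }
            ocr.setImageNative(image, width, height, bpp);
            try {
                String text = ocr.recognizeNative(nopass); // 0=all, 1 or 2 passes
                return new Result(text, ocr.meanConfidenceNative());
            } finally {
                ocr.releaseImageNative();   // always free the image buffers
            }
        } finally {
            ocr.cleanupNativeDataNative();  // always free the lib buffers
        }
    }
}
```

Keeping the release and cleanup calls in finally blocks means the native buffers are freed even if recognition throws, which matters because the garbage collector does not manage memory allocated on the C side.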

Hi,
First of all, thanks for posting your efforts trying (and succeeding) to compile Tesseract with the NDK. I’m trying to do the same thing but using Tesseract 2.04. The thing is that, after compiling each cpp individually, make crashes when trying to link everything into the .so lib. I’m receiving lots of “undefined reference to” errors related to functions and variables that do exist, so I’m assuming that it’s my fault. Any idea what it might be?
Thanks in advance.
Hi Xabier,
probably there’s some library that you also have to link.
Check each “undef ref” and try to figure out where you can find the appropriate functions (eventually google them).
Best regards.
Great write up. I won’t have time to try it for a few weeks, but this is great material.
Xabier, please update here, or link to your work, when you have a chance.
Thank you both.
D