OpenIMAJ is an award-winning set of libraries and tools for multimedia (images, text, video, audio, etc.) content analysis and content generation. OpenIMAJ is very broad in scope: it contains everything from state-of-the-art computer vision (e.g. SIFT descriptors, salient region detection, face detection) and advanced data clustering, through to software that analyses the content, layout and structure of webpages.
Please see the tutorial: http://openimaj.org/tutorial
Instructions for compiling from source can be found here: https://github.com/openimaj/openimaj/wiki/Compiling-OpenIMAJ-from-Source
Please note that you probably don't want to do this unless you intend to add new features or make changes to the source code. Precompiled jars of release versions are available from Maven Central, and regular snapshots (built automatically whenever code is pushed to the GitHub repository) are available from http://snapshots.openimaj.org