February 22, 2009

Hibernate Search Example

My previous post covered Hibernate Search in general. In this post I will show code examples of using it.

The preferred and easiest way to get started is to use Maven 2. If you are not able to use Maven 2, please download the relevant jar files manually.

<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
        http://maven.apache.org/maven-v4_0_0.xsd">

  <modelVersion>4.0.0</modelVersion>
  <groupId>se.msc</groupId>
  <artifactId>hibernatesearch</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <repositories>
    <repository>
      <id>repository.jboss.org</id>
      <name>JBoss Maven Repository</name>
      <url>http://repository.jboss.org/maven2</url>
      <layout>default</layout>
    </repository>
  </repositories>
  <dependencies>
    <!-- For Test -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.5</version>
      <scope>test</scope>
    </dependency>
    <!-- Hibernate Core -->
    <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-core</artifactId>
      <version>3.3.1.GA</version>
    </dependency>
    <!-- Hibernate Annotation -->
    <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-annotations</artifactId>
      <version>3.4.0.GA</version>
    </dependency>
    <!-- Hibernate Annotation uses SLF4J -->
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>1.5.2</version>
    </dependency>
    <!-- Hibernate EntityManager -->
    <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-entitymanager</artifactId>
      <version>3.4.0.GA</version>
    </dependency>
    <!-- Hibernate Validator
    <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-validator</artifactId>
      <version>3.0.0.ga</version>
    </dependency> -->
    <!-- Hibernate Search -->
    <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-search</artifactId>
      <version>3.1.0.GA</version>
    </dependency>
    <!-- Hibernate Search 3part Lib -->
    <!-- Solr's Analyzer Framework -->
    <dependency>
      <groupId>org.apache.solr</groupId>
      <artifactId>solr-common</artifactId>
      <version>1.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.solr</groupId>
      <artifactId>solr-core</artifactId>
      <version>1.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-snowball</artifactId>
      <version>2.4.0</version>
    </dependency>
    <!-- MySQL JDBC connector -->
    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <version>5.1.6</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

In this example I will be using Hibernate Core, Hibernate Annotations and Hibernate Search. I will not use Hibernate EntityManager, but using an EntityManager instead of a Session is an easy change and will not significantly affect the concept or the code.

Let's start by configuring Hibernate Core with hibernate.cfg.xml.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
    "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
    "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">

<hibernate-configuration>
  <session-factory>
    <property name="hibernate.dialect">org.hibernate.dialect.MySQLInnoDBDialect</property>
    <property name="hibernate.connection.driver_class">com.mysql.jdbc.Driver</property>
    <property name="hibernate.connection.url">
      jdbc:mysql://localhost:3306/test?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=utf-8
    </property>
    <property name="hibernate.connection.username">root</property>
    <property name="hibernate.connection.password"></property>
    <property name="hibernate.hbm2ddl.auto">create-drop</property> <!-- update -->

    <!-- Hibernate Search -->
    <!-- org.hibernate.search.store.FSDirectoryProvider -->
    <!-- org.hibernate.search.store.RAMDirectoryProvider for test -->
    <property name="hibernate.search.default.directory_provider">
      org.hibernate.search.store.RAMDirectoryProvider
    </property>
    <property name="hibernate.search.default.indexBase">
      /home/magnus/tmp/lucene/indexes
    </property>

    <!-- Mapped classes -->
    <mapping class="se.msc.hibernatesearch.domain.Person" />
  </session-factory>
</hibernate-configuration>

Hibernate Search comes with sensible default values, and there are actually only two values that need configuring: the directory provider and the base directory of the index files.

In this example I will use a JUnit test class as the entry point. Since unit tests are run over and over again, I will for consistency use an in-memory index in combination with dropping and re-creating the database schema. This way I always get a clean start whenever I restart the JUnit test.

To make our example complete here is the log4j.properties file.

# Root logger option
log4j.rootLogger=INFO, stdout

# Log native SQL
log4j.logger.org.hibernate.SQL=debug
log4j.logger.org.hibernate.bind=debug

# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} %5p %c{1}:%L - %m%n
To be able to use Hibernate from JUnit we need the popular HibernateUtil helper class.

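import org.hibernate.HibernateException;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.AnnotationConfiguration;

public class HibernateUtil {

    // Built once from hibernate.cfg.xml on the classpath and shared by all callers.
    private static final SessionFactory sessionFactory;

    static {
        try {
            AnnotationConfiguration conf = new AnnotationConfiguration();
            sessionFactory = conf.configure().buildSessionFactory();
        } catch (Throwable ex) {
            throw new ExceptionInInitializerError(ex);
        }
    }

    // Opens a new Session; the caller is responsible for closing it.
    public static Session getSession() throws HibernateException {
        return sessionFactory.openSession();
    }
}
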
Now let's annotate our domain class. For simplicity I will only use one class here.

import java.io.Serializable;
import java.util.Date;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

import org.hibernate.search.annotations.DateBridge;
import org.hibernate.search.annotations.DocumentId;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Index;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Resolution;
import org.hibernate.search.annotations.Store;

// StringUtil and DateUtil are small helper classes that are not listed in this post.
@Entity
@Table(name = "PERSON")
@Indexed
public class Person implements Serializable {

    private static final long serialVersionUID = 1L;

    @Id
    @Column(name = "PERSON_ID", updatable = false)
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @DocumentId
    private Long id = null;

    @Column(name = "FIRSTNAME", nullable = false, length = 250)
    @Field(index = Index.TOKENIZED, store = Store.YES)
    private String firstname = "";

    @Column(name = "BIRTHDATE", nullable = false)
    @Field(index = Index.UN_TOKENIZED, store = Store.YES)
    @DateBridge(resolution = Resolution.DAY)
    private Date birthdate = new Date();

    public Person() {
    }

    public Person(String firstname, String birthdate)
            throws IllegalArgumentException {
        setFirstnameFromInput(firstname);
        setBirthdateFromInput(birthdate);
    }

    public Long getId() {
        return id;
    }

    protected void setId(Long id) {
        this.id = id;
    }

    public String getFirstname() {
        return firstname;
    }

    public void setFirstname(String firstname) {
        this.firstname = StringUtil.setEmptyStringAsNull(firstname);
    }

    public void setFirstnameFromInput(String firstname) {
        this.firstname = StringUtil.setEmptyStringAsNullAndTrim(firstname);
    }

    public Date getBirthdate() {
        return birthdate;
    }

    public void setBirthdate(Date birthdate) {
        this.birthdate = DateUtil.setTodayAsNull(birthdate);
    }

    public void setBirthdateFromInput(String birthdate)
            throws IllegalArgumentException {
        this.birthdate = DateUtil.setTodayAsNullAndParse(birthdate);
    }

    public String toString() {
        return "Person {" + "id=" + id + ", firstname='" + firstname
                + "', birthdate='" + DateUtil.format(birthdate) + "'}";
    }
}

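The Person class calls two helper classes, StringUtil and DateUtil, whose source is not part of this post. The following is only a guess at what they might look like, based on the method names and on the yyyy-MM-dd strings used in the test data. The method bodies, the date pattern and the choice of the default time zone are all assumptions.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper: the method names match the calls in Person, the bodies are guesses.
final class StringUtil {

    private StringUtil() {
    }

    // Returns "" when the value is null, otherwise the value unchanged.
    static String setEmptyStringAsNull(String value) {
        return value == null ? "" : value;
    }

    // Same as above, but also trims surrounding whitespace from user input.
    static String setEmptyStringAsNullAndTrim(String value) {
        return value == null ? "" : value.trim();
    }
}

// Hypothetical helper: assumes dates are entered and printed as yyyy-MM-dd.
final class DateUtil {

    private static final String PATTERN = "yyyy-MM-dd";

    private DateUtil() {
    }

    // SimpleDateFormat is not thread-safe, so a new instance is created per call.
    private static SimpleDateFormat newFormat() {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false);
        return format;
    }

    // Returns the current date when the value is null, otherwise the value unchanged.
    static Date setTodayAsNull(Date value) {
        return value == null ? new Date() : value;
    }

    // Parses user input; null or blank input falls back to the current date.
    static Date setTodayAsNullAndParse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return new Date();
        }
        try {
            return newFormat().parse(value.trim());
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                    "Expected a date like 1974-01-01 but got: " + value, e);
        }
    }

    // Formats a date for display; null becomes an empty string.
    static String format(Date value) {
        return value == null ? "" : newFormat().format(value);
    }
}
```

In a real project each class would be public and live in its own file so that Person can reach it from another package.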
There is not much to the mapping. We use @Indexed to mark the class as searchable, @DocumentId for the primary key, @Field for simple properties and @DateBridge for properties that need transformation. Remember that Lucene only works with strings.
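Because Lucene only stores strings, the date bridge has to turn each Date into text. To my knowledge Hibernate Search encodes dates in GMT, and with Resolution.DAY only the yyyyMMdd part is kept, which is why the tests further down search for values such as 19720303. The sketch below imitates that encoding with plain JDK classes. It is an illustration and not the bridge's actual implementation; the class and method names are made up.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Illustration only: mimics a day-resolution, GMT-based date encoding.
class DayResolutionSketch {

    // Formats the instant as yyyyMMdd in GMT, dropping the time of day.
    static String toIndexedString(Date date) {
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) throws Exception {
        SimpleDateFormat input = new SimpleDateFormat("yyyy-MM-dd HH:mm");

        input.setTimeZone(TimeZone.getTimeZone("GMT"));
        Date noonGmt = input.parse("1972-03-03 12:00");
        System.out.println(toIndexedString(noonGmt)); // prints 19720303

        // Midnight in Stockholm (GMT+1 in winter) is still the previous day in GMT.
        input.setTimeZone(TimeZone.getTimeZone("Europe/Stockholm"));
        Date midnightStockholm = input.parse("1972-03-03 00:00");
        System.out.println(toIndexedString(midnightStockholm)); // prints 19720302
    }
}
```

If a date is parsed at local midnight in a time zone east of GMT, the indexed day can end up one day earlier than expected. That is one possible reason for a date query that unexpectedly returns no hits.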

All Hibernate Search annotations are documented in the org.hibernate.search.annotations package.

Only two annotation attributes need further explanation:

index (Index.TOKENIZED | Index.UN_TOKENIZED)
Tokenized splits the text into words (tokens) and removes insignificant words.

Untokenized leaves the text unchanged.
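To make the difference between the two index settings concrete, here is a toy tokenizer in plain Java. It is not Lucene's analyzer, only a rough imitation of what an analyzer such as StandardAnalyzer does: lower-casing, splitting into words and dropping stop words. The class name, the method names and the stop word list are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy illustration of TOKENIZED versus UN_TOKENIZED; not Lucene's analyzer.
class TokenizeSketch {

    // A tiny, made-up stop word list; real analyzers ship their own.
    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "and", "the", "of", "is"));

    // TOKENIZED: lower-case, split on anything that is not a letter or digit, drop stop words.
    static List<String> tokenized(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                tokens.add(word);
            }
        }
        return tokens;
    }

    // UN_TOKENIZED: the whole value is kept as one single term.
    static List<String> unTokenized(String text) {
        return Arrays.asList(text);
    }

    public static void main(String[] args) {
        String text = "The History of Lucene";
        System.out.println(tokenized(text));   // prints [history, lucene]
        System.out.println(unTokenized(text)); // prints [The History of Lucene]
    }
}
```

A tokenized field can therefore be matched word by word, while an untokenized field only matches on its exact, complete value. That makes untokenized a natural fit for identifiers and encoded dates.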

store (Store.YES | Store.NO)
Both options index the field, but Store.YES also writes the original value to the Lucene index file, which makes it visible in Luke. The main difference is that you can then use projection, which means you can avoid touching the database at all. That is the benefit we are looking for when writing a high-speed search application. The main drawback of projection is that raw Object[] rows holding the stored field values are returned instead of domain object graphs.

Using Store.YES should be the preferred way whenever you want high performance. If you need to manipulate the object further, simply do a database round trip and load the persisted domain object via its primary key.

Another drawback of projection is that it only works for simple properties and one-to-one (embedded) objects, but not for to-many relations. This is due to differences in the object model between Lucene and Hibernate.
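With projection, each hit comes back as an Object[] whose elements follow the order of the property names passed to setProjection. The sketch below shows one way to copy such rows into a small read-only view class. The rows are built by hand so that the example runs without Hibernate or a database. PersonHit and toHits are made-up names, and the element types (Long, String, Date) are an assumption about what the field bridges hand back.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical read-only view of one projected hit (not part of the example project).
class PersonHit {

    final Long id;
    final String firstname;
    final Date birthdate;

    PersonHit(Long id, String firstname, Date birthdate) {
        this.id = id;
        this.firstname = firstname;
        this.birthdate = birthdate;
    }

    // Converts rows projected as ("id", "firstname", "birthdate") into PersonHit objects.
    static List<PersonHit> toHits(List<Object[]> rows) {
        List<PersonHit> hits = new ArrayList<PersonHit>();
        for (Object[] row : rows) {
            hits.add(new PersonHit((Long) row[0], (String) row[1], (Date) row[2]));
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hand-built rows standing in for the result of a projection query.
        List<Object[]> rows = new ArrayList<Object[]>();
        rows.add(new Object[] { 1L, "Magnus", new Date(0L) });
        rows.add(new Object[] { 2L, "Bertil", new Date(0L) });

        for (PersonHit hit : toHits(rows)) {
            System.out.println(hit.id + " " + hit.firstname);
        }
    }
}
```

When the full entity is needed after all, the projected id is enough to load it with a normal Session lookup.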

import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.hibernate.HibernateException;
import org.hibernate.Session;
import org.hibernate.Transaction;
import org.hibernate.search.FullTextQuery;
import org.hibernate.search.FullTextSession;
import org.hibernate.search.Search;

public class HibernateTemplate {

    // Runs the callback inside a transaction and always closes the session.
    public Object execute(HibernateCallback action) throws HibernateException {
        Session session = null;
        Transaction tx = null;
        Object object = null;
        try {
            session = HibernateUtil.getSession();
            tx = session.getTransaction();
            tx.begin();
            object = action.execute(session);
            tx.commit();
        } catch (RuntimeException e) {
            // HibernateException is a RuntimeException; roll back on any unchecked failure.
            if (tx != null && tx.isActive())
                tx.rollback();
            throw e;
        } finally {
            if (session != null)
                session.close();
        }
        return object;
    }

    public void saveOrUpdate(final Object entity) throws HibernateException {
        execute(new HibernateCallback() {

            @Override
            public Object execute(Session session) throws HibernateException {
                session.saveOrUpdate(entity);
                return null;
            }
        });
    }

    // Plain HQL query that goes to the database.
    public List find(final String query) throws HibernateException {
        return (List) execute(new HibernateCallback() {

            @Override
            public Object execute(Session session) throws HibernateException {
                return session.createQuery(query).list();
            }
        });
    }

    // Full-text query: hits come from Lucene, entities are loaded from the database.
    public List findWithFullText(String query, String field,
            final Class entity) throws HibernateException, ParseException {

        QueryParser parser = new QueryParser(field, new StandardAnalyzer());
        final org.apache.lucene.search.Query lucQuery = parser.parse(query);

        return (List) execute(new HibernateCallback() {

            @Override
            public Object execute(Session session) throws HibernateException {
                FullTextSession ftSess = Search.getFullTextSession(session);
                return ftSess.createFullTextQuery(lucQuery, entity).list();
            }
        });

    }

    // Full-text query with projection: returns Object[] rows read from the index.
    public List findWithFullTextAndProjection(String query, String field,
            final Class entity) throws HibernateException, ParseException {

        QueryParser parser = new QueryParser(field, new StandardAnalyzer());
        final org.apache.lucene.search.Query lucQuery = parser.parse(query);

        return (List) execute(new HibernateCallback() {

            @Override
            public Object execute(Session session) throws HibernateException {
                FullTextSession fTS = Search.getFullTextSession(session);
                FullTextQuery fTQ = fTS.createFullTextQuery(lucQuery, entity);
                fTQ.setProjection("id", "firstname", "birthdate");
                return fTQ.list();
            }
        });

    }
}

import org.hibernate.HibernateException;
import org.hibernate.Session;

public interface HibernateCallback {

    public Object execute(Session session) throws HibernateException;
}

import java.util.List;

import org.junit.Assert;
import org.junit.BeforeClass;
import org.junit.Test;

public class SessionTest {

    private static final Person[] INIT_DATA = new Person[] {
            new Person("Magnus", "1974-01-01"),
            new Person("Bertil", "1973-02-02"),
            new Person("Klara", "1972-03-03") };

    private static final String FIELD = "firstname";

    private static final Class ENTITY = Person.class;

    private static final HibernateTemplate temp = new HibernateTemplate();

    private void printPersonResult(Person[] persons) {
        System.out.println("Number of hits: " + persons.length);
        for (Person person : persons) {
            System.out.println(person);
        }
    }

    private void printObjectResult(List res) {
        System.out.println("Number of hits: " + res.size());
        for (Object row : res) {
            Object[] objects = (Object[]) row;
            for (Object o : objects)
                System.out.println(o);
        }
    }

    // JUnit does not guarantee the order of test methods,
    // so the test data is saved once before any test runs.
    @BeforeClass
    public static void saveAll() throws Exception {
        for (Person person : INIT_DATA)
            temp.saveOrUpdate(person);
    }

    // The (Person[]) casts are needed because toArray on a raw List returns Object[].
    @Test
    public void testFindAll() throws Exception {
        List res = temp.find("from Person");
        Assert.assertEquals("Testing find all.", 3, res.size());
        printPersonResult((Person[]) res.toArray(new Person[0]));
    }

    @Test
    public void testFullText() throws Exception {
        String query = "firstname:Magnus";
        List res = temp.findWithFullText(query, FIELD, ENTITY);
        Assert.assertEquals("Testing firstname search.", 1, res.size());
        printPersonResult((Person[]) res.toArray(new Person[0]));
    }

    @Test
    public void testFullText2() throws Exception {
        String query = "birthdate:19720303";
        List res = temp.findWithFullText(query, FIELD, ENTITY);
        Assert.assertEquals("Testing birthdate search.", 1, res.size());
        printPersonResult((Person[]) res.toArray(new Person[0]));
    }

    @Test
    public void testFullTextProjection() throws Exception {
        String query = "firstname:Magnus OR birthdate:19730202";
        List res = temp.findWithFullTextAndProjection(query, FIELD, ENTITY);
        Assert.assertEquals("Testing search projection.", 2, res.size());
        printObjectResult(res);
    }
}

7 comments:

Hardy Ferentschik said...

Nice summary :) BTW, @DocumentId should not be required anymore.

Magnus K Karlsson said...

Thats true, but I kept it there for clarity, to point out which fields are stored in Lucene.

Rahi said...

It's nice example that one i am searchin from very long time.
Thank you Sir for this example.
-----Rahi Akela-----

Anonymous said...

Thank you for sharing such a nice example. I would like to know why the last 2 test cases (for date search) are failing for me. Has anyone faced the problem?

plasticmanplus said...

Hi..
I have a problem on StringUtil and DateUtil from which Class i should import it?
Thank You

plasticmanplus said...

Hi Magnus..
I'm a beginner in java, last comment i said that i was so confused about the DateUtil Class.
Well after it, i try to create a DateUtil class, and your code was work well.
Thank You for the share..

Tomi