Introduction To Java Persistence
This is the blog representation of the presentation I gave at CodepaLOUsa in March of 2011. The presentation included some basic introduction and history and some code examples which are shown below.
The Long Road To JPA
Java has long had database capabilities. We started with JDBC, which is excellent for working directly with the database. That is a nice thing to have if you are a DBA; but as programmers, we need data . . . not databases. So, while JDBC allowed us to get data from the database, we had to manipulate that data ourselves to put it into usable objects and business logic code. This is tedious and time-consuming.
Along came Enterprise Java Beans (EJB). EJB allowed us to map database information straight to Java objects. This was a revolution, but unfortunately it had drawbacks which limited its adoption. Primarily, EJB was difficult to configure because of "XML Hell". Secondly, EJB did not, at least in its early versions, allow us to define relationships between Entities.
To meet the programming community's need to handle data in a more sane manner, many vendors started creating the predecessors to JPA. These implementations included products like Hibernate and TopLink. These tools handled mapping Plain Old Java Objects (POJOs) to database entities. They also allowed the objects to have relationships mapped between each other to handle table joins. The downside of these early implementations was that they were still difficult to configure and presented the programmer with more "XML Hell".
The Java community kept demanding better solutions for these problems, and the EJB3 specification finally introduced the Java Persistence API (JPA). JPA can be used with little or no XML configuration, it uses Annotations to simplify defining entities, and those same annotations can create entity relationship mappings in a quick and efficient manner.
What Problems Can JPA Help Me To Solve?
Some of the challenges in writing applications around databases can be mitigated by using JPA. When queries use bound parameters, JPA passes the values to the database separately from the query text, which helps prevent SQL injection attacks. JPA can also have clustering, caching, and connection pools plugged in without any changes to the business logic code. Between the caching and the annotations, it can reduce both development time and runtime overhead, improving your time to market for applications.
Definitions
Bean - A Java object which has (typically) private fields/properties and uses getters and setters to manipulate/access those values.
Annotation - A method of adding configuration/compiler meta-data to a Class definition.
Entity - A special type of Bean which has either external (XML configuration) or internal (Annotations) metadata making it possible for JPA to manage the Bean.
POJO - A "Plain Old Java Object". A class which is not special in structure, content or compilation.
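To make the Bean and Annotation definitions above a little more concrete, here is a minimal sketch in plain Java with no JPA on the classpath. The @Stored annotation, the PersonBean class, and the mappedColumns helper are hypothetical names invented for this illustration; @Stored merely stands in for the kind of metadata that a real JPA annotation carries, and the reflection loop hints at how a framework might read that metadata.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class AnnotationSketch {

    // Hypothetical annotation for this sketch; it is NOT part of JPA.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Stored {
        String column();
    }

    // A plain bean: private fields with public getters and setters.
    static class PersonBean {
        @Stored(column = "forename")
        private String name;

        // No annotation here, so the reader below skips this field.
        private String nickname;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getNickname() { return nickname; }
        public void setNickname(String nickname) { this.nickname = nickname; }
    }

    // Collects "field -> column" pairs from the annotation metadata.
    static List<String> mappedColumns(Class<?> beanClass) {
        List<String> columns = new ArrayList<>();
        for (Field field : beanClass.getDeclaredFields()) {
            Stored stored = field.getAnnotation(Stored.class);
            if (stored != null) {
                columns.add(field.getName() + " -> " + stored.column());
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(mappedColumns(PersonBean.class));
    }
}
```

An Entity, in these terms, is simply a Bean whose metadata is rich enough for the JPA provider to manage it.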
What Does JPA Do?
There are a few things that JPA does to make the life of the programmer much simpler. Primarily, it automates many of the traditional tasks involved in storing and retrieving information from a database. Secondly, it can improve how data is accessed through caching, connection pooling, and abstraction of the underlying SQL queries, so that the code is database agnostic without any specialized coding. Additionally, the "detached" entity concept makes data more mobile between disparate machines while keeping it tied to a database: a record can be manipulated, passed around between systems, and still persisted to the underlying database.
The Basics Of JPA Entities
What is an Entity? Well, in its simplest form an Entity is just a Bean class, with private properties and public getter/setter methods, that has a few small annotations added. Please see an example of a basic Entity below.

    import java.io.Serializable;
    import javax.persistence.*;

    @Entity
    @Table(name="mytable")
    public class MyEntity implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        @GeneratedValue(strategy=GenerationType.IDENTITY)
        @Column(unique=true, nullable=false)
        private int index = 0;

        @Column(name="mydatacolumn")
        private String myDataColumn = null;

        // Getters & Setters omitted . . .
    }

@Entity - This annotation tells the JPA provider that this class is an Entity and should be managed by the Persistence Context.
@Table - This annotation allows us to specify some additional details about the underlying table, such as its name.
@Id - An annotation used to indicate that the annotated field is to be used as the unique key on this entity. It is also possible to create an @EmbeddedId which indicates that the fields in the specified embedded class are to be used as a composite key.
@GeneratedValue - This is how we can specify the method to be used when generating the primary key value.
@Column - Optional annotation which can be used to more specifically control how a field is represented in the database tables.
Configuring JPA
Configuration of JPA is much simpler than that of its predecessors and more flexible in implementation options. You can use the persistence.xml file, a properties file, or configure the settings within the program. Each of these options has its merits and limitations, so it's up to you to determine what works best in your projects. In addition to these methods, many containers (like Spring and JBoss) can supply the configuration to their applications, so that the application itself needs little or none.
The Persistence Context
An oft-misunderstood concept, the Persistence Context is, at its simplest level, a cache layer between your application and the underlying database. So, when you modify an object that is managed by JPA, the change will not necessarily be written to the database immediately. Conversely, when you read a managed object, it may already be held in memory, and thus no request to the database is needed. For database-intensive applications, this can provide significant performance improvements.
When using JPA, you reach a Persistence Context through an EntityManager, which in turn comes from an EntityManagerFactory. When working within a container (JBoss/Spring), this could be done for you behind the scenes. There are two ways that transactions for a Persistence Context can be handled: JTA and Resource Local. JTA transactions are typically used in containers, where the server application manages the Persistence Context and its transactions. Resource Local means that the programmer is responsible for starting and stopping transactions and managing the state of entities (attaching/detaching). Each option has its place, depending on the application being developed. Outside of a container (plain Java SE), JPA defaults to Resource Local when no other option is specified; inside a Java EE container the default is JTA.
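The caching role of the Persistence Context can be illustrated without JPA at all. The sketch below is a toy model, not how any JPA provider is actually implemented: ToyContext and its find method are hypothetical names, and the "database" is just a function that builds a new object on every call. It only shows the idea that a second lookup for the same id returns the already-managed instance instead of going back to the database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class FirstLevelCacheSketch {

    // Toy stand-in for a persistence context (hypothetical, not a JPA class).
    static class ToyContext<K, V> {
        private final Map<K, V> managed = new HashMap<>();
        private final Function<K, V> database;   // pretend database lookup
        private int databaseHits = 0;

        ToyContext(Function<K, V> database) {
            this.database = database;
        }

        // Returns the managed instance if there is one, otherwise "queries" the database.
        V find(K id) {
            V cached = managed.get(id);
            if (cached != null) {
                return cached;
            }
            databaseHits++;
            V loaded = database.apply(id);
            if (loaded != null) {
                managed.put(id, loaded);
            }
            return loaded;
        }

        int getDatabaseHits() {
            return databaseHits;
        }
    }

    public static void main(String[] args) {
        ToyContext<Integer, StringBuilder> context =
                new ToyContext<>(id -> new StringBuilder("person-" + id));

        StringBuilder first = context.find(1);
        StringBuilder second = context.find(1);

        // Both lookups yield the same managed instance, and only the first
        // one reached the pretend database.
        System.out.println(first == second);
        System.out.println(context.getDatabaseHits());
    }
}
```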
Once you have an EntityManagerFactory, you can start creating and managing entities. This is done using an EntityManager instance. The EntityManager can be thought of as somewhat analogous to a Connection in JDBC terms, or a Session in Hibernate (but much more). You can use the EntityManager to query for information from the database, and you can use it to persist new data into the database. Beyond the basic capabilities, an EntityManager keeps track of all managed objects, and if they are changed within the Persistence Context it will persist those changes back to the storage system . . . Automatically! For example, if I use the EntityManager to pull a Person object from my database and subsequently make changes to that object, then as long as I am still within the boundaries of the Persistence Context, JPA will store those changes when the transaction commits (or the context is flushed) without my having to call a method to trigger it.
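One way to picture the automatic change tracking described above is a snapshot comparison at commit time. The following is a deliberately simplified sketch in plain Java; ToyUnitOfWork, manage, and commit are hypothetical names, and real JPA providers use more elaborate techniques. It assumes a Person with a single mutable name field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DirtyCheckingSketch {

    static class Person {
        final int id;
        String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Toy unit of work: remembers the name each Person had when it became managed.
    static class ToyUnitOfWork {
        private final Map<Person, String> snapshots = new HashMap<>();

        void manage(Person person) {
            snapshots.put(person, person.name);
        }

        // Compares every managed object with its snapshot and reports the
        // changes that a real provider would write to the database.
        List<String> commit() {
            List<String> changes = new ArrayList<>();
            for (Map.Entry<Person, String> entry : snapshots.entrySet()) {
                Person person = entry.getKey();
                if (!person.name.equals(entry.getValue())) {
                    changes.add("Person " + person.id + ": name -> " + person.name);
                    entry.setValue(person.name);   // refresh the snapshot
                }
            }
            return changes;
        }
    }

    public static void main(String[] args) {
        ToyUnitOfWork unitOfWork = new ToyUnitOfWork();
        Person person = new Person(1, "Deven");
        unitOfWork.manage(person);

        person.name = "Devin";   // no explicit "save" call anywhere

        System.out.println(unitOfWork.commit());
    }
}
```

The point of the sketch is that the calling code only changes the object; detecting and writing the change is the job of whatever manages it.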
JPA-QL, The JPA Query Language
If you have had to use multiple database engines from time to time, then you are aware that not all SQL implementations are created equal. In many cases, this meant having to rewrite large amounts of code in order to move from one database server to another. With JPA-QL, that is largely no longer the case. JPA-QL is an abstraction layer over SQL. It retains much of the flexibility (and some of the complexity) of SQL, but it is platform agnostic, leaving the JPA implementation to use the correct SQL dialect under the hood. This allows an application to be written once; by merely changing some of the JPA configuration options, we can switch from one database platform to another.
JPA queries can be much simpler than their SQL counterparts. In most cases, to access a list of results from a table, the programmer only has to write a short query like "SELECT p FROM Person p" (Hibernate also accepts the shorter "From Person") and they will get all results from the table that the "Person" entity refers to. In addition, you can add a "WHERE" clause with named parameters, shown here for a Person entity that has a "forename" property:

    EntityManager mgr = myEntityManagerFactory.createEntityManager();
    mgr.getTransaction().begin();
    // Named parameters are bound with setParameter(), never concatenated into the query text
    List<Person> peopleNamedDeven = mgr
            .createQuery("SELECT p FROM Person p WHERE p.forename = :foreName", Person.class)
            .setParameter("foreName", "Deven")
            .getResultList();
    // Do something with these objects
    mgr.getTransaction().commit();

Entity Relationships
Database joins across multiple tables are a special case, and are handled in an intuitive manner within JPA. When you create an entity, you can use annotations to indicate that one entity is related to another entity. Once these Object Relational Mappings (ORM) are in place, calling the associated getters is all it takes to perform a join and get the results. Here's an example:
Person.java

    import java.io.Serializable;
    import javax.persistence.*;

    @Entity
    @Table(name="people")
    public class Person implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        @GeneratedValue(strategy=GenerationType.IDENTITY)
        private int index = 0;

        private String name = null;
        private int age = 0;

        // Getters and Setters omitted . . .
    }

Company.java

    import java.io.Serializable;
    import java.util.List;
    import javax.persistence.*;

    @Entity
    @Table(name="companies")
    public class Company implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        @GeneratedValue(strategy=GenerationType.IDENTITY)
        private int index = 0;

        private String name = null;

        // JPA expects a collection interface (List, Set, Collection) here, not ArrayList
        @OneToMany
        private List<Person> employees = null;

        // Getters and Setters omitted . . .
    }

JPALogic.java

    import javax.persistence.EntityManager;
    import javax.persistence.Persistence;

    public class JPALogic {
        public void manipulateRelatedObjects() {
            // Get an EntityManager for the "JPAExample" persistence unit
            EntityManager mgr = Persistence.createEntityManagerFactory("JPAExample").createEntityManager();
            // Start a new transaction block
            mgr.getTransaction().begin();
            // Use a JPA-QL query to fetch an instance of the Company class
            Company aCompany = mgr
                    .createQuery("SELECT c FROM Company c WHERE c.index = :pk", Company.class)
                    .setParameter("pk", 1)
                    .getSingleResult();
            // Use the getter from Company to retrieve the list of employees related
            // to this company, and then grab the first item in the list
            Person anEmployee = aCompany.getEmployees().get(0);
            // Do something with aCompany and/or anEmployee
            // Commit the transaction
            mgr.getTransaction().commit();
        }
    }

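Let's analyze the listings above. The first one, Person.java, is very straightforward. It is a simple entity to represent a person. I used the "@Table" annotation to specify the name which should be used for the table in the database. I could also do other things in that annotation, such as specify unique constraints. So, little about that class needs to be explained. Do take note, though, that the entities here implement the Serializable interface. JPA itself only requires this when detached entity instances are passed by value (for example, across a remote interface), but it is a common habit and does no harm. Also be aware that "index" is a reserved word in several SQL dialects and in JPA-QL, so a name like "id" is the safer choice in real projects.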
The next class, Company, has a more interesting annotation for us to understand: "@OneToMany". It specifies that there is a relationship between this entity and the entity type held in the annotated property, in this case a collection of "Person" objects. (JPA expects the property to be declared with a collection interface such as List, Set, or Collection rather than a concrete class like ArrayList.) Without any arguments, a uni-directional @OneToMany is mapped through a join table, by default named after the two entities' tables ("companies_people" here), which holds foreign keys to both sides. If you would rather have a foreign key column in the "people" table that references the primary key of the "companies" table, add a @JoinColumn annotation, which also lets you control how the relationship is created in the database schema:

    @OneToMany
    @JoinColumn(name="company_id", referencedColumnName="index", nullable=true)
    private List<Person> employees = null;

This causes the foreign key column in the "people" table to be named "company_id", and it allows that column to be null so that a Person does not have to belong to a Company. When creating these sorts of relationships, the Hibernate implementation will also use as much information as it can to create indexes and constraints, which can improve database responsiveness.
With the above classes, we implemented a uni-directional relationship. That means that you could grab an instance of a Company and use it to get the associated Person objects, but not vice-versa. To make the relationship bi-directional, two small changes are required: Person gets a @ManyToOne property, and the @OneToMany in Company points back at it with mappedBy.
Person.java

    import java.io.Serializable;
    import javax.persistence.*;

    @Entity
    @Table(name="people")
    public class Person implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        @GeneratedValue(strategy=GenerationType.IDENTITY)
        private int index = 0;

        private String name = null;
        private int age = 0;

        // Owning side: the foreign key column lives in the "people" table
        @ManyToOne(optional=true)
        @JoinColumn(name="company_id", nullable=true)
        private Company employer = null;

        // Getters and Setters omitted . . .
    }

Company.java (changed field only)

    @OneToMany(mappedBy="employer")
    private List<Person> employees = null;

We add the @ManyToOne annotation to the Person class, which makes Person the owning side and places the "company_id" foreign key in the "people" table. The mappedBy attribute on the Company side tells the JPA implementation that the relationship is already mapped by the "employer" property in Person. It's just that simple!
Inheritance And Polymorphism In JPA
As object-oriented programmers, we are all familiar with the concepts of inheritance and polymorphism. JPA supports these concepts in a very intuitive manner. An example would be keeping a table of animals in your database. Many animals share common traits, but some have unique attributes as well. So, we can define an entity called "Animal" which contains the common attributes of all animals. Once that entity is created, we can extend it with more specific entities like Cat or Dog or Bird, and then put attributes specific to those species into the subclasses. By default, when you do this, JPA will create a single table for all animals, with columns to store the attributes of all inheriting classes. You can use @Column annotations so that compatible fields share the same column:
Bird.java

    @Entity
    public class Bird extends Animal implements Serializable {
        private static final long serialVersionUID = 1L;

        @Column(name="color")
        private String featherColor = null;

        // Getters and Setters omitted . . .
    }

Cat.java

    @Entity
    public class Cat extends Animal implements Serializable {
        private static final long serialVersionUID = 1L;

        @Column(name="color")
        private String furColor = null;

        // Getters and Setters omitted . . .
    }

In the Bird and Cat classes above, we have different properties of featherColor and furColor, but since they are compatible types, we can use the same column in the database and reduce the complexity. But, when we query using JPA and cast the result to one of these classes, the fields and accessors will be specific to that class.
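To show what sharing a single table means in practice, here is a toy sketch in plain Java of how one row could be turned into the right subclass. It assumes the JPA default discriminator column name, "DTYPE", whose value defaults to the entity name; the fromRow helper and the use of a Map as a "row" are inventions for this illustration, and a real provider does this work for you.

```java
import java.util.HashMap;
import java.util.Map;

class SingleTableSketch {

    static abstract class Animal {
        String name;
    }

    static class Bird extends Animal {
        String featherColor;
    }

    static class Cat extends Animal {
        String furColor;
    }

    // Turns one row of the shared table into the matching subclass.
    // Both subclasses read their color from the same "color" column.
    static Animal fromRow(Map<String, String> row) {
        String type = row.get("DTYPE");
        Animal animal;
        if ("Bird".equals(type)) {
            Bird bird = new Bird();
            bird.featherColor = row.get("color");
            animal = bird;
        } else if ("Cat".equals(type)) {
            Cat cat = new Cat();
            cat.furColor = row.get("color");
            animal = cat;
        } else {
            throw new IllegalArgumentException("Unknown DTYPE: " + type);
        }
        animal.name = row.get("name");
        return animal;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("DTYPE", "Cat");
        row.put("name", "Tom");
        row.put("color", "grey");

        Animal animal = fromRow(row);
        if (animal instanceof Cat) {
            System.out.println(animal.name + " has " + ((Cat) animal).furColor + " fur");
        }
    }
}
```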
Embedded Classes and Composite Keys
JPA would not be the amazing leap ahead that it is without some of the innovations that it provides for reducing and reusing code. One such feature is the concept of "Embedded" classes and keys. Instead of creating a series of fields to store addresses for various types of entities (Companies, People, and Customers all have addresses, right?), we can create an @Embeddable class and add it to each entity as an @Embedded property, making those address fields available to multiple entities without having to rewrite the same boilerplate code. See the examples below:
Address.java

    @Embeddable
    public class Address implements Serializable {
        private static final long serialVersionUID = 1L;

        private String streetAddr1 = null;
        private String streetAddr2 = null;
        private String city = null;

        @Column(length=2)
        private String state = null;

        @Column(length=24)
        private String postalCode = null;
    }

Company.java

    @Entity
    public class Company implements Serializable {
        private static final long serialVersionUID = 1L;

        @Id
        private int index = 0;

        @Embedded
        private Address address = null;

        // Getters and Setters omitted . . .
    }

As you can see in the above example code, we could embed an Address object into as many entities as we like and the appropriate fields would be added to your tables without any additional code.
It is often the case, especially with legacy database schemas, that we do not have a single primary key field to use for accessing records. In my experience, I have seen tables which use up to 8 fields to describe a unique record!! Don't worry, JPA can handle this with the @EmbeddedId annotation. An embedded id is an @Embeddable class which is added as a property using @EmbeddedId instead of @Embedded. That's the only difference in how embedded classes and embedded ids are declared. An embedded id is also re-usable, such that if you have a number of tables which use the same fields to achieve a unique key, you can just add the embedded id class and move on... You could also use multiple @Id annotations (together with an @IdClass) to build a primary key, but I have generally found that to be far less readable than the @EmbeddedId syntax.
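One practical detail when writing a composite key class: JPA expects it to be Serializable, to have a no-argument constructor, and to define equals() and hashCode() over all of the key fields, so that two key objects with the same values are treated as the same key. The sketch below shows that part in plain Java. OrderLineId and its two fields are hypothetical; in a real mapping the class would carry @Embeddable and be used as an @EmbeddedId property on the entity.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class CompositeKeySketch {

    // Hypothetical two-field key. The JPA annotations are left out so that
    // the sketch runs without a JPA provider on the classpath.
    static class OrderLineId implements Serializable {
        private static final long serialVersionUID = 1L;

        private int orderNumber;
        private int lineNumber;

        // No-argument constructor, as JPA requires for embeddable classes
        OrderLineId() { }

        OrderLineId(int orderNumber, int lineNumber) {
            this.orderNumber = orderNumber;
            this.lineNumber = lineNumber;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof OrderLineId)) {
                return false;
            }
            OrderLineId that = (OrderLineId) other;
            return orderNumber == that.orderNumber && lineNumber == that.lineNumber;
        }

        @Override
        public int hashCode() {
            return Objects.hash(orderNumber, lineNumber);
        }
    }

    public static void main(String[] args) {
        Map<OrderLineId, String> records = new HashMap<>();
        records.put(new OrderLineId(1001, 1), "widget");

        // A different key object with the same values finds the same record.
        System.out.println(records.get(new OrderLineId(1001, 1)));
    }
}
```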
Conclusion
Hopefully you have found this basic introduction helpful, and hopefully JPA can assist you in being more productive with less effort, as all programmers are wont to be. If you have any questions, please post them and I will answer as I am able. Additionally, see below for links to other references on JPA.
JPA Concepts
Schuchert's JPA Tutorial
The GlassFish Persistence FAQ