Thursday, May 10, 2012

Getting started with Maven

What is Maven

Maven is quickly becoming, if it isn't already, the most popular build tool for java projects. With plugins you can do pretty much anything but at its heart Maven is a project management tool that handles dependency management, building and testing a java project. By default it can handle all the default java project types, jar, war, ejb-jar and ear.
The purpose of this article is to get familiar with the Maven setup so items like plugins and site development won't be covered here. There will be other articles that will cover those topics.

Application Setup

"Installing" Maven consists of downloading the package from the Maven site and unzipping the contents into a directory. If you're using an IDE such as Netbeans, Intellij or Eclipse you can usually skip this step as they already have an install built into the IDE. The important file for command line execution is the mvn.bat (windows) or mvn (linux) file in the bin/ directory. This file starts the maven process. Additional instructions and specialty builds of Maven are available on the Maven download page.

pom.xml

Aside from the application install the only other required file for maven is the pom.xml file. The pom file is the basis for a specific Maven project (such as a jar or a war). The pom file also contains information about other projects this project depends on (dependencies) the version to be built and deployment information.

Build information

At the start of the pom file is information about the name, version and type of the artifact to build. This information is summed up in four elements:
  1. groupId--this is like a package for your project, usually in the form com.<company name>.<group area> such as com.tigereye.examples
  2. artifactId--This is usually the name of the project
  3. version
  4. packaging--this tells Maven what type of artifact the project is. (Such as jar or war, it also tells maven how to package the artifact and any additional steps necessary for the specific project type)

Dependencies

While not strictly required for a project only the most basic projects don't depend on external sources. Whether it's hibernate for database access, the JEE API files for building an enterprise applications or an internal utility package pretty much all but the most basic applications require some kind of external resource to function.
Personally, this is where I believe Maven proves its worth. When setting up your project you only need to define programs your program directly depends on. Maven handles pulling in all the dependencies that project depends on that you don't so you don't have to worry about it. Let's say that you need to use hibernate but you don't specifically need to use antlr, you just have to add the dependency for hibernate and Maven will pick up antlr when you need to use hibernate (such as building a war or an ear). This way, you can focus on what your project needs and not worry about the rest of the jars necessary to actually run, Maven will ensure you have everything you need when it comes time to use them.
The dependency section starts with a dependencies element with one or more dependency elements. The content of each of the dependency element is as follows:
  1. groupId -- analogous to the groupId at the top of the file except this is the group id of the artifact you want to import
  2. artifactId -- like groupId, this is the artifact id of the artifact you want to use
  3. version -- the version of the artifact you want to use
  4. scope -- an optional attribute (defaults to compile) that has one of four values:
    • compile -- this package is necessary for compiling your code and all other steps
    • runtime -- this package isn't directly referenced in the code but needed to properly run
    • test -- this package is necessary for the testing tree but not necessary for running the main project (useful for packages like junit or testng)
    • provided -- this package is necessary to compile the project but will be provided for running the application later (such as servlet-api when writing a web app since it's provided by the web container)

Sample File

Below is a sample of a very basic pom file.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.psc.example</groupId>
  <artifactId>jax-ws</artifactId>
  <packaging>war</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>jax-ws Maven Webapp</name>
  <dependencies>
    <dependency>
      <groupId>javax.persistence</groupId>
      <artifactId>jpa-api</artifactId>
      <version>2.1</version>
    </dependency>
    <dependency>
      <groupId>org.hibernate</groupId>
      <artifactId>hibernate-core</artifactId>
      <version>3.6.7.Final</version>
      <scope>runtime</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>com.sun.faces</groupId>
      <artifactId>jsf-api</artifactId>
      <version>2.1.1-b02</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>
The above sample is a basic web project. It uses JPA for its persistence which requires a compile time dependency with hibernate as the implementing persistence framework. Since we're using JPA in our code the hibernate dependency is specified as runtime (it won't appear on the compile classpath but it will be included in the WAR. We also have the JSF api package that is necessary for the presentation pages but that package is already a part of the container so we mark it as provided (it's used for compiling but won't be included in the WAR). Finally, for testing we need junit but we don't want it on the compile classpath or in the WAR so we mark it as test.

Project Structure

While you can override the folders for things like source if you want, unless there's a good reason it just makes good sense, not to mention a smaller pom file, to use the default directory layout for your project. The standard setup for files is as follows:
<project directory>
    src
        main
        java
        resources
        webapp (if your app is a web-app)
    test
        java
        resources
    pom.xml
If you aren't using your IDE to generate the project structure and you don't want to hand code the directory names you can execute the following maven script in the directory ABOVE the project directory:
mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
Substitute groupId and artifactId with appropriate names for your project. After it finishes executing a directory matching your artifactId.
Once you start compiling classes and running tests one more directory will be created by maven for storing the compiled classes called target. This directory is at the same level as src and will have sub directories under it for the compiled classes, web content and test resources, just like src has.

Execution

Once you have the rest of the maven setup complete you'll need to be able to perform actions on your project. In maven speak an action is called a phase. Maven has a very rich set of phases you can perform. In keeping with the theme of this article I'm only going to go over the basic ones here to get you up and running but the linked site has all of them and examples of what they do and how to use them correctly.
The main phases you'll be needed for day-to-day use are:
  • compile -- This compiles all source java files and copies any resources to the target directory
  • test -- runs all tests found in the system (if you add -DskipTests it will compile the tests but not run them or -Dmaven.test.skip=true to not even compile them. I'm sure I don't need to tell you how dangerous this can be)
  • package -- builds the output file (jar for standard projects, war, ear, ejb jars, etc. for other project types)
  • install -- copies the package to your local maven repository.
One thing to note, each phase builds on the previous one so if you execute the package phase it will execute compile, test and any other defined phase before the package phase. This becomes important as you move down the phase chain since a failure in a previous phase will halt the execution. 

Local Repository

Finally, while not something neophytes to maven will care about knowing about your local repository will help understand some features of maven. The idea of your local repository is a place on your machine where dependent packages (both for your projects and for maven itself) can be stored so yo don't have to go out to the internet each time you need to check something. In fact, the first time you run maven and, in general, the first time you use a new package you'll most likely notice a slowdown in the initial phases as maven downloads updated or new dependencies. Under your home directory (Windows: C:\Users\<username> or linux: /home/<username>) is a .m2 directory. Under there is the repository. It's essentially a network of first group ids, then artifact ids and finally version number directories where jars, poms and additional information for maven  are stored for use by both maven and you projects. As a general rule (even for more advanced things) you shouldn't need to mess with this much other than occasionally removing entries.

Conclusion

I can hear you now, "but what about the settings.xml file?", "what about plugins?", "what about deployments?" and any number of other things that maven can do. While I do agree that they're all important they really aren't necessary for getting started with maven and setting up an initial maven project. I'm sure I'll be covering them in additional posts but if you don't have a good foundation then the rest will fall apart.