What you need for this book 

Apex applications can be built and run locally on a development machine through a properly written JUnit test. The only requirement is that recent versions of the following software packages are installed:

  • Java JDK (the JRE alone is not sufficient)
  • Maven build system
  • Git revision control system (optional)
  • A Java IDE such as Eclipse or IntelliJ (optional)
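
As a quick sanity check of the list above, the following is a minimal sketch of a small Java program that looks for the command-line tools on the PATH. The class name PrereqCheck and the helper findOnPath are hypothetical names chosen for illustration, and the sketch assumes a Unix-like system where the executables are named javac, mvn and git (on Windows they carry extensions such as .exe or .cmd, which this sketch does not handle). It looks for javac rather than java because the compiler ships only with the JDK.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not part of Apex: reports whether the prerequisite
// command-line tools can be found on the PATH.
public class PrereqCheck {

    // Searches each directory of a PATH-style string for an executable
    // regular file with the given name; returns the first match, if any.
    static Optional<Path> findOnPath(String tool, String pathEnv) {
        if (pathEnv == null || pathEnv.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathEnv.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, tool);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String pathEnv = System.getenv("PATH");
        // javac and mvn are required; git is optional.
        String[] tools = {"javac", "mvn", "git"};
        for (String tool : tools) {
            Optional<Path> hit = findOnPath(tool, pathEnv);
            System.out.println(tool + ": "
                    + hit.map(Path::toString).orElse("not found on PATH"));
        }
    }
}
```

Compile and run it with javac PrereqCheck.java followed by java PrereqCheck. The sketch only confirms that the tools are present; to confirm that the versions are recent, run java -version, mvn -version and git --version.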

To run Apex applications on a cluster, one needs a cluster with Hadoop installed and a client to launch the applications. This client needs to be installed only on the edge node (sometimes referred to as the gateway node or the client node); nothing needs to be installed on the other nodes of the cluster.

There are several ways to install the client; some of them are listed on the Apex download page: http://apex.apache.org/downloads.html.

Without an existing Hadoop cluster, an easy way to get started with experimentation is a sandbox VM that already has a single-node cluster configured (sandbox VMs are available from Hadoop vendors, as Docker images, and so on).