Application configuration

In the previous sections, we saw how applications are specified and how custom operators are developed, including an example of a configurable property. Most operators have properties that need to be configured: a file reader needs a directory path, and a Kafka consumer needs the broker address and topic. Whoever deploys the application needs to know these properties and be able to supply values for them.

In addition to properties that are directly related to the functionality of an operator, there is another category of settings called attributes that control behavior of the platform (as opposed to the functionality of operators).

Attributes are defined for three different scopes: application, operator, and port.

Since settings can vary between environments or refer to security-sensitive information like credentials, they should not be embedded in the application code, but defined externally and provided when the application is launched. The source and format of the configuration depend on the tool that is used to launch the application. The following example applies to the Apex command line interface (CLI), which expects files in the Hadoop configuration file format:

<property> 
  <name>apex.application.MyFirstApplication.operator.input.prop.directory</name> 
  <value>./src/test/resources</value> 
</property> 
<property> 
  <name>apex.application.MyFirstApplication.operator.output.prop.filePath</name> 
  <value>./target</value> 
</property> 
<property>
  <name>apex.application.MyFirstApplication.operator.output.prop.outputFileName</name>
  <value>wordcountresult</value>
</property>
<property>
  <name>apex.application.MyFirstApplication.operator.output.prop.maxLength</name>
  <value>500</value>
</property>
<property>
  <name>apex.application.MyFirstApplication.operator.output.prop.alwaysWriteToTmp</name>
  <value>false</value>
</property>

The configuration block shows how the operator properties of the input file reader and the output file writer of our word count application are configured for execution inside the project directory on the local machine.

Individual properties match the operator's getters and setters per the Java bean convention; for example, prop.maxLength corresponds to a setMaxLength() method on the operator. The extra application prefix allows multiple DAGs to be configured with a single file, separated by application name (MyFirstApplication is the name given in the annotation on the application class). Operator names like output match the names that were used in the addOperator() calls.
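To illustrate the application prefix, the following sketch shows a single file that holds settings for two applications. The enclosing configuration element is the root element of the Hadoop configuration file format. MySecondApplication, its operator name input, and the path /tmp/other-input are hypothetical and serve only to show how the keys are separated by application name.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Applies only to the DAG named MyFirstApplication -->
  <property>
    <name>apex.application.MyFirstApplication.operator.input.prop.directory</name>
    <value>./src/test/resources</value>
  </property>
  <!-- Hypothetical second application: same operator name, different value -->
  <property>
    <name>apex.application.MySecondApplication.operator.input.prop.directory</name>
    <value>/tmp/other-input</value>
  </property>
</configuration>
```

Because each key carries the application name, both applications can use an operator called input without their settings colliding.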

Examples for attributes, streams, and other details on the available settings and ways to specify them can be found at http://apex.apache.org/docs/apex/application_packages/#application-configuration.
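For orientation, attribute settings in the same file format might look like the fragment below, with one entry per scope (application, operator, and port). This is a sketch under assumptions: the key patterns are modeled on the property keys shown earlier, the port name input on the output operator is hypothetical, and the values are arbitrary. The exact key syntax and the available attribute names should be checked against the linked documentation.

```xml
<!-- Application scope (assumed key pattern): memory for the application master -->
<property>
  <name>apex.application.MyFirstApplication.attr.MASTER_MEMORY_MB</name>
  <value>1024</value>
</property>
<!-- Operator scope (assumed key pattern): memory for the operator named input -->
<property>
  <name>apex.application.MyFirstApplication.operator.input.attr.MEMORY_MB</name>
  <value>512</value>
</property>
<!-- Port scope (assumed key pattern, hypothetical port name): queue capacity of a port -->
<property>
  <name>apex.application.MyFirstApplication.operator.output.port.input.attr.QUEUE_CAPACITY</name>
  <value>4096</value>
</property>
```

Note that attribute keys use attr where property keys use prop, which keeps platform settings visibly separate from operator functionality.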