Batch Processing

This chapter describes Jakarta Batch, which provides support for defining, implementing, and running batch jobs. Batch jobs are tasks that can be executed without user interaction. The batch framework is composed of a job specification language based on XML, a Java API, and a batch runtime.

Introduction to Batch Processing

Some enterprise applications contain tasks that can be executed without user interaction. These tasks are executed periodically or when resource usage is low, and they often process large amounts of information such as log files, database records, or images. Examples include billing, report generation, data format conversion, and image processing. These tasks are called batch jobs.

Batch processing refers to running batch jobs on a computer system. Jakarta EE includes a batch processing framework that provides the batch execution infrastructure common to all batch applications, enabling developers to concentrate on the business logic of their batch applications. The batch framework consists of a job specification language based on XML, a set of batch annotations and interfaces for application classes that implement the business logic, a batch container that manages the execution of batch jobs, and supporting classes and interfaces to interact with the batch container.

A batch job can be completed without user intervention. For example, consider a telephone billing application that reads phone call records from the enterprise information systems and generates a monthly bill for each account. Since this application does not require any user interaction, it can run as a batch job.

The phone billing application consists of two phases: The first phase associates each call from the registry with a monthly bill, and the second phase calculates the tax and total amount due for each bill. Each of these phases is a step of the batch job.

Batch applications specify a set of steps and their execution order. Different batch frameworks may specify additional elements, like decision elements or groups of steps that run in parallel. The following sections describe steps in more detail and provide information about other common characteristics of batch frameworks.

Steps in Batch Jobs

A step is an independent and sequential phase of a batch job. Batch jobs contain chunk-oriented steps and task-oriented steps.

  • Chunk-oriented steps (chunk steps) process data by reading items from a data source, applying some business logic to each item, and storing the results. Chunk steps read and process one item at a time and group the results into a chunk. The results are stored when the chunk reaches a configurable size. Chunk-oriented processing makes storing results more efficient and facilitates transaction demarcation.

    Chunk steps have three parts.

    • The input retrieval part reads one item at a time from a data source, such as entries in a database, files in a directory, or entries in a log file.

    • The business processing part manipulates one item at a time using the business logic defined by the application. Examples include filtering, formatting, and accessing data from the item for computing a result.

    • The output writing part stores a chunk of processed items at a time.

Chunk steps are often long-running because they process large amounts of data. Batch frameworks enable chunk steps to bookmark their progress using checkpoints. A chunk step that is interrupted can be restarted from the last checkpoint. The input retrieval and output writing parts of a chunk step save their current position after the processing of each chunk, and can recover it when the step is restarted.

Figure 1, “Chunk Steps in a Batch Job” shows the three parts of two steps in a batch job.

This figure shows a batch job that contains two chunk steps: step A and step B. Step A has the three parts of a chunk-oriented step: input retrieval A, business processing A, and output writing A. Step B also has the three parts of a chunk-oriented step: input retrieval B, business processing B, and output writing B.
Figure 1. Chunk Steps in a Batch Job

For example, the phone billing application consists of two chunk steps.

  • In the first step, the input retrieval part reads call records from the registry; the business processing part associates each call with a bill and creates a bill if one does not exist for an account; and the output writing part stores each bill in a database.

  • In the second step, the input retrieval part reads bills from the database; the business processing part calculates the tax and total amount due for each bill; and the output writing part updates the database records and generates printable versions of each bill.

This application could also contain a task step that cleans up the files from the bills generated for the previous month.

Parallel Processing

Batch jobs often process large amounts of data or perform computationally expensive operations. Batch applications can benefit from parallel processing in two scenarios.

  • Steps that do not depend on each other can run on different threads.

  • Chunk-oriented steps where the processing of each item does not depend on the results of processing previous items can run on more than one thread.

Batch frameworks provide mechanisms for developers to define groups of independent steps and to split chunk-oriented steps into parts that can run in parallel.

Status and Decision Elements

Batch frameworks keep track of a status for every step in a job. The status indicates whether a step is running or has completed. If the step has completed, the status indicates one of the following.

  • The execution of the step was successful.

  • The step was interrupted.

  • An error occurred in the execution of the step.

In addition to steps, batch jobs can also contain decision elements. Decision elements use the exit status of the previous step to determine the next step or to terminate the batch job. Decision elements set the status of the batch job when terminating it. Like a step, a batch job can terminate successfully, be interrupted, or fail.

Figure 2, “Steps and Decision Elements in a Job” shows an example of a job that contains chunk steps, task steps and a decision element.

This figure shows a batch job that contains two chunk steps, a task step and a decision element. The job starts with chunk step A, continues with chunk step B, and then decision element D evaluates condition 1. The condition is based on the status of step B. If condition 1 is true, the job terminates; otherwise the job continues with step C and then the job ends.
Figure 2. Steps and Decision Elements in a Job

Batch Framework Functionality

Batch applications have the following common requirements.

  • Define jobs, steps, decision elements, and the relationships between them.

  • Execute some groups of steps or parts of a step in parallel.

  • Maintain state information for jobs and steps.

  • Launch jobs and resume interrupted jobs.

  • Handle errors.

Batch frameworks provide the batch execution infrastructure that addresses the common requirements of all batch applications, enabling developers to concentrate on the business logic of their applications. Batch frameworks consist of a format to specify jobs and steps, an application programming interface (API), and a service available at runtime that manages the execution of batch jobs.

Batch Processing in Jakarta EE

This section lists the components of the batch processing framework in Jakarta EE and provides an overview of the steps you have to follow to create a batch application.

The Batch Processing Framework

Jakarta EE includes a batch processing framework that consists of the following elements:

  • A batch runtime that manages the execution of jobs

  • A job specification language based on XML

  • A Java API to interact with the batch runtime

  • A Java API to implement steps, decision elements, and other batch artifacts

Batch applications in Jakarta EE contain XML files and Java classes. The XML files define the structure of a job in terms of batch artifacts and the relationships between them. (A batch artifact is a part of a chunk-oriented step, a task-oriented step, a decision element, or another component of a batch application). The Java classes implement the application logic of the batch artifacts defined in the XML files. The batch runtime parses the XML files and loads the batch artifacts as Java classes to run the jobs in a batch application.

Creating Batch Applications

The process for creating a batch application in Jakarta EE is the following.

  1. Design the batch application.

    1. Identify the input sources, the format of the input data, the desired final result, and the required processing phases.

    2. Organize the application as a job with chunk-oriented steps, task-oriented steps, and decision elements. Determine the dependencies between them.

    3. Determine the order of execution in terms of transitions between steps.

    4. Identify steps that can run in parallel and steps that can run in more than one thread.

  2. Create the batch artifacts as Java classes by implementing the interfaces specified by the framework for steps, decision elements, and so on. These Java classes contain the code to read data from input sources, format items, process items, and store results. Batch artifacts can access context objects from the batch runtime using dependency injection.

  3. Define jobs, steps, and their execution flow in XML files using the Job Specification Language. The elements in the XML files reference batch artifacts implemented as Java classes. The batch artifacts can access properties declared in the XML files, such as names of files and databases.

  4. Use the Java API provided by the batch runtime to launch the batch application.

The following sections describe in detail how to use the components of the batch processing framework in Jakarta EE to create batch applications.

Elements of a Batch Job

A batch job can contain one or more of the following elements:

  • Steps

  • Flows

  • Splits

  • Decision elements

Steps are described in Introduction to Batch Processing, and can be chunk-oriented or task-oriented. Chunk-oriented steps can be partitioned steps. In a partitioned chunk step, the processing of one item does not depend on other items, so these steps can run in more than one thread.

A flow is a sequence of steps that execute as a unit. A sequence of related steps can be grouped together into a flow. The steps in a flow cannot transition to steps outside the flow. The flow transitions to the next element when its last step completes.

A split is a set of flows that execute in parallel; each flow runs on a separate thread. The split transitions to the next element when all its flows complete.

Decision elements use the exit status of the previous step to determine the next step or to terminate the batch job.

Properties and Parameters

Jobs and steps can have a number of properties associated with them. You define properties in the job definition file, and batch artifacts access these properties using context objects from the batch runtime. Using properties in this manner enables you to decouple static parameters of the job from the business logic and to reuse batch artifacts in different job definition files.

Specifying properties is described in Using the Job Specification Language, and accessing properties in batch artifacts is described in Creating Batch Artifacts.

Jakarta EE applications can also pass parameters to a job when they submit it to the batch runtime. This enables you to specify dynamic parameters that are only known at runtime. Parameters are also necessary for partitioned steps, since each partition needs to know, for example, what range of items to process.

Specifying parameters when submitting jobs is described in Submitting Jobs to the Batch Runtime. Specifying parameters for partitioned steps and accessing them in batch artifacts is demonstrated in The phonebilling Example Application.

Job Instances and Job Executions

A job definition can have multiple instances, each with different parameters. A job execution is an attempt to run a job instance. The batch runtime maintains information about job instances and job executions, as described in Checking the Status of a Job.
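
For example, the JobOperator interface, described later in Submitting Jobs to the Batch Runtime, can enumerate the instances and executions of a job. The following minimal sketch lists every execution of the most recent instance of a job; the job name simplejob matches the example later in this chapter, and the surrounding class is illustrative:

import java.util.List;
import jakarta.batch.operations.JobOperator;
import jakarta.batch.runtime.BatchRuntime;
import jakarta.batch.runtime.JobExecution;
import jakarta.batch.runtime.JobInstance;

public class ExecutionLister {
    public void listExecutions() {
        JobOperator jobOperator = BatchRuntime.getJobOperator();
        /* The most recent instance of the "simplejob" job definition */
        List<JobInstance> instances =
                jobOperator.getJobInstances("simplejob", 0, 1);
        /* Every execution (the first run plus any restarts) of that instance */
        for (JobExecution execution :
                jobOperator.getJobExecutions(instances.get(0))) {
            System.out.println(execution.getExecutionId() + ": "
                               + execution.getBatchStatus());
        }
    }
}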

Batch and Exit Status

The state of jobs, steps, splits, and flows is represented in the batch runtime as a batch status value. Batch status values, which are represented as strings, are listed in Batch Status Values.

Batch Status Values

  • STARTING: The job has been submitted to the batch runtime.

  • STARTED: The job is running.

  • STOPPING: The job has been requested to stop.

  • STOPPED: The job has stopped.

  • FAILED: The job finished executing because of an error.

  • COMPLETED: The job finished executing successfully.

  • ABANDONED: The job was marked abandoned.

Jakarta EE applications can submit jobs and access the batch status of a job using the JobOperator interface, as described in Submitting Jobs to the Batch Runtime. Job definition files can refer to batch status values using the Job Specification Language (JSL), as described in Using the Job Specification Language. Batch artifacts can access batch status values using context objects, as described in Using the Context Objects from the Batch Runtime.

For flows, the batch status is that of its last step. For splits, the batch status is the following:

  • COMPLETED: If all its flows have a batch status of COMPLETED

  • FAILED: If any flow has a batch status of FAILED

  • STOPPED: If any flow has a batch status of STOPPED, and no flows have a batch status of FAILED

The batch status for jobs, steps, splits, and flows is set by the batch runtime. Jobs, steps, splits, and flows also have an exit status, which is a user-defined value based on the batch status. You can set the exit status inside batch artifacts or in the job definition file. You can access the exit status in the same manner as the batch status, described above. The default value for the exit status is the same as the batch status.
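
For example, a batch artifact can set a user-defined exit status through the injected JobContext object. The following minimal sketch is a hypothetical item processor that flags comment lines; the class name and the SKIPPED_COMMENTS value are illustrative:

@Dependent
@Named("MyExitStatusProcessor")
public class MyExitStatusProcessor
        implements jakarta.batch.api.chunk.ItemProcessor {
    @Inject
    JobContext jobCtx;

    @Override
    public Object processItem(Object obj) throws Exception {
        String line = (String) obj;
        if (line.startsWith("#")) {
            /* Record a user-defined exit status; returning null
             * filters the item out of the chunk */
            jobCtx.setExitStatus("SKIPPED_COMMENTS");
            return null;
        }
        return line;
    }
}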

Simple Use Case

This section demonstrates how to define a simple job using the Job Specification Language (JSL) and how to implement the corresponding batch artifacts. Refer to the rest of the sections in this chapter for detailed descriptions of the elements in the batch framework.

The following job definition specifies a chunk step and a task step:

<?xml version="1.0" encoding="UTF-8"?>
<job id="simplejob" xmlns="https://jakarta.ee/xml/ns/jakartaee"
                    version="2.0">
  <properties>
    <property name="input_file" value="input.txt"/>
    <property name="output_file" value="output.txt"/>
  </properties>
  <step id="mychunk" next="mytask">
    <chunk>
      <reader ref="MyReader"></reader>
      <processor ref="MyProcessor"></processor>
      <writer ref="MyWriter"></writer>
    </chunk>
  </step>
  <step id="mytask">
    <batchlet ref="MyBatchlet"></batchlet>
    <end on="COMPLETED"/>
  </step>
</job>

Chunk Step

In most cases, you have to implement a checkpoint class for chunk-oriented steps. The following class just keeps track of the line number in a text file:

public class MyCheckpoint implements Serializable {
    private long lineNum = 0;
    public void increase() { lineNum++; }
    public long getLineNum() { return lineNum; }
}

The following item reader implementation continues reading the input file from the provided checkpoint if the job was restarted. The items consist of each line in the text file (in more complex scenarios, the items are custom Java types and the input source can be a database):

@Dependent
@Named("MyReader")
public class MyReader implements jakarta.batch.api.chunk.ItemReader {
    private MyCheckpoint checkpoint;
    private BufferedReader breader;
    @Inject
    JobContext jobCtx;

    public MyReader() {}

    @Override
    public void open(Serializable ckpt) throws Exception {
        if (ckpt == null)
            checkpoint = new MyCheckpoint();
        else
            checkpoint = (MyCheckpoint) ckpt;
        String fileName = jobCtx.getProperties()
                                .getProperty("input_file");
        breader = new BufferedReader(new FileReader(fileName));
        for (long i = 0; i < checkpoint.getLineNum(); i++)
            breader.readLine();
    }

    @Override
    public void close() throws Exception {
        breader.close();
    }

    @Override
    public Object readItem() throws Exception {
        String line = breader.readLine();
        if (line != null)
            checkpoint.increase();
        return line;
    }

    @Override
    public Serializable checkpointInfo() throws Exception {
        return checkpoint;
    }
}

In the following case, the item processor only converts the line to uppercase. More complex examples can process items in different ways or transform them into custom output Java types:

@Dependent
@Named("MyProcessor")
public class MyProcessor implements jakarta.batch.api.chunk.ItemProcessor {
    public MyProcessor() {}

    @Override
    public Object processItem(Object obj) throws Exception {
        String line = (String) obj;
        return line.toUpperCase();
    }
}

The batch processing API does not support generics. In most cases, you need to cast items to their specific type before processing them.

The item writer writes the processed items to the output file. It overwrites the output file if no checkpoint is provided; otherwise, it resumes writing at the end of the file. Items are written in chunks:

@Dependent
@Named("MyWriter")
public class MyWriter implements jakarta.batch.api.chunk.ItemWriter {
    private BufferedWriter bwriter;
    @Inject
    private JobContext jobCtx;

    @Override
    public void open(Serializable ckpt) throws Exception {
        String fileName = jobCtx.getProperties()
                                .getProperty("output_file");
        bwriter = new BufferedWriter(new FileWriter(fileName,
                                                    (ckpt != null)));
    }

    @Override
    public void close() throws Exception {
        bwriter.close();
    }

    @Override
    public void writeItems(List<Object> items) throws Exception {
        for (int i = 0; i < items.size(); i++) {
            String line = (String) items.get(i);
            bwriter.write(line);
            bwriter.newLine();
        }
    }

    @Override
    public Serializable checkpointInfo() throws Exception {
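        /* A non-null checkpoint is enough for this writer: on restart,
         * open() reopens the output file in append mode when a
         * checkpoint object is provided */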
        return new MyCheckpoint();
    }
}

Task Step

The task step displays the length of the output file. In more complex scenarios, task steps perform any task that does not fit the chunk processing programming model:

@Dependent
@Named("MyBatchlet")
public class MyBatchlet implements jakarta.batch.api.Batchlet {
    @Inject
    private JobContext jobCtx;

    @Override
    public String process() throws Exception {
        String fileName = jobCtx.getProperties()
                                .getProperty("output_file");
        System.out.println(new File(fileName).length());
        return "COMPLETED";
    }

    @Override
    public void stop() throws Exception { }
}

Using the Job Specification Language

The Job Specification Language (JSL) enables you to define the steps in a job and their execution order using an XML file. The following example shows how to define a simple job that contains one chunk step and one task step:

<job id="loganalysis" xmlns="https://jakarta.ee/xml/ns/jakartaee"
                      version="2.0">
  <properties>
    <property name="input_file" value="input1.txt"/>
    <property name="output_file" value="output2.txt"/>
  </properties>

  <step id="logprocessor" next="cleanup">
    <chunk checkpoint-policy="item" item-count="10">
      <reader ref="com.example.pkg.LogItemReader"></reader>
      <processor ref="com.example.pkg.LogItemProcessor"></processor>
      <writer ref="com.example.pkg.LogItemWriter"></writer>
    </chunk>
  </step>

  <step id="cleanup">
    <batchlet ref="com.example.pkg.CleanUp"></batchlet>
    <end on="COMPLETED"/>
  </step>
</job>

This example defines the loganalysis batch job, which consists of the logprocessor chunk step and the cleanup task step. The logprocessor step transitions to the cleanup step, which terminates the job when completed.

The job element defines two properties, input_file and output_file. Specifying properties in this manner enables you to run a batch job with different configuration parameters without having to recompile its Java batch artifacts. The batch artifacts can access these properties using the context objects from the batch runtime.

The logprocessor step is a chunk step that specifies batch artifacts for the reader (LogItemReader), the processor (LogItemProcessor), and the writer (LogItemWriter). This step creates a checkpoint for every ten items processed.

The cleanup step is a task step that specifies the CleanUp class as its batch artifact. The job terminates when this step completes.

The following sections describe the elements of the Job Specification Language (JSL) in more detail and show the most common attributes and child elements.

The job Element

The job element is always the top-level element in a job definition file. Its main attributes are id and restartable. The job element can contain one properties element and zero or more of each of the following elements: listener, step, flow, and split. For example:

<job id="jobname" restartable="true">
  <listeners>
    <listener ref="com.example.pkg.ListenerBatchArtifact"/>
  </listeners>
  <properties>
    <property name="propertyName1" value="propertyValue1"/>
    <property name="propertyName2" value="propertyValue2"/>
  </properties>
  <step ...> ... </step>
  <step ...> ... </step>
  <decision ...> ... </decision>
  <flow ...> ... </flow>
  <split ...> ... </split>
</job>

The listener element specifies a batch artifact whose methods are invoked before and after the execution of the job. The batch artifact is an implementation of the jakarta.batch.api.listener.JobListener interface. See The Listener Batch Artifacts for an example of a job listener implementation.

The first step, flow, or split element inside the job element executes first.

The step Element

The step element can be a child of the job and flow elements. Its main attributes are id and next. The step element can contain the following elements.

  • One chunk element for chunk-oriented steps or one batchlet element for task-oriented steps.

  • One properties element (optional).

    This element specifies a set of properties that batch artifacts can access using batch context objects.

  • One listener element (optional); one listeners element if more than one listener is specified.

    This element specifies listener artifacts that intercept various phases of step execution.

    For chunk steps, the batch artifacts for these listeners can be implementations of the following interfaces: StepListener, ItemReadListener, ItemProcessListener, ItemWriteListener, ChunkListener, RetryReadListener, RetryProcessListener, RetryWriteListener, SkipReadListener, SkipProcessListener, and SkipWriteListener.

    For task steps, the batch artifact for these listeners must be an implementation of the StepListener interface.

    See The Listener Batch Artifacts for an example of an item processor listener implementation.

  • One partition element (optional).

    This element is used in partitioned steps, which execute in more than one thread.

  • One end element if this is the last step in a job.

    This element sets the batch status to COMPLETED.

  • One stop element (optional) to stop a job at this step.

    This element sets the batch status to STOPPED.

  • One fail element (optional) to terminate a job at this step.

    This element sets the batch status to FAILED.

  • One or more next elements if the next attribute is not specified.

    This element is associated with an exit status and refers to another step, a flow, a split, or a decision element.
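
For example, a step can declare exit-status-based transitions with next elements instead of the next attribute. The following sketch uses illustrative status values and step IDs:

<step id="stepA">
  <chunk ...> ... </chunk>
  <next on="COMPLETED" to="stepB"/>
  <next on="*" to="recoveryStep"/>
</step>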

The following is an example of a chunk step:

<step id="stepA" next="stepB">
  <properties> ... </properties>
  <listeners>
    <listener ref="MyItemReadListenerImpl"/>
    ...
  </listeners>
  <chunk ...> ... </chunk>
  <partition> ... </partition>
  <end on="COMPLETED" exit-status="MY_COMPLETED_EXIT_STATUS"/>
  <stop on="MY_TEMP_ISSUE_EXIT_STATUS" restart="step0"/>
  <fail on="MY_ERROR_EXIT_STATUS" exit-status="MY_ERROR_EXIT_STATUS"/>
</step>

The following is an example of a task step:

<step id="stepB" next="stepC">
  <batchlet ...> ... </batchlet>
  <properties> ... </properties>
  <listener ref="MyStepListenerImpl"/>
</step>

The chunk Element

The chunk element is a child of the step element for chunk-oriented steps. The attributes of this element are listed in Attributes of the chunk Element.

Attributes of the chunk Element

  • checkpoint-policy: Specifies how to commit the results of processing each chunk:

    • "item": the chunk is committed after processing item-count items

    • "custom": the chunk is committed according to a checkpoint algorithm specified with the checkpoint-algorithm element

    The checkpoint is updated when the results of a chunk are committed. Every chunk is processed in a global Jakarta EE transaction; if the processing of one item in the chunk fails, the transaction is rolled back and no processed items from this chunk are stored. Default: "item".

  • item-count: Specifies the number of items to process before committing the chunk and taking a checkpoint. Default: 10.

  • time-limit: Specifies the number of seconds before committing the chunk and taking a checkpoint when checkpoint-policy="item". If item-count items have not been processed by time-limit seconds, the chunk is committed and a checkpoint is taken. Default: 0 (no limit).

  • buffer-items: Specifies whether processed items are buffered until it is time to take a checkpoint. If true, a single call to the item writer is made with a list of the buffered items before committing the chunk and taking a checkpoint. Default: true.

  • skip-limit: Specifies the maximum number of skippable exceptions to skip in this step during chunk processing. Skippable exception classes are specified with the skippable-exception-classes element. Default: no limit.

  • retry-limit: Specifies the maximum number of attempts to execute this step if retryable exceptions occur. Retryable exception classes are specified with the retryable-exception-classes element. Default: no limit.

The chunk element can contain the following elements.

  • One reader element.

    This element specifies a batch artifact that implements the ItemReader interface.

  • One processor element.

    This element specifies a batch artifact that implements the ItemProcessor interface.

  • One writer element.

    This element specifies a batch artifact that implements the ItemWriter interface.

  • One checkpoint-algorithm element (optional).

    This element specifies a batch artifact that implements the CheckpointAlgorithm interface and provides a custom checkpoint policy.

  • One skippable-exception-classes element (optional).

    This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing should skip. The skip-limit attribute from the chunk element specifies the maximum number of skipped exceptions.

  • One retryable-exception-classes element (optional).

    This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing will retry. The retry-limit attribute from the chunk element specifies the maximum number of attempts.

  • One no-rollback-exception-classes element (optional).

    This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that should not cause the batch runtime to roll back the current chunk, but to retry the current operation without a rollback instead.

    For exception types not specified in this element, the current chunk is rolled back by default when an exception occurs.

The following is an example of a chunk-oriented step:

<step id="stepC" next="stepD">
  <chunk checkpoint-policy="item" item-count="5" time-limit="180"
         buffer-items="true" skip-limit="10" retry-limit="3">
    <reader ref="pkg.MyItemReaderImpl"></reader>
    <processor ref="pkg.MyItemProcessorImpl"></processor>
    <writer ref="pkg.MyItemWriterImpl"></writer>
    <skippable-exception-classes>
      <include class="pkg.MyItemException"/>
      <exclude class="pkg.MyItemSeriousSubException"/>
    </skippable-exception-classes>
    <retryable-exception-classes>
      <include class="pkg.MyResourceTempUnavailable"/>
    </retryable-exception-classes>
  </chunk>
</step>

This example defines a chunk step and specifies its reader, processor, and writer artifacts. The step updates a checkpoint and commits each chunk after processing five items. It skips all MyItemException exceptions and all its subtypes, except for MyItemSeriousSubException, up to a maximum of ten skipped exceptions. The step retries a chunk when a MyResourceTempUnavailable exception occurs, up to a maximum of three attempts.

The batchlet Element

The batchlet element is a child of the step element for task-oriented steps. This element has only the ref attribute, which specifies a batch artifact that implements the Batchlet interface. The batchlet element can contain a properties element.

The following is an example of a task-oriented step:

<step id="stepD" next="stepE">
  <batchlet ref="pkg.MyBatchletImpl">
    <properties>
      <property name="pname" value="pvalue"/>
    </properties>
  </batchlet>
</step>

This example defines a task step and specifies its batch artifact.
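
Inside the batchlet, a property declared this way can be injected directly using the BatchProperty annotation. The following minimal sketch extends the convenience class jakarta.batch.api.AbstractBatchlet, which provides an empty stop method; the class name matches the example above, and the rest is illustrative:

@Dependent
@Named("MyBatchletImpl")
public class MyBatchletImpl extends AbstractBatchlet {
    /* Injects the value of the "pname" property
     * from the job definition file */
    @Inject
    @BatchProperty(name = "pname")
    String pname;

    @Override
    public String process() throws Exception {
        System.out.println("pname = " + pname);
        return "COMPLETED";
    }
}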

The partition Element

The partition element is a child of the step element. It indicates that a step is partitioned. Most partitioned steps are chunk steps where the processing of each item does not depend on the results of processing previous items. You specify the number of partitions in a step and provide each partition with specific information on which items to process, such as the following.

  • A range of items. For example, partition 1 processes items 1 through 500, and partition 2 processes items 501 through 1000.

  • An input source. For example, partition 1 processes the items in input1.txt and partition 2 processes the items in input2.txt.

When the number of partitions, the number of items, and the input sources for a partitioned step are known at development or deployment time, you can use partition properties in the job definition file to specify partition-specific information and access these properties from the step batch artifacts. The runtime creates as many instances of the step batch artifacts (reader, processor, and writer) as partitions, and each artifact instance receives the properties specific to its partition.

In most cases, the number of partitions, the number of items, or the input sources for a partitioned step can only be determined at runtime. Instead of specifying partition-specific properties statically in the job definition file, you provide a batch artifact that can access your data sources at runtime and determine how many partitions are needed and what range of items each partition should process. This batch artifact is an implementation of the PartitionMapper interface. The batch runtime invokes this artifact and then uses the information it provides to instantiate the step batch artifacts (reader, writer, and processor) for each partition and to pass them partition-specific data as parameters.
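
The following is a minimal sketch of such a mapper; it produces the same two partitions as the plan element shown later in this section, and the class name is illustrative. PartitionMapper, PartitionPlan, and PartitionPlanImpl are in the jakarta.batch.api.partition package:

@Dependent
@Named("MyPartitionMapperImpl")
public class MyPartitionMapperImpl implements PartitionMapper {
    @Override
    public PartitionPlan mapPartitions() throws Exception {
        PartitionPlan plan = new PartitionPlanImpl();
        plan.setPartitions(2);
        plan.setThreads(2);
        /* Give each partition its own range of items */
        Properties[] props = new Properties[2];
        props[0] = new Properties();
        props[0].setProperty("firstItem", "0");
        props[0].setProperty("lastItem", "500");
        props[1] = new Properties();
        props[1].setProperty("firstItem", "501");
        props[1].setProperty("lastItem", "999");
        plan.setPartitionProperties(props);
        return plan;
    }
}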

The rest of this section describes the partition element in detail and shows two examples of job definition files: one that uses partition properties to specify a range of items for each partition, and one that relies on a PartitionMapper implementation to determine partition-specific information.

See The Phone Billing Chunk Step in The phonebilling Example Application for a complete example of a partitioned chunk step.

The partition element can contain the following elements.

  • One plan element, if the mapper element is not specified.

    This element defines the number of partitions, the number of threads, and the properties for each partition in the job definition file. The plan element is useful when this information is known at development or deployment time.

  • One mapper element, if the plan element is not specified.

    This element specifies a batch artifact that provides the number of partitions, the number of threads, and the properties for each partition. The batch artifact is an implementation of the PartitionMapper interface. You use this option when the information required for each partition is only known at runtime.

  • One reducer element (optional).

    This element specifies a batch artifact that receives control when a partitioned step begins, ends, or rolls back. The batch artifact enables you to merge results from different partitions and perform other related operations. The batch artifact is an implementation of the PartitionReducer interface.

  • One collector element (optional).

    This element specifies a batch artifact that sends intermediary results from each partition to a partition analyzer. The batch artifact sends the intermediary results after each checkpoint for chunk steps and at the end of the step for task steps. The batch artifact is an implementation of the PartitionCollector interface.

  • One analyzer element (optional).

    This element specifies a batch artifact that analyzes the intermediary results from the partition collector instances. The batch artifact is an implementation of the PartitionAnalyzer interface.

The following is an example of a partitioned step using the plan element:

<step id="stepE" next="stepF">
  <chunk>
    <reader ...></reader>
    <processor ...></processor>
    <writer ...></writer>
  </chunk>
  <partition>
    <plan partitions="2" threads="2">
      <properties partition="0">
        <property name="firstItem" value="0"/>
        <property name="lastItem" value="500"/>
      </properties>
      <properties partition="1">
        <property name="firstItem" value="501"/>
        <property name="lastItem" value="999"/>
      </properties>
    </plan>
    <reducer ref="MyPartitionReducerImpl"/>
    <collector ref="MyPartitionCollectorImpl"/>
    <analyzer ref="MyPartitionAnalyzerImpl"/>
  </partition>
</step>

In this example, the plan element specifies the properties for each partition in the job definition file.

The following example uses a mapper element instead of a plan element. The PartitionMapper implementation dynamically provides the same information as the plan element provides in the job definition file:

<step id="stepE" next="stepF">
  <chunk>
    <reader ...></reader>
    <processor ...></processor>
    <writer ...></writer>
  </chunk>
  <partition>
    <mapper ref="MyPartitionMapperImpl"/>
    <reducer ref="MyPartitionReducerImpl"/>
    <collector ref="MyPartitionCollectorImpl"/>
    <analyzer ref="MyPartitionAnalyzerImpl"/>
  </partition>
</step>

Refer to The phonebilling Example Application for an example implementation of the PartitionMapper interface.

The flow Element

The flow element can be a child of the job, flow, and split elements. Its attributes are id and next. Flows can transition to flows, steps, splits, and decision elements. The flow element can contain the following elements:

  • One or more step elements

  • One or more flow elements (optional)

  • One or more split elements (optional)

  • One or more decision elements (optional)

The last step in a flow is the one with no next attribute or next element. Steps and other elements in a flow cannot transition to elements outside the flow.

The following is an example of the flow element:

<flow id="flowA" next="stepE">
  <step id="flowAstepA" next="flowAstepB">...</step>
  <step id="flowAstepB" next="flowAflowC">...</step>
  <flow id="flowAflowC" next="flowAsplitD">...</flow>
  <split id="flowAsplitD" next="flowAstepE">...</split>
  <step id="flowAstepE">...</step>
</flow>

This example flow contains three steps, one flow, and one split. The last step does not have the next attribute. The flow transitions to stepE when its last step completes.

The split Element

The split element can be a child of the job and flow elements. Its attributes are id and next. Splits can transition to splits, steps, flows, and decision elements. The split element can contain only flow elements, and those flows can transition only to other flow elements in the split.

The following is an example of a split with three flows that execute concurrently:

<split id="splitA" next="stepB">
  <flow id="splitAflowA">...</flow>
  <flow id="splitAflowB">...</flow>
  <flow id="splitAflowC">...</flow>
</split>

The decision Element

The decision element can be a child of the job and flow elements. Its attributes are id and next. Steps, flows, and splits can transition to a decision element. This element specifies a batch artifact that decides the next step, flow, or split to execute based on information from the execution of the previous step, flow, or split. The batch artifact implements the Decider interface. The decision element can contain the following elements.

  • One or more end elements (optional).

    This element sets the batch status to COMPLETED.

  • One or more stop elements (optional).

    This element sets the batch status to STOPPED.

  • One or more fail elements (optional).

    This element sets the batch status to FAILED.

  • One or more next elements (optional).

  • One properties element (optional).

The following is an example of the decision element:

<decision id="decisionA" ref="MyDeciderImpl">
  <fail on="FAILED" exit-status="FAILED_AT_DECIDER"/>
  <end on="COMPLETED" exit-status="COMPLETED_AT_DECIDER"/>
  <stop on="MY_TEMP_ISSUE_EXIT_STATUS" restart="step2"/>
</decision>
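
The referenced Decider artifact returns an exit status string, which the decision element matches against the on attributes of its child elements. The following minimal sketch fails the job if the previous step failed; the logic is illustrative:

@Dependent
@Named("MyDeciderImpl")
public class MyDeciderImpl implements Decider {
    @Override
    public String decide(StepExecution[] executions) throws Exception {
        /* Base the decision on the exit status of the previous step */
        for (StepExecution stepExec : executions) {
            if ("FAILED".equals(stepExec.getExitStatus())) {
                return "FAILED";
            }
        }
        return "COMPLETED";
    }
}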

Creating Batch Artifacts

After you define a job in terms of its batch artifacts using the Job Specification Language (JSL), you create these artifacts as Java classes that implement the interfaces in the jakarta.batch.api package and its subpackages.

This section lists the main batch artifact interfaces, demonstrates how to access context objects from the batch runtime, and provides some examples.

Batch Artifact Interfaces

The following tables list the interfaces that you implement to create batch artifacts. The interface implementations are referenced from the elements described in Using the Job Specification Language.

Main Batch Artifact Interfaces lists the interfaces to implement batch artifacts for chunk steps, task steps, and decision elements.

Partition Batch Artifact Interfaces lists the interfaces to implement batch artifacts for partitioned steps.

Listener Batch Artifact Interfaces lists the interfaces to implement batch artifacts for job and step listeners.

Main Batch Artifact Interfaces

  • jakarta.batch.api.Batchlet: Implements the business logic of a task-oriented step. It is referenced from the batchlet element.

  • jakarta.batch.api.Decider: Decides the next step, flow, or split to execute based on information from the execution of the previous step, flow, or split. It is referenced from the decision element.

  • jakarta.batch.api.chunk.CheckpointAlgorithm: Implements a custom checkpoint policy for chunk steps. It is referenced from the checkpoint-algorithm element inside the chunk element.

  • jakarta.batch.api.chunk.ItemReader: Reads items from an input source in a chunk step. It is referenced from the reader element inside the chunk element.

  • jakarta.batch.api.chunk.ItemProcessor: Processes input items to obtain output items in chunk steps. It is referenced from the processor element inside the chunk element.

  • jakarta.batch.api.chunk.ItemWriter: Writes output items in chunk steps. It is referenced from the writer element inside the chunk element.

Partition Batch Artifact Interfaces

  • jakarta.batch.api.partition.PartitionPlan: Provides details on how to execute a partitioned step, such as the number of partitions, the number of threads, and the parameters for each partition. This artifact is not referenced directly from the job definition file.

  • jakarta.batch.api.partition.PartitionMapper: Provides a PartitionPlan object. It is referenced from the mapper element inside the partition element.

  • jakarta.batch.api.partition.PartitionReducer: Receives control when a partitioned step begins, ends, or rolls back. It is referenced from the reducer element inside the partition element.

  • jakarta.batch.api.partition.PartitionCollector: Sends intermediary results from each partition to a partition analyzer. It is referenced from the collector element inside the partition element.

  • jakarta.batch.api.partition.PartitionAnalyzer: Processes data and final results from each partition. It is referenced from the analyzer element inside the partition element.

Listener Batch Artifact Interfaces

  • jakarta.batch.api.listener.JobListener: Intercepts job execution before and after running a job. It is referenced from the listener element inside the job element.

  • jakarta.batch.api.listener.StepListener: Intercepts step execution before and after running a step. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.ChunkListener: Intercepts chunk processing in chunk steps before and after processing each chunk, and on errors. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.ItemReadListener: Intercepts item reading in chunk steps before and after reading each item, and on errors. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.ItemProcessListener: Intercepts item processing in chunk steps before and after processing each item, and on errors. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.ItemWriteListener: Intercepts item writing in chunk steps before and after writing each item, and on errors. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.RetryReadListener: Intercepts retry item reading in chunk steps when an exception occurs. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.RetryProcessListener: Intercepts retry item processing in chunk steps when an exception occurs. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.RetryWriteListener: Intercepts retry item writing in chunk steps when an exception occurs. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.SkipReadListener: Intercepts skippable exception handling for item readers in chunk steps. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.SkipProcessListener: Intercepts skippable exception handling for item processors in chunk steps. It is referenced from the listener element inside the step element.

  • jakarta.batch.api.chunk.listener.SkipWriteListener: Intercepts skippable exception handling for item writers in chunk steps. It is referenced from the listener element inside the step element.

Dependency Injection in Batch Artifacts

To ensure that Jakarta Contexts and Dependency Injection (CDI) works in your batch artifacts, follow these steps.

  1. Define your batch artifact implementations as CDI named beans using the Named annotation.

    For example, define an item reader implementation in a chunk step as follows:

    @Named("MyItemReaderImpl")
    public class MyItemReaderImpl implements ItemReader {
        /* ... Override the ItemReader interface methods ... */
    }
  2. Provide a public, empty, no-argument constructor for your batch artifacts.

    For example, provide the following constructor for the artifact above:

    public MyItemReaderImpl() {}
  3. Specify the CDI name for the batch artifacts in the job definition file, instead of using the fully qualified name of the class.

    For example, define the step for the artifact above as follows:

    <step id="stepA" next="stepB">
      <chunk>
        <reader ref="MyItemReaderImpl"></reader>
        ...
      </chunk>
    </step>

    This example uses the CDI name (MyItemReaderImpl) instead of the fully qualified name of the class (com.example.pkg.MyItemReaderImpl) to specify a batch artifact.

  4. Ensure that your module is a CDI bean archive by annotating your batch artifacts with the jakarta.enterprise.context.Dependent annotation or by including an empty beans.xml deployment descriptor with your application. For example, the following batch artifact is annotated with @Dependent:

    @Dependent
    @Named("MyItemReaderImpl")
    public class MyItemReaderImpl implements ItemReader { ... }

Jakarta Contexts and Dependency Injection (CDI) is required in order to access context objects from the batch runtime in batch artifacts.

You may encounter the following errors if you do not follow this procedure.

  • The batch runtime cannot locate some batch artifacts.

  • The batch artifacts throw null pointer exceptions when accessing injected objects.

Using the Context Objects from the Batch Runtime

The batch runtime provides context objects that implement the JobContext and StepContext interfaces in the jakarta.batch.runtime.context package. These objects are associated with the current job and step, respectively, and enable you to do the following:

  • Get information from the current job or step, such as its name, instance ID, execution ID, batch status, and exit status

  • Set the user-defined exit status

  • Store user data

  • Get property values from the job or step definition

You can inject context objects from the batch runtime inside batch artifact implementations like item readers, item processors, item writers, batchlets, listeners, and so on. The following example demonstrates how to access property values from the job definition file in an item reader implementation:

@Dependent
@Named("MyItemReaderImpl")
public class MyItemReaderImpl implements ItemReader {
    @Inject
    JobContext jobCtx;

    public MyItemReaderImpl() {}

    @Override
    public void open(Serializable checkpoint) throws Exception {
        String fileName = jobCtx.getProperties()
                                .getProperty("log_file_name");
        ...
    }
    ...
}

See Dependency Injection in Batch Artifacts for instructions on how to define your batch artifacts to use dependency injection.

Do not access batch context objects inside artifact constructors.

Because the job does not run until you submit it to the batch runtime, the batch context objects are not available when CDI instantiates your artifacts upon loading your application. The instantiation of these beans fails and the batch runtime cannot find your batch artifacts when your application submits the job.

Submitting Jobs to the Batch Runtime

The JobOperator interface in the jakarta.batch.operations package enables you to submit jobs to the batch runtime and obtain information about existing jobs. This interface provides the following functionality.

  • Obtain the names of all known jobs.

  • Start, stop, restart, and abandon jobs.

  • Obtain job instances and job executions.

The BatchRuntime class in the jakarta.batch.runtime package provides the getJobOperator factory method to obtain JobOperator objects.

Starting a Job

The following example code demonstrates how to obtain a JobOperator object and submit a batch job:

JobOperator jobOperator = BatchRuntime.getJobOperator();
Properties props = new Properties();
props.setProperty("parameter1", "value1");
...
long execID = jobOperator.start("simplejob", props);

The first argument of the JobOperator.start method is the name of the job as specified in its job definition file. The second argument is a Properties object that represents the parameters for this job execution. You can use job parameters to pass information to a job that is known only at runtime.
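
These parameters can later be retrieved from the batch runtime. The following minimal sketch reuses jobOperator and execID from the code above:

Properties params = jobOperator.getParameters(execID);
String value = params.getProperty("parameter1");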

Checking the Status of a Job

The JobExecution interface in the jakarta.batch.runtime package provides methods to obtain information about submitted jobs. This interface provides the following functionality.

  • Obtain the batch and exit status of a job execution.

  • Obtain the time the execution was started, updated, or ended.

  • Obtain the job name.

  • Obtain the execution ID.

The following example code demonstrates how to obtain the batch status of a job using its execution ID:

JobExecution jobExec = jobOperator.getJobExecution(execID);
String status = jobExec.getBatchStatus().toString();
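
The remaining JobExecution methods follow the same pattern. The following minimal sketch reuses jobExec from the code above:

System.out.println("Job name:    " + jobExec.getJobName());
System.out.println("Started:     " + jobExec.getStartTime());
System.out.println("Ended:       " + jobExec.getEndTime());
System.out.println("Exit status: " + jobExec.getExitStatus());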

Invoking the Batch Runtime in Your Application

The component from which you invoke the batch runtime depends on the architecture of your particular application. For example, you can invoke the batch runtime from an enterprise bean, a servlet, a managed bean, and so on.

See The webserverlog Example Application and The phonebilling Example Application for details on how to invoke the batch runtime from a managed bean driven by a Jakarta Faces user interface.

Packaging Batch Applications

Job definition files and batch artifacts do not require separate packaging and can be included in any Jakarta EE application.

Package the batch artifact classes with the rest of the classes of your application, and include the job definition files in one of the following directories:

  • META-INF/batch-jobs/ for jar packages

  • WEB-INF/classes/META-INF/batch-jobs/ for war packages

The name of each job definition file must match its job ID. For example, if you define a job as follows, and you are packaging your application as a WAR file, include the job definition file in WEB-INF/classes/META-INF/batch-jobs/simplejob.xml:

<job id="simplejob" xmlns="https://jakarta.ee/xml/ns/jakartaee"
                    version="2.0">
  ...
</job>

The webserverlog Example Application

The webserverlog example application, located in the jakartaee-examples/tutorial/batch/webserverlog/ directory, demonstrates how to use the batch framework in Jakarta EE to analyze the log file from a web server. This example application reads a log file and finds what percentage of page views from tablet devices are product sales.

Architecture of the webserverlog Example Application

The webserverlog example application consists of the following elements:

  • A job definition file (webserverlog.xml) that uses the Job Specification Language (JSL) to define a batch job with a chunk step and a task step. The chunk step acts as a filter, and the task step calculates statistics on the remaining entries.

  • A log file (log1.txt) that serves as input data to the batch job.

  • Two Java classes (LogLine and LogFilteredLine) that represent input items and output items for the chunk step.

  • Three batch artifacts (LogLineReader, LogLineProcessor, and LogFilteredLineWriter) that implement the chunk step of the application. This step reads items from the web server log file, filters them by the web browser used by the client, and writes the results to a text file.

  • Two batch artifacts (InfoJobListener and InfoItemProcessListener) that implement two simple listeners.

  • A batch artifact (MobileBatchlet.java) that calculates statistics on the filtered items.

  • Two Facelets pages (index.xhtml and jobstarted.xhtml) that provide the front end of the batch application. The first page shows the log file that will be processed by the batch job, and the second page enables the user to check on the status of the job and shows the results.

  • A managed bean (JsfBean) that is accessed from the Facelets pages. The bean submits the job to the batch runtime, checks on the status of the job, and reads the results from a text file.

The Job Definition File

The webserverlog.xml job definition file is located in the WEB-INF/classes/META-INF/batch-jobs/ directory. The file specifies seven job-level properties and two steps:

<?xml version="1.0" encoding="UTF-8"?>
<job id="webserverlog" xmlns="https://jakarta.ee/xml/ns/jakartaee"
     version="2.0">
    <properties>
        <property name="log_file_name" value="log1.txt"/>
        <property name="filtered_file_name" value="filtered1.txt"/>
        <property name="num_browsers" value="2"/>
        <property name="browser_1" value="Tablet Browser D"/>
        <property name="browser_2" value="Tablet Browser E"/>
        <property name="buy_page" value="/auth/buy.html"/>
        <property name="out_file_name" value="result1.txt"/>
    </properties>
    <listeners>
        <listener ref="InfoJobListener"/>
    </listeners>
    <step id="mobilefilter" next="mobileanalyzer"> ... </step>
    <step id="mobileanalyzer"> ... </step>
</job>

The first step is defined as follows:

<step id="mobilefilter" next="mobileanalyzer">
    <listeners>
        <listener ref="InfoItemProcessListener"/>
    </listeners>
    <chunk checkpoint-policy="item" item-count="10">
        <reader ref="LogLineReader"></reader>
        <processor ref="LogLineProcessor"></processor>
        <writer ref="LogFilteredLineWriter"></writer>
    </chunk>
</step>

This step is a normal chunk step that specifies the batch artifacts that implement each phase of the step. The batch artifact names are not fully qualified class names, so the batch artifacts are CDI beans annotated with @Named.

The second step is defined as follows:

<step id="mobileanalyzer">
    <batchlet ref="MobileBatchlet"></batchlet>
    <end on="COMPLETED"/>
</step>

This step is a task step that specifies the batch artifact that implements it. This is the last step of the job.

The LogLine and LogFilteredLine Items

The LogLine class represents entries in the web server log file and it is defined as follows:

public class LogLine {
    private final String datetime;
    private final String ipaddr;
    private final String browser;
    private final String url;

    /* ... Constructor, getters, and setters ... */
}

The LogFilteredLine class is similar to this class but only has two fields: the IP address of the client and the URL.

The Chunk Step Batch Artifacts

The first step is composed of the LogLineReader, LogLineProcessor, and LogFilteredLineWriter batch artifacts.

The LogLineReader artifact reads records from the web server log file:

@Dependent
@Named("LogLineReader")
public class LogLineReader implements ItemReader {
    private ItemNumberCheckpoint checkpoint;
    private String fileName;
    private BufferedReader breader;
    @Inject
    private JobContext jobCtx;

    public LogLineReader() { }

    /* ... Override the open, close, readItem, and
     *     checkpointInfo methods ... */
}

The open method reads the log_file_name property and opens the log file with a buffered reader. In this example, the log file has been included with the application under webserverlog/WEB-INF/classes/log1.txt:

fileName = jobCtx.getProperties().getProperty("log_file_name");
ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
InputStream iStream = classLoader.getResourceAsStream(fileName);
breader = new BufferedReader(new InputStreamReader(iStream));

If a checkpoint object is provided, the open method advances the reader up to the last checkpoint. Otherwise, this method creates a new checkpoint object. The checkpoint object keeps track of the line number from the last committed chunk.

The readItem method returns a new LogLine object or null at the end of the log file:

@Override
public Object readItem() throws Exception {
    String entry = breader.readLine();
    if (entry != null) {
        checkpoint.nextLine();
        return new LogLine(entry);
    } else {
        return null;
    }
}

The LogLineProcessor artifact obtains a list of browsers from the job properties and filters the log entries according to the list:

@Override
public Object processItem(Object item) {
    /* Obtain a list of browsers we are interested in */
    if (nbrowsers == 0) {
        Properties props = jobCtx.getProperties();
        nbrowsers = Integer.parseInt(props.getProperty("num_browsers"));
        browsers = new String[nbrowsers];
        for (int i = 1; i < nbrowsers + 1; i++)
            browsers[i - 1] = props.getProperty("browser_" + i);
    }

    LogLine logline = (LogLine) item;
    /* Filter for only the mobile/tablet browsers as specified */
    for (int i = 0; i < nbrowsers; i++) {
        if (logline.getBrowser().equals(browsers[i])) {
            return new LogFilteredLine(logline);
        }
    }
    return null;
}

The LogFilteredLineWriter artifact reads the name of the output file from the job properties. The open method opens the file for writing. If a checkpoint object is provided, the artifact continues writing at the end of the file; otherwise, it overwrites the file if it exists. The writeItems method writes filtered items to the output file:

@Override
public void writeItems(List<Object> items) throws Exception {
    /* Write the filtered lines to the output file */
    for (int i = 0; i < items.size(); i++) {
        LogFilteredLine filtLine = (LogFilteredLine) items.get(i);
        bwriter.write(filtLine.toString());
        bwriter.newLine();
    }
}
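The append-or-overwrite behavior of the open method could be implemented as in the following sketch; the output property name here is a hypothetical placeholder, not the exact name used by the example:

@Override
public void open(Serializable ckpt) throws Exception {
    /* "filtered_file_name" is a hypothetical property name */
    fileName = jobCtx.getProperties().getProperty("filtered_file_name");
    /* Append when restarting from a checkpoint; otherwise overwrite */
    boolean append = (ckpt != null);
    bwriter = new BufferedWriter(new FileWriter(fileName, append));
    checkpoint = append ? (ItemNumberCheckpoint) ckpt
                        : new ItemNumberCheckpoint();
}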

The Listener Batch Artifacts

The InfoJobListener batch artifact implements a simple listener that writes log messages when the job starts and when it ends:

@Dependent
@Named("InfoJobListener")
public class InfoJobListener implements JobListener {
    ...
    @Override
    public void beforeJob() throws Exception {
        logger.log(Level.INFO, "The job is starting");
    }

    @Override
    public void afterJob() throws Exception { ... }
}

The InfoItemProcessListener batch artifact implements the ItemProcessListener interface for chunk steps:

@Dependent
@Named("InfoItemProcessListener")
public class InfoItemProcessListener implements ItemProcessListener {
    ...
    @Override
    public void beforeProcess(Object o) throws Exception {
        LogLine logline = (LogLine) o;
        logger.log(Level.INFO, "Processing entry {0}", logline);
    }
    ...
}

The Task Step Batch Artifact

The task step is implemented by the MobileBatchlet artifact, which computes what percentage of the filtered log entries are purchases:

@Override
public String process() throws Exception {
    /* Get properties from the job definition file */
    ...
    /* Count from the output of the previous chunk step */
    breader = new BufferedReader(new FileReader(fileName));
    String line = breader.readLine();
    while (line != null) {
        String[] lineSplit = line.split(", ");
        if (buyPage.compareTo(lineSplit[1]) == 0)
            pageVisits++;
        totalVisits++;
        line = breader.readLine();
    }
    breader.close();
    /* Write the result */
    ...
}

The Jakarta Faces Pages

The index.xhtml page contains a text area that shows the web server log. The page provides a button for the user to submit the batch job and navigate to the next page:

<body>
    ...
    <textarea cols="90" rows="25"
              readonly="true">#{jsfBean.getInputLog()}</textarea>
    <p> </p>
    <h:form>
        <h:commandButton value="Start Batch Job"
                         action="#{jsfBean.startBatchJob()}" />
    </h:form>
</body>

This page calls the methods of the managed bean to show the log file and submit the batch job.

The jobstarted.xhtml page provides a button to check the current status of the batch job and displays the results when the job finishes:

<p>Current Status of the Job: <b>#{jsfBean.jobStatus}</b></p>
<p>#{jsfBean.showResults()}</p>
<h:form>
    <h:commandButton value="Check Status"
                     action="jobstarted"
                     rendered="#{jsfBean.completed==false}" />
</h:form>

The Managed Bean

The JsfBean managed bean submits the job to the batch runtime, checks on the status of the job, and reads the results from a text file.

The startBatchJob method submits the job to the batch runtime:

/* Submit the batch job to the batch runtime.
 * JSF Navigation method (return the name of the next page) */
public String startBatchJob() {
    jobOperator = BatchRuntime.getJobOperator();
    execID = jobOperator.start("webserverlog", null);
    return "jobstarted";
}

The getJobStatus method checks the status of the job:

/* Get the status of the job from the batch runtime */
public String getJobStatus() {
    return jobOperator.getJobExecution(execID).getBatchStatus().toString();
}

The showResults method reads the results from a text file.
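A minimal sketch of showResults, assuming the bean tracks the output file of the chunk step in a hypothetical filteredFileName field, might look like this:

public String showResults() throws IOException {
    StringBuilder results = new StringBuilder();
    /* filteredFileName is a hypothetical field holding the
     * name of the output file written by the chunk step */
    try (BufferedReader reader = new BufferedReader(
            new FileReader(filteredFileName))) {
        String line;
        while ((line = reader.readLine()) != null) {
            results.append(line).append('\n');
        }
    }
    return results.toString();
}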

Running the webserverlog Example Application

You can use either NetBeans IDE or Maven to build, package, deploy, and run the webserverlog example application.

To Run the webserverlog Example Application Using NetBeans IDE

  1. Make sure that GlassFish Server has been started (see Starting and Stopping GlassFish Server).

  2. From the File menu, choose Open Project.

  3. In the Open Project dialog box, navigate to:

    jakartaee-examples/tutorial/batch
  4. Select the webserverlog folder.

  5. Click Open Project.

  6. In the Projects tab, right-click the webserverlog project and select Run.

    This command builds and packages the application into a WAR file, webserverlog.war, located in the target/ directory; deploys it to the server; and launches a web browser window at the following URL:

    http://localhost:8080/webserverlog/

To Run the webserverlog Example Application Using Maven

  1. Make sure that GlassFish Server has been started (see Starting and Stopping GlassFish Server).

  2. In a terminal window, go to:

    jakartaee-examples/tutorial/batch/webserverlog/
  3. Enter the following command to deploy the application:

    mvn install
  4. Open a web browser window at the following URL:

    http://localhost:8080/webserverlog/

The phonebilling Example Application

The phonebilling example application, located in the jakartaee-examples/tutorial/batch/phonebilling/ directory, demonstrates how to use the batch framework in Jakarta EE to implement a phone billing system. This example application processes a log file of phone calls and creates a bill for each customer.

Architecture of the phonebilling Example Application

The phonebilling example application consists of the following elements.

  • A job definition file (phonebilling.xml) that uses the Job Specification Language (JSL) to define a batch job with two chunk steps. The first step reads call records from a log file and associates them with a bill. The second step computes the amount due and writes each bill to a text file.

  • A Java class (CallRecordLogCreator) that creates the log file for the batch job. This is an auxiliary component that does not demonstrate any key functionality in this example.

  • Two Jakarta Persistence entities (CallRecord and PhoneBill) that represent call records and customer bills. The application uses a Jakarta Persistence entity manager to store instances of these entities in a database.

  • Three batch artifacts (CallRecordReader, CallRecordProcessor, and CallRecordWriter) that implement the first step of the application. This step reads call records from the log file, associates them with a bill, and stores them in a database.

  • Four batch artifacts (BillReader, BillProcessor, BillWriter, and BillPartitionMapper) that implement the second step of the application. This step is a partitioned step that gets each bill from the database, calculates the amount due, and writes it to a text file.

  • Two Facelets pages (index.xhtml and jobstarted.xhtml) that provide the front end of the batch application. The first page shows the log file that will be processed by the batch job, and the second page enables the user to check on the status of the job and shows the resulting bill for each customer.

  • A managed bean (JsfBean) that is accessed from the Facelets pages. The bean submits the job to the batch runtime, checks on the status of the job, and reads the text files for each bill.

The Job Definition File

The phonebilling.xml job definition file is located in the WEB-INF/classes/META-INF/batch-jobs/ directory. The file specifies three job-level properties and two steps:

<?xml version="1.0" encoding="UTF-8"?>
<job id="phonebilling" xmlns="https://jakarta.ee/xml/ns/jakartaee"
     version="2.0">
    <properties>
        <property name="log_file_name" value="log1.txt"/>
        <property name="airtime_price" value="0.08"/>
        <property name="tax_rate" value="0.07"/>
    </properties>
    <step id="callrecords" next="bills"> ... </step>
    <step id="bills"> ... </step>
</job>

The first step is defined as follows:

<step id="callrecords" next="bills">
    <chunk checkpoint-policy="item" item-count="10">
        <reader ref="CallRecordReader"></reader>
        <processor ref="CallRecordProcessor"></processor>
        <writer ref="CallRecordWriter"></writer>
    </chunk>
</step>

This step is a normal chunk step that specifies the batch artifacts that implement each phase of the step. The batch artifact names are not fully qualified class names, so the batch artifacts are CDI beans annotated with @Named.

The second step is defined as follows:

<step id="bills">
    <chunk checkpoint-policy="item" item-count="2">
        <reader ref="BillReader">
            <properties>
                <property name="firstItem" value="#{partitionPlan['firstItem']}"/>
                <property name="numItems" value="#{partitionPlan['numItems']}"/>
            </properties>
        </reader>
        <processor ref="BillProcessor"></processor>
        <writer ref="BillWriter"></writer>
    </chunk>
    <partition>
        <mapper ref="BillPartitionMapper"/>
    </partition>
    <end on="COMPLETED"/>
</step>

This step is a partitioned chunk step. The partition plan is specified through the BillPartitionMapper artifact instead of using the plan element.
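A partition mapper implements the PartitionMapper interface and returns a partition plan from its mapPartitions method. The following sketch uses a fixed partition count and placeholder item ranges; the actual example derives both from the number of bills in the database:

@Dependent
@Named("BillPartitionMapper")
public class BillPartitionMapper implements PartitionMapper {

    @Override
    public PartitionPlan mapPartitions() throws Exception {
        return new PartitionPlanImpl() {
            @Override
            public int getPartitions() {
                /* Placeholder; the example computes this value
                 * from the bill count in the database */
                return 2;
            }

            @Override
            public Properties[] getPartitionProperties() {
                /* Placeholder ranges; see the full listing in
                 * The Phone Billing Chunk Step */
                Properties[] props = new Properties[getPartitions()];
                for (int i = 0; i < props.length; i++) {
                    props[i] = new Properties();
                    props[i].setProperty("firstItem", String.valueOf(i * 10));
                    props[i].setProperty("numItems", String.valueOf(10));
                }
                return props;
            }
        };
    }
}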

The CallRecord and PhoneBill Entities

The CallRecord entity is defined as follows:

@Entity
public class CallRecord implements Serializable {
    @Id @GeneratedValue
    private Long id;
    @Temporal(TemporalType.DATE)
    private Date datetime;
    private String fromNumber;
    private String toNumber;
    private int minutes;
    private int seconds;
    private BigDecimal price;

    public CallRecord() { }

    public CallRecord(String datetime, String from,
            String to, int min, int sec) throws ParseException { ... }

    public CallRecord(String jsonData) throws ParseException { ... }

    /* ... Getters and setters ... */
}

The value of the id field is generated automatically by the Jakarta Persistence implementation and is used to store and retrieve CallRecord objects to and from the database.

The CallRecord(String jsonData) constructor creates a CallRecord object from an entry of JSON data in the log file using Jakarta JSON Processing. Log entries look as follows:

{"datetime":"03/01/2013 04:03","from":"555-0101",
"to":"555-0114","length":"03:39"}

The PhoneBill entity is defined as follows:

@Entity
public class PhoneBill implements Serializable {
    @Id
    private String phoneNumber;
    @OneToMany(fetch = FetchType.EAGER, cascade = CascadeType.PERSIST)
    @OrderBy("datetime ASC")
    private List<CallRecord> calls;
    private BigDecimal amountBase;
    private BigDecimal taxRate;
    private BigDecimal tax;
    private BigDecimal amountTotal;

    public PhoneBill() { }

    public PhoneBill(String number) {
        this.phoneNumber = number;
        calls = new ArrayList<>();
    }

    public void addCall(CallRecord call) {
        calls.add(call);
    }

    public void calculate(BigDecimal taxRate) { ... }

    /* ... Getters and setters ... */
}

The OneToMany annotation defines the relationship between a bill and its call records. The FetchType.EAGER attribute specifies that the collection should be retrieved eagerly. The CascadeType.PERSIST attribute indicates that the elements in the call list should be automatically persisted when the phone bill is persisted. The OrderBy annotation defines an order for retrieving the elements of the call list from the database.

The batch artifacts use instances of these two entities as items to read, process, and write.

For more information on Jakarta Persistence, see Introduction to Jakarta Persistence. For more information on Jakarta JSON Processing, see JSON Processing.

The Call Records Chunk Step

The first step is composed of the CallRecordReader, CallRecordProcessor, and CallRecordWriter batch artifacts.

The CallRecordReader artifact reads call records from the log file:

@Dependent
@Named("CallRecordReader")
public class CallRecordReader implements ItemReader {
    private ItemNumberCheckpoint checkpoint;
    private String fileName;
    private BufferedReader breader;
    @Inject
    JobContext jobCtx;

    /* ... Override the open, close, readItem,
     *     and checkpointInfo methods ... */
}

The open method reads the log_file_name property and opens the log file with a buffered reader:

fileName = jobCtx.getProperties().getProperty("log_file_name");
breader = new BufferedReader(new FileReader(fileName));

If a checkpoint object is provided, the open method advances the reader up to the last checkpoint. Otherwise, this method creates a new checkpoint object. The checkpoint object keeps track of the line number from the last committed chunk.

The readItem method returns a new CallRecord object or null at the end of the log file:

@Override
public Object readItem() throws Exception {
    /* Read a line from the log file and
     * create a CallRecord from JSON */
    String callEntryJson = breader.readLine();
    if (callEntryJson != null) {
        checkpoint.nextItem();
        return new CallRecord(callEntryJson);
    } else
        return null;
}

The CallRecordProcessor artifact obtains the airtime price from the job properties, calculates the price of each call, and returns the call object. This artifact overrides only the processItem method.
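A sketch of that processItem method, assuming an injected JobContext named jobCtx and a hypothetical airtimePrice field that caches the property after the first read:

@Override
public Object processItem(Object item) throws Exception {
    CallRecord call = (CallRecord) item;
    /* airtimePrice is a hypothetical field caching the job property */
    if (airtimePrice == null) {
        airtimePrice = new BigDecimal(
                jobCtx.getProperties().getProperty("airtime_price"));
    }
    /* Price the call at airtime_price per minute, prorating seconds */
    BigDecimal durationInMinutes = new BigDecimal(call.getMinutes())
            .add(new BigDecimal(call.getSeconds())
                    .divide(new BigDecimal(60), 4, RoundingMode.HALF_UP));
    call.setPrice(airtimePrice.multiply(durationInMinutes)
            .setScale(2, RoundingMode.HALF_UP));
    return call;
}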

The CallRecordWriter artifact associates each call record with a bill and stores the bill in the database. This artifact overrides the open, close, writeItems, and checkpointInfo methods. The writeItems method looks like this:

@Override
public void writeItems(List<Object> callList) throws Exception {

    for (Object callObject : callList) {
        CallRecord call = (CallRecord) callObject;
        PhoneBill bill = em.find(PhoneBill.class, call.getFromNumber());
        if (bill == null) {
            /* No bill for this customer yet, create one */
            bill = new PhoneBill(call.getFromNumber());
            bill.addCall(call);
            em.persist(bill);
        } else {
            /* Add call to existing bill */
            bill.addCall(call);
        }
    }
}

The Phone Billing Chunk Step

The second step is composed of the BillReader, BillProcessor, BillWriter, and BillPartitionMapper batch artifacts. This step gets the phone bills from the database, computes the tax and total amount due, and writes each bill to a text file. Since the processing of each bill is independent of the others, this step can be partitioned and run in more than one thread.

The BillPartitionMapper artifact specifies the number of partitions and the parameters for each partition. In this example, the parameters represent the range of items each partition should process. The artifact obtains the number of bills in the database to calculate these ranges. It provides a partition plan object that overrides the getPartitions and getPartitionProperties methods of the PartitionPlan interface. The getPartitionProperties method looks like this:

@Override
public Properties[] getPartitionProperties() {
    /* Assign an (approximately) equal number of elements
     * to each partition. */
    long totalItems = getBillCount();
    long partItems = totalItems / getPartitions();
    long remItems = totalItems % getPartitions();

    /* Populate a Properties array. Each Properties element
     * in the array corresponds to a partition. */
    Properties[] props = new Properties[getPartitions()];

    for (int i = 0; i < getPartitions(); i++) {
        props[i] = new Properties();
        props[i].setProperty("firstItem",
                String.valueOf(i * partItems));
        /* Last partition gets the remainder elements */
        if (i == getPartitions() - 1) {
            props[i].setProperty("numItems",
                    String.valueOf(partItems + remItems));
        } else {
            props[i].setProperty("numItems",
                    String.valueOf(partItems));
        }
    }
    return props;
}

The BillReader artifact obtains the partition parameters as follows:

@Dependent
@Named("BillReader")
public class BillReader implements ItemReader {

    @Inject @BatchProperty(name = "firstItem")
    private String firstItemValue;
    @Inject @BatchProperty(name = "numItems")
    private String numItemsValue;
    private ItemNumberCheckpoint checkpoint;
    @PersistenceContext
    private EntityManager em;
    private Iterator iterator;

    @Override
    public void open(Serializable ckpt) throws Exception {
        /* Get the range of items to work on in this partition */
        long firstItem0 = Long.parseLong(firstItemValue);
        long numItems0 = Long.parseLong(numItemsValue);

        if (ckpt == null) {
            /* Create a checkpoint object for this partition */
            checkpoint = new ItemNumberCheckpoint();
            checkpoint.setItemNumber(firstItem0);
            checkpoint.setNumItems(numItems0);
        } else {
            checkpoint = (ItemNumberCheckpoint) ckpt;
        }

        /* Adjust range for this partition from the checkpoint */
        long firstItem = checkpoint.getItemNumber();
        long numItems = numItems0 - (firstItem - firstItem0);
        ...
    }
    ...
}

This artifact also obtains an iterator to read items from the Jakarta Persistence entity manager:

/* Obtain an iterator for the bills in this partition */
String query = "SELECT b FROM PhoneBill b ORDER BY b.phoneNumber";
Query q = em.createQuery(query).setFirstResult((int) firstItem)
        .setMaxResults((int) numItems);
iterator = q.getResultList().iterator();

The BillProcessor artifact iterates over the list of call records in a bill and calculates the tax and total amount due for each bill.
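Its processItem method can delegate the arithmetic to the calculate method of the PhoneBill entity shown earlier; a sketch, assuming an injected JobContext named jobCtx:

@Override
public Object processItem(Object item) throws Exception {
    PhoneBill bill = (PhoneBill) item;
    /* tax_rate is the job-level property from phonebilling.xml */
    BigDecimal taxRate = new BigDecimal(
            jobCtx.getProperties().getProperty("tax_rate"));
    bill.calculate(taxRate);
    return bill;
}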

The BillWriter artifact writes each bill to a plain text file.
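A sketch of its writeItems method, assuming one output file per phone number and a toString method on PhoneBill that formats the bill (both are assumptions, not the exact tutorial code):

@Override
public void writeItems(List<Object> billList) throws Exception {
    for (Object billObject : billList) {
        PhoneBill bill = (PhoneBill) billObject;
        /* Hypothetical naming scheme: one text file per number */
        try (BufferedWriter writer = new BufferedWriter(
                new FileWriter(bill.getPhoneNumber() + ".txt"))) {
            writer.write(bill.toString());
            writer.newLine();
        }
    }
}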

The Jakarta Faces Pages

The index.xhtml page contains a text area that shows the log file of call records. The page provides a button for the user to submit the batch job and navigate to the next page:

<body>
    <h1>The Phone Billing Example Application</h1>
    <h2>Log file</h2>
    <p>The batch job analyzes the following log file:</p>
    <textarea cols="90" rows="25"
              readonly="true">#{jsfBean.createAndShowLog()}</textarea>
    <p> </p>
    <h:form>
        <h:commandButton value="Start Batch Job"
                         action="#{jsfBean.startBatchJob()}" />
    </h:form>
</body>

This page calls the methods of the managed bean to show the log file and submit the batch job.

The jobstarted.xhtml page provides a button to check the current status of the batch job and displays the bills when the job finishes:

<p>Current Status of the Job: <b>#{jsfBean.jobStatus}</b></p>
<h:dataTable var="_row" value="#{jsfBean.rowList}"
             border="1" rendered="#{jsfBean.completed}">
    <!-- ... show results from jsfBean.rowList ... -->
</h:dataTable>
<!-- Render the check status button if the job has not finished -->
<h:form>
    <h:commandButton value="Check Status"
                     rendered="#{jsfBean.completed==false}"
                     action="jobstarted" />
</h:form>

The Managed Bean

The JsfBean managed bean submits the job to the batch runtime, checks on the status of the job, and reads the text files for each bill.

The startBatchJob method of the bean submits the job to the batch runtime:

/* Submit the batch job to the batch runtime.
 * JSF Navigation method (return the name of the next page) */
public String startBatchJob() {
    jobOperator = BatchRuntime.getJobOperator();
    execID = jobOperator.start("phonebilling", null);
    return "jobstarted";
}

The getJobStatus method of the bean checks the status of the job:

/* Get the status of the job from the batch runtime */
public String getJobStatus() {
    return jobOperator.getJobExecution(execID).getBatchStatus().toString();
}

The getRowList method of the bean creates a list of bills to be displayed on the jobstarted.xhtml faces page using a table.

Running the phonebilling Example Application

You can use either NetBeans IDE or Maven to build, package, deploy, and run the phonebilling example application.

To Run the phonebilling Example Application Using NetBeans IDE

  1. Make sure that GlassFish Server has been started (see Starting and Stopping GlassFish Server).

  2. From the File menu, choose Open Project.

  3. In the Open Project dialog box, navigate to:

    jakartaee-examples/tutorial/batch
  4. Select the phonebilling folder.

  5. Click Open Project.

  6. In the Projects tab, right-click the phonebilling project and select Run.

    This command builds and packages the application into a WAR file, phonebilling.war, located in the target/ directory; deploys it to the server; and launches a web browser window at the following URL:

    http://localhost:8080/phonebilling/

To Run the phonebilling Example Application Using Maven

  1. Make sure that GlassFish Server has been started (see Starting and Stopping GlassFish Server).

  2. In a terminal window, go to:

    jakartaee-examples/tutorial/batch/phonebilling/
  3. Enter the following command to deploy the application:

    mvn install
  4. Open a web browser window at the following URL:

    http://localhost:8080/phonebilling/

Further Information about Batch Processing

For more information on batch processing in Jakarta EE, see Jakarta Batch:
https://jakarta.ee/specifications/batch/2.0/