Java 8 brought big changes to the language. Lambda expressions and streams, which add functional programming features, stand out among them. But with so many changes, it is easy to miss some details, such as the one we will go over in this post.

According to the Royal Spanish Academy, a collection is “an ordered set of things, usually belonging to the same class and gathered due to their special interest or value.” But what determines what is of ‘interest’ or ‘value’ to an individual?

From coins to tobacco packets, via ‘do not disturb’ signs, human beings have spent centuries collecting many different objects.

When we talk about our applications, Java provides us with the Collectors class, with which we can collect the data from a stream into a list, a map, etc. But what if we were to need something else? Here is where Java 8’s Collector interface comes into play.
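As a quick reminder of what the built-in collectors already cover, here is a minimal sketch. The class name BuiltInCollectorsDemo and its helper methods are made up for illustration; only the Collectors calls come from the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BuiltInCollectorsDemo {

    // Collects the transformed names into a List, keeping encounter order.
    static List<String> toUpperCaseList(List<String> names) {
        return names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // Groups the names into a Map keyed by their length.
    static Map<Integer, List<String>> groupByLength(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(String::length));
    }

    // Joins the names into a single comma-separated String.
    static String join(List<String> names) {
        return names.stream()
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Mario", "Juan", "Daniel");
        System.out.println(toUpperCaseList(names));
        System.out.println(groupByLength(names));
        System.out.println(join(names));
    }
}
```

Each of these covers a common shape of result. Anything more specific, such as a podium, needs a Collector of our own.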

This interface comes in handy when we need to group the data in a very specific manner. It also lets us process lists in parallel, since we will use parallelStream to invoke our custom Collector.

The Collectors class offers different ways to group and store the information from a Stream. Each of them is an implementation of the Collector interface.

Therefore, the Collector interface is going to allow us to set the data gathering and grouping rules in a manner that most closely matches our needs.

Since everything is understood better with an example, here is mine: Let us assume we have a list of runners from which we want to obtain the podium of winners. This can be done very easily by implementing our own Collector, which will return a Podium object with the three runners that finished the race in the shortest time.
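The Runner and Podium classes are not listed in this post, so the following is only a sketch of what they might look like. It assumes the field names seen in the sample output further down (surname is left out for brevity), that addPenalty adds the penalty to the raw time, and that runners are ordered by finish time, then penalty, then dorsal. All of it is hypothetical.

```java
import java.util.Comparator;
import java.util.Optional;

// Hypothetical sketch of the domain classes; not the original implementation.
class Runner implements Comparable<Runner> {

    // Assumed ordering: finish time, then penalty, then dorsal (bib number).
    private static final Comparator<Runner> ORDER =
            Comparator.comparingInt(Runner::getEndTime)
                    .thenComparingInt(Runner::getPenalty)
                    .thenComparingInt(Runner::getDorsal);

    private final int dorsal;
    private final String name;
    private final int time;
    private final int penalty;
    private int endTime;

    Runner(int dorsal, String name, int time, int penalty) {
        this.dorsal = dorsal;
        this.name = name;
        this.time = time;
        this.penalty = penalty;
        this.endTime = time; // no penalty applied yet
    }

    // Assumed behaviour: finish time = raw time + penalty (safe to call twice).
    void addPenalty() {
        endTime = time + penalty;
    }

    int getDorsal() { return dorsal; }
    String getName() { return name; }
    int getPenalty() { return penalty; }
    int getEndTime() { return endTime; }

    @Override
    public int compareTo(Runner other) {
        return ORDER.compare(this, other);
    }
}

// Holds up to three runners; empty positions are exposed as empty Optionals.
class Podium {
    private final Runner firstRunner;
    private final Runner secondRunner;
    private final Runner thirdRunner;

    Podium(Runner firstRunner, Runner secondRunner, Runner thirdRunner) {
        this.firstRunner = firstRunner;
        this.secondRunner = secondRunner;
        this.thirdRunner = thirdRunner;
    }

    Optional<Runner> getFirstRunner() { return Optional.ofNullable(firstRunner); }
    Optional<Runner> getSecondRunner() { return Optional.ofNullable(secondRunner); }
    Optional<Runner> getThirdRunner() { return Optional.ofNullable(thirdRunner); }
}
```

Returning Optional from the Podium getters is what allows a partially filled podium to be merged without null checks.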

How do we implement it?

Accumulator

The first thing we need to do is define our Accumulator, the mutable container that holds the intermediate result while the stream is being processed. In it we will implement the methods that our custom Collector will subsequently use:

  1. Accumulate method: Here we code the logic applied to each element of the stream. In this case, we first add the penalty to the runner’s time and then place the runner on the Podium if the time is good enough.
  2. Combiner method: This method merges two partial results. It receives another Accumulator, extracts the final object it contains (see the finisher method below) and merges it into the current Accumulator.

In our example, the runners of both Podia are ordered by finish time, then by penalty and, in the case of a tie, by number. The final Podium holds the 3 runners with the shortest times and fewest penalties.

  3. Finisher method: The finisher method returns the final object; in this case, the Podium.

public class RunnerAccumulator {

    // Best three runners seen so far; null while a position is still empty.
    private Runner firstRunner;
    private Runner secondRunner;
    private Runner thirdRunner;

    // Synchronized because a CONCURRENT collector may call this method
    // from several threads on the same container.
    public synchronized void accumulate(Runner runner) {
        runner.addPenalty();
        decidePositions(runner);
    }

    // Merges the podium held by another accumulator into this one.
    public synchronized RunnerAccumulator combine(RunnerAccumulator other) {
        Podium podium = other.finish();
        podium.getFirstRunner().ifPresent(this::decidePositions);
        podium.getSecondRunner().ifPresent(this::decidePositions);
        podium.getThirdRunner().ifPresent(this::decidePositions);
        return this;
    }

    public synchronized Podium finish() {
        return new Podium(firstRunner, secondRunner, thirdRunner);
    }

    // Inserts the runner at the right position, shifting slower runners down.
    private void decidePositions(Runner runner) {
        if (isFasterThan(runner, firstRunner)) {
            setFirstRunner(runner);
        } else if (isFasterThan(runner, secondRunner)) {
            setSecondRunner(runner);
        } else if (isFasterThan(runner, thirdRunner)) {
            thirdRunner = runner;
        }
    }

    private void setFirstRunner(Runner runner) {
        thirdRunner = secondRunner;
        secondRunner = firstRunner;
        firstRunner = runner;
    }

    private void setSecondRunner(Runner runner) {
        thirdRunner = secondRunner;
        secondRunner = runner;
    }

    // An empty position (null) is beaten by any runner.
    private static boolean isFasterThan(Runner runner, Runner accumulatorRunner) {
        return accumulatorRunner == null || runner.compareTo(accumulatorRunner) < 0;
    }
}

Collector interface

We must create our RunnerCollector class, which will implement the Collector interface. The interface takes three type parameters:

  1. The input type: the type of the elements in the list that we are going to process (in this case, Runner).
  2. The accumulation type: the mutable container used by the reduction operation, that is, the Accumulator that we have coded (in this case, RunnerAccumulator).
  3. The result type: the object produced by the reduction operation (in this case, Podium).

import java.util.EnumSet;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class RunnerCollector implements Collector<Runner, RunnerAccumulator, Podium> {

    @Override
    public Supplier<RunnerAccumulator> supplier() {
        return RunnerAccumulator::new;
    }

    @Override
    public BiConsumer<RunnerAccumulator, Runner> accumulator() {
        return RunnerAccumulator::accumulate;
    }

    @Override
    public BinaryOperator<RunnerAccumulator> combiner() {
        return RunnerAccumulator::combine;
    }

    @Override
    public Function<RunnerAccumulator, Podium> finisher() {
        return RunnerAccumulator::finish;
    }

    @Override
    public Set<Characteristics> characteristics() {
        return EnumSet.of(Characteristics.CONCURRENT);
    }
}

Once we have indicated the types that will take part in the Collector, the next thing we need to do is to implement the methods that are defined by the interface:

  1. supplier: In supplier, we will create the instance of RunnerAccumulator that we implemented before.
  2. accumulator: In accumulator, we will return the accumulate method that we defined in the implemented RunnerAccumulator.
  3. combiner: In combiner, we will return the combine method that we defined in the implemented RunnerAccumulator.
  4. finisher: As in the accumulator and combiner cases, in finisher we will return the finish method that we defined in our RunnerAccumulator.
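  5. characteristics: In characteristics, we declare the characteristics of our Collector. The Collector.Characteristics enum defines three of them: CONCURRENT (the accumulator function can be called concurrently on the same result container), UNORDERED (the result does not depend on the encounter order of the elements) and IDENTITY_FINISH (the finisher is the identity function and can be skipped).

We have used the CONCURRENT characteristic to indicate that our Collector is a concurrent collector, which means that the accumulator function may be called simultaneously from several threads on the same result container. Declaring it is a promise that the container is thread-safe. Note also that the stream only performs such a concurrent reduction when the stream or the Collector is unordered as well; otherwise each thread fills its own container and the partial results are merged with the combiner.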
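To get a feel for these flags, it can help to inspect the characteristics that some built-in collectors declare. The sketch below only calls JDK methods; the class name CharacteristicsDemo and its helpers are made up for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CharacteristicsDemo {

    // toList keeps encounter order and needs no finishing step.
    static Set<Collector.Characteristics> ofToList() {
        Collector<String, ?, List<String>> collector = Collectors.toList();
        return collector.characteristics();
    }

    // toSet does not care about encounter order.
    static Set<Collector.Characteristics> ofToSet() {
        Collector<String, ?, Set<String>> collector = Collectors.toSet();
        return collector.characteristics();
    }

    // toConcurrentMap accumulates into one shared, thread-safe map.
    static Set<Collector.Characteristics> ofToConcurrentMap() {
        Collector<String, ?, ConcurrentMap<String, Integer>> collector =
                Collectors.toConcurrentMap(s -> s, String::length);
        return collector.characteristics();
    }

    public static void main(String[] args) {
        System.out.println("toList: " + ofToList());
        System.out.println("toSet: " + ofToSet());
        System.out.println("toConcurrentMap: " + ofToConcurrentMap());
    }
}
```

Collectors that declare CONCURRENT in the JDK, such as toConcurrentMap, accumulate into a container that is itself thread-safe, which is the same obligation our own Accumulator takes on.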

Using our Collector

Since our Collector implementation supports concurrency, we will parallelize our runner stream (with parallelStream, or with parallel on an existing stream). With parallel execution, it is very important to implement the combine method correctly, because it merges the partial results produced by the different threads (on a sequential stream it is not invoked).


public static void main(String[] args) {
    // A Collector instance can be passed to collect() directly;
    // there is no need to rebuild it with Collector.of().
    Podium podium = getMockRunners().parallel()
            .collect(new RunnerCollector());
    System.out.println(podium);
}

The result of this execution is a Podium object with the three winning Runners:

1º Runner [dorsal=1, name=Mario, surname=Sanchez, time=300, penalty=2, endTime=302]
2º Runner [dorsal=5, name=Juan, surname=Fernandez, time=308, penalty=0, endTime=308]
3º Runner [dorsal=2, name=Daniel, surname=Jimenez, time=307, penalty=1, endTime=308]
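The earlier remark that the combiner is not invoked on a sequential stream can be checked with a small experiment. The sketch below is independent of the runner example: CombinerDemo and countingSum are hypothetical names, and the collector simply sums integers while counting how often its combiner runs.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CombinerDemo {

    // Sums integers in an int[1] container and counts combiner invocations.
    static Collector<Integer, int[], Integer> countingSum(AtomicInteger combinerCalls) {
        return Collector.of(
                () -> new int[1],                 // supplier: fresh container
                (acc, value) -> acc[0] += value,  // accumulator: add one element
                (left, right) -> {                // combiner: merge two containers
                    combinerCalls.incrementAndGet();
                    left[0] += right[0];
                    return left;
                },
                acc -> acc[0]);                   // finisher: extract the sum
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.rangeClosed(1, 1000)
                .boxed()
                .collect(Collectors.toList());

        AtomicInteger sequentialCalls = new AtomicInteger();
        int sequentialSum = numbers.stream().collect(countingSum(sequentialCalls));

        AtomicInteger parallelCalls = new AtomicInteger();
        int parallelSum = numbers.parallelStream().collect(countingSum(parallelCalls));

        System.out.println("sequential: sum=" + sequentialSum
                + ", combiner calls=" + sequentialCalls.get());
        System.out.println("parallel:   sum=" + parallelSum
                + ", combiner calls=" + parallelCalls.get());
    }
}
```

Both runs produce the same sum. The sequential run reports zero combiner calls, while the count for the parallel run depends on how the stream happens to be split.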

Conclusions

As you can see, the Collector interface makes it easy to group the data in a Stream according to our needs.

You can download the full example project here.
