Tracking data in complex Java code: A functional programming approach, part II

Warning!  This article includes the word “monad,” which has strange effects on programmers.  While I reference the idea, you do not need to “get” monads for the article to make sense.  It can be understood simply as a fancy use of functional programming.

In my last article, I described how our real-world application needed to extract warnings from deep inside a complex call tree.  I showed three different approaches to this problem and discussed their pros and cons.  I wasn’t completely satisfied with any of them but promised that I had found an acceptable solution.  That solution?  Functional programming and monads.

I’m not going to try to explain monads here.  I didn’t really “get” monads myself until I had used Scala’s Option type for several months.  I find that monads are best shown or used rather than told.  So let’s jump back to the ConversionResult<T> type from the last article. As described in the previous article, the ConversionResult<T> type is a container class which contains both the result of some conversion operation and a list of warnings which were generated during the conversion. This means, for instance, that if you have a function which should return a Widget and which can also produce warnings about features which we do not yet support on Widgets, the return type would be ConversionResult<Widget>. You can access the generated Widget by calling the get function on the returned value, and you get the warnings by calling the getWarnings function.

Rather than showing the full object definition, I feel it is more useful to show how the ConversionResult type is used in actual code. In this article, I break down the methods on ConversionResult by categories, show how they are used, and discuss why they are needed.

Creating ConversionResults and warnings

You create a new ConversionResult object, with no warnings, using

ConversionResult<String> title = new ConversionResult<>("Title");
//title now contains the string "Title" and an empty list of warnings

ConversionResult objects can also be created with warnings:

ConversionResult<String> warnTitle = new ConversionResult<>("Title", warning1, warning2, ...);
//warnTitle contains the string "Title" and a list containing warning1 and warning2

To add new warnings to an existing ConversionResult object, you use the addWarning or addWarnings functions:


title.addWarning(warning3);
title.addWarnings(warningsList);  //warningsList is of type List<String>
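
A minimal sketch of a container that would support the calls above. This is an illustration, not the actual class from our code base: the field names are hypothetical, warnings are assumed to be plain Strings (matching the List<String> above), and the warning texts are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical minimal container; the real ConversionResult is not shown in the article.
// Assumption: a warning is just a String.
final class ConversionResult<T> {
    private final T value;
    private final List<String> warnings = new ArrayList<>();

    // Varargs cover both the no-warning and the with-warnings constructor calls.
    ConversionResult(T value, String... warnings) {
        this.value = value;
        this.warnings.addAll(Arrays.asList(warnings));
    }

    T get() { return value; }

    // Read-only view, so callers cannot add warnings behind the container's back.
    List<String> getWarnings() { return Collections.unmodifiableList(warnings); }

    void addWarning(String warning) { warnings.add(warning); }

    void addWarnings(List<String> more) { warnings.addAll(more); }
}

class CreationDemo {
    public static void main(String[] args) {
        ConversionResult<String> title = new ConversionResult<>("Title");
        title.addWarning("font not supported");
        title.addWarnings(Arrays.asList("color dropped", "link removed"));
        System.out.println(title.get() + " " + title.getWarnings());
    }
}
```

Note that addWarning mutates the container in place, which fits the mutable style of the surrounding code; an immutable variant would return a new ConversionResult instead.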

Map and flatMap

Creating ConversionResults and adding warnings is the easy, boring part of the API.  The really interesting bits come when you need to transform or combine ConversionResults.  If you have a value wrapped up inside a ConversionResult and need to transform it into something else, you can use map:

ConversionResult<String> title = new ConversionResult<>("Title", warning1);
ConversionResult<Integer> length = title.map(_title -> _title.length());
//length now contains the length of title as its value (e.g. 5) and also contains warning1.

Note: we have used the Java lambda function syntax to include an inline anonymous function.

map is the right tool only if the function you pass to it can’t produce warnings. If that function can produce warnings, then it should return a ConversionResult; if you naively applied map in this case, you would get an object of type ConversionResult<ConversionResult<T>>. This is never what you want. Instead, you should use flatMap:


ConversionResult<String> title2 = new ConversionResult<>("Title", warning1);
//title2 is a ConversionResult with one warning
ConversionResult<Integer> length2 = title2.flatMap(_title -> new ConversionResult<>(_title.length(), warning2));
//length2 is a ConversionResult with a value of 5 and which contains both warning1 and warning2.
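
To see why map keeps the existing warnings while flatMap also collects the new ones, here is one way the two methods could be implemented. It is a sketch under assumptions, not our real code: warnings are taken to be Strings, and the private list-based constructor is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not the article's actual class. Assumption: warnings are Strings.
final class ConversionResult<T> {
    private final T value;
    private final List<String> warnings;

    ConversionResult(T value, String... warnings) {
        this(value, Arrays.asList(warnings));
    }

    // Hypothetical helper constructor used by map and flatMap.
    private ConversionResult(T value, List<String> warnings) {
        this.value = value;
        this.warnings = new ArrayList<>(warnings);
    }

    T get() { return value; }

    List<String> getWarnings() { return Collections.unmodifiableList(warnings); }

    // f cannot produce warnings, so the existing ones are simply carried along.
    <R> ConversionResult<R> map(Function<T, R> f) {
        return new ConversionResult<>(f.apply(value), warnings);
    }

    // f returns its own ConversionResult; its warnings are appended to ours.
    <R> ConversionResult<R> flatMap(Function<T, ConversionResult<R>> f) {
        ConversionResult<R> next = f.apply(value);
        List<String> all = new ArrayList<>(warnings);
        all.addAll(next.warnings);
        return new ConversionResult<>(next.value, all);
    }
}

class MapDemo {
    public static void main(String[] args) {
        ConversionResult<String> title = new ConversionResult<>("Title", "warning1");
        ConversionResult<Integer> length = title.map(t -> t.length());
        ConversionResult<Integer> length2 =
                title.flatMap(t -> new ConversionResult<>(t.length(), "warning2"));
        System.out.println(length.get() + " " + length.getWarnings());
        System.out.println(length2.get() + " " + length2.getWarnings());
    }
}
```

In this sketch both methods return a fresh object and leave the original untouched, so title still holds only warning1 afterwards.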

When you have to mutate data: forEach and forBoth

Because our pre-existing Java codebase was extremely mutable, we were also forced to create the in-place mutation function forEach:


ConversionResult<Composite> c = new ConversionResult<>(new Composite());
c.forEach(_c -> _c.mutate());
//_c.mutate() changes the Composite in place and returns a list of warnings
//After the call to forEach, c has mutated and also contains all the warnings returned by the mutate function
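
A minimal sketch of how forEach might work, assuming String warnings. The Composite class here is a hypothetical stand-in with a made-up mutate method and warning text; only the shape of the call mirrors the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch. Assumption: warnings are Strings.
final class ConversionResult<T> {
    private final T value;
    private final List<String> warnings;

    ConversionResult(T value, String... warnings) {
        this.value = value;
        this.warnings = new ArrayList<>(Arrays.asList(warnings));
    }

    T get() { return value; }

    List<String> getWarnings() { return Collections.unmodifiableList(warnings); }

    // Runs a mutating function on the contained value and keeps the
    // warnings that the function reports.
    void forEach(Function<T, List<String>> f) {
        warnings.addAll(f.apply(value));
    }
}

// Hypothetical mutable class standing in for the article's Composite.
class Composite {
    boolean normalized = false;

    List<String> mutate() {
        normalized = true;
        return Arrays.asList("dropped unsupported border style");
    }
}

class ForEachDemo {
    public static void main(String[] args) {
        ConversionResult<Composite> c = new ConversionResult<>(new Composite(), "warning1");
        c.forEach(_c -> _c.mutate());
        System.out.println(c.get().normalized + " " + c.getWarnings());
    }
}
```

Unlike map, this changes both the contained value and the warning list of the existing object, which is exactly why it suits code that was written around in-place mutation.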

It is also very common in our code to simultaneously unwrap two ConversionResults in order to combine them in place. For instance, before our refactoring, it was common to have code of the form:

Composite c = new Composite();
Piece p = new Piece();
c.piece = p;

Because creating the composite and the piece might both produce warnings, they both need to be replaced with ConversionResults. Because we are adding the piece into the composite, it makes the most sense to add the warnings from p onto c. If we did this all explicitly, it would take several tedious lines of code and require pulling the warnings out of p with a call to getWarnings. In the spirit of forEach and flatMap, we instead invented the forBoth function:


ConversionResult<Composite> c = new ConversionResult<>(new Composite(), warning1);
ConversionResult<Piece> p = new ConversionResult<>(new Piece(), warning2);
c.forBoth(p, (_c, _p) -> _c.piece = _p);
//the value contained in c has been composed with the value contained in p
//c now contains both warning1 and warning2
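
The behavior described above can be sketched as follows. Composite and Piece are hypothetical stand-ins, warnings are assumed to be Strings, and the body of forBoth is one plausible implementation rather than the one in our code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical sketch. Assumption: warnings are Strings.
final class ConversionResult<T> {
    private final T value;
    private final List<String> warnings;

    ConversionResult(T value, String... warnings) {
        this.value = value;
        this.warnings = new ArrayList<>(Arrays.asList(warnings));
    }

    T get() { return value; }

    List<String> getWarnings() { return Collections.unmodifiableList(warnings); }

    // Unwraps both values, lets f combine them in place, then moves the
    // other result's warnings onto this one.
    <A> void forBoth(ConversionResult<A> other, BiConsumer<T, A> f) {
        f.accept(value, other.value);
        warnings.addAll(other.warnings);
    }
}

// Hypothetical stand-ins for the article's classes.
class Piece { }

class Composite {
    Piece piece;
}

class ForBothDemo {
    public static void main(String[] args) {
        ConversionResult<Composite> c = new ConversionResult<>(new Composite(), "warning1");
        ConversionResult<Piece> p = new ConversionResult<>(new Piece(), "warning2");
        c.forBoth(p, (_c, _p) -> _c.piece = _p);
        System.out.println(c.getWarnings());
    }
}
```

In this sketch p keeps its own copy of warning2 after the call, so reporting the warnings of both c and p later would list warning2 twice.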

And that’s enough to do everything our code did before we decided to include messages.  We did, however, create a convenience function which we call mapBoth, which simplifies a common but complex pattern:


ConversionResult<Composite> c = new ConversionResult<>(new Composite(), warning1);
//Note that c.generatePiece returns an object of type ConversionResult<Piece>.  We will assume the result carries warning2.
ConversionResult<Param> a = new ConversionResult<>(new Param(), warning3);

ConversionResult<Piece> p1 = c.flatMap( _c -> a.flatMap( _a -> _c.generatePiece(_a) ) );
//p1 contains the result of the generatePiece function call, and also contains all three warnings
ConversionResult<Piece> p2 = c.mapBoth(a,  (_c, _a)  -> _c.generatePiece(_a) ) ;
//p2 is identical to p1, but doesn't need nested function calls
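
One way mapBoth could be defined is directly in terms of flatMap. The sketch below assumes String warnings and uses hypothetical Composite, Param, and Piece classes. Because the function passed in returns a ConversionResult itself, the sketch flattens on both levels so that the types line up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch. Assumption: warnings are Strings.
final class ConversionResult<T> {
    private final T value;
    private final List<String> warnings;

    ConversionResult(T value, String... warnings) {
        this(value, Arrays.asList(warnings));
    }

    private ConversionResult(T value, List<String> warnings) {
        this.value = value;
        this.warnings = new ArrayList<>(warnings);
    }

    T get() { return value; }

    List<String> getWarnings() { return Collections.unmodifiableList(warnings); }

    <R> ConversionResult<R> flatMap(Function<T, ConversionResult<R>> f) {
        ConversionResult<R> next = f.apply(value);
        List<String> all = new ArrayList<>(warnings);
        all.addAll(next.warnings);
        return new ConversionResult<>(next.value, all);
    }

    // Convenience wrapper around two nested flatMap calls.
    <A, R> ConversionResult<R> mapBoth(ConversionResult<A> other,
                                       BiFunction<T, A, ConversionResult<R>> f) {
        return flatMap(t -> other.flatMap(a -> f.apply(t, a)));
    }
}

// Hypothetical stand-ins for the article's classes.
class Param { }

class Piece {
    final Param source;
    Piece(Param source) { this.source = source; }
}

class Composite {
    ConversionResult<Piece> generatePiece(Param a) {
        return new ConversionResult<>(new Piece(a), "warning2");
    }
}

class MapBothDemo {
    public static void main(String[] args) {
        ConversionResult<Composite> c = new ConversionResult<>(new Composite(), "warning1");
        ConversionResult<Param> a = new ConversionResult<>(new Param(), "warning3");
        ConversionResult<Piece> p1 = c.flatMap(_c -> a.flatMap(_a -> _c.generatePiece(_a)));
        ConversionResult<Piece> p2 = c.mapBoth(a, (_c, _a) -> _c.generatePiece(_a));
        System.out.println(p1.getWarnings()); // [warning1, warning3, warning2]
        System.out.println(p2.getWarnings()); // [warning1, warning3, warning2]
    }
}
```

With this definition the warnings arrive in the order outer value, inner value, function result, which is why warning3 precedes warning2 in the output.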

Exercise for experts or people who really want to understand monads: Can you rewrite mapBoth as a Scala for-comprehension or in Haskell do-notation?

Access the components: You have been warned

Lastly, we do sometimes need to access the value and warnings stored inside a ConversionResult.

ConversionResult<String> val = new ConversionResult<>("string", warning1, warning2);
val.get().equals("string") //true
val.getWarnings() instanceof List //true; it is a list containing warning1 and warning2

These access functions should be used with great care. The get function should be used primarily to test the contained value. You can use get to modify the contained value only if no warnings can occur during the modification. The getWarnings function should be used only outside the conversion code when no more warnings are possible.
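
To illustrate why get deserves this care, the sketch below unwraps a value with get and passes it to a step that produces its own warning, then does the same through flatMap. Everything here is hypothetical: the shorten step, the warning texts, and the String-based warnings are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch. Assumption: warnings are Strings.
final class ConversionResult<T> {
    private final T value;
    private final List<String> warnings;

    ConversionResult(T value, String... warnings) {
        this(value, Arrays.asList(warnings));
    }

    private ConversionResult(T value, List<String> warnings) {
        this.value = value;
        this.warnings = new ArrayList<>(warnings);
    }

    T get() { return value; }

    List<String> getWarnings() { return Collections.unmodifiableList(warnings); }

    <R> ConversionResult<R> flatMap(Function<T, ConversionResult<R>> f) {
        ConversionResult<R> next = f.apply(value);
        List<String> all = new ArrayList<>(warnings);
        all.addAll(next.warnings);
        return new ConversionResult<>(next.value, all);
    }
}

class AccessDemo {
    // Hypothetical conversion step that produces a warning of its own.
    static ConversionResult<String> shorten(String s) {
        return new ConversionResult<>(s.substring(0, 3), "title truncated");
    }

    public static void main(String[] args) {
        ConversionResult<String> title = new ConversionResult<>("Title", "font replaced");

        // Risky: get() unwraps the value, so "font replaced" is left behind.
        ConversionResult<String> lossy = shorten(title.get());

        // Safer: flatMap carries the earlier warning into the new result.
        ConversionResult<String> kept = title.flatMap(AccessDemo::shorten);

        System.out.println(lossy.getWarnings()); // [title truncated]
        System.out.println(kept.getWarnings());  // [font replaced, title truncated]
    }
}
```

Both versions compile, which is the point: the type system cannot tell that the first one silently lost a warning.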

With the class ConversionResult defined with this functionality, I was finally comfortable refactoring our Java code to use it. As expected, this framework allowed the IDE to do most of the work in finding where message passing code was necessary. It was tedious but mechanical, and most of all, it was reliable. If our refactored code compiled, we had filled in most of the gaps. And the only parts of the code that needed to see the messages were the code that wrote them and the code that read them.

Gotchas

There were a couple of gotchas we discovered in our code base. The biggest was that, unfortunately, a ConversionResult<Object> is itself an Object, and so the compiler didn’t spot the places where we needed to change Object into ConversionResult<Object>. It’s debatable whether this is a flaw in our ConversionResult approach or a flaw in the Java type system, but it is a noteworthy exception to the otherwise stellar type safety ConversionResult gives us. We also had a similar problem with toString: there were places in the code where we called toString on a ConversionResult, rather than on the contained object. We “solved” this by overriding toString on ConversionResult to return the contained object’s toString and log a warning. However, this solution feels incomplete.
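
Both gotchas can be reproduced in a few lines. The sketch below is an assumption-laden illustration: the describe helper is hypothetical, warnings are Strings, and the toString override uses java.util.logging where our real code may log differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch. Assumption: warnings are Strings.
final class ConversionResult<T> {
    private static final Logger LOG = Logger.getLogger(ConversionResult.class.getName());

    private final T value;
    private final List<String> warnings;

    ConversionResult(T value, String... warnings) {
        this.value = value;
        this.warnings = new ArrayList<>(Arrays.asList(warnings));
    }

    T get() { return value; }

    List<String> getWarnings() { return Collections.unmodifiableList(warnings); }

    // Delegates to the contained value, but logs so the call site can be found.
    @Override
    public String toString() {
        LOG.warning("toString() called on a ConversionResult; was get() intended?");
        return String.valueOf(value);
    }
}

class GotchaDemo {
    // Pre-refactoring style helper that accepts any Object.
    static String describe(Object o) {
        return "converted: " + o; // string concatenation calls o.toString()
    }

    public static void main(String[] args) {
        ConversionResult<Object> wrapped = new ConversionResult<>("Title", "font replaced");

        // Compiles without complaint: a ConversionResult is also an Object.
        System.out.println(describe(wrapped));

        // What was actually intended.
        System.out.println(describe(wrapped.get()));
    }
}
```

The first call only produces sensible text because of the toString override, and the warning attached to wrapped never reaches the caller, which is why the workaround still feels incomplete.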

And it is still possible, even if far less likely, to drop warnings on the floor when we don’t intend to.

Conclusions

Overall, this has been a remarkable experience.  It is very satisfying to add warnings deep inside our conversion code, fix the return types until the code compiles, and then watch the warnings get reported to our tracking system.  I have learned that getting the compiler (and my IDE) to do much of the work for me has made what could have been a very tedious process a merely somewhat tedious process.  Reporting more warnings and making the parsing results go deeper has been both fast and satisfying.  This example shows just how much has been discovered in programming and language design, and it leaves me hopeful for many more improvements in the years to come.

2 Comments

  1. Thank you for the articles. I enjoyed reading them! 🙂

    Some feedback:
    – In article 1, section 2, you are using scala notation for the function parameters.
    – Since you didn’t provide the source for the full ConversionResults it would have been nice to see the function signatures for map, flatMap, forBoth and forEach. Reading function signatures is a vital part of functional programming in my opinion.

  2. Thanks for the feedback.

    I’ve fixed Article 1, section 2 to use Java notation for the parameters.

    As for the function signatures, well, the article was getting too long. But I can include the type declarations here.

    public class ConversionResult<T> {

    ...

        public <R> ConversionResult<R> map(Function<T, R> f) {
            ....
        }

        public <R> ConversionResult<R> flatMap(Function<T, ConversionResult<R>> f) {
            ....
        }

        public void forEach(Function<T, List<ConversionWarning>> f) {
            ....
        }

        public <A> void forBoth(ConversionResult<A> r, BiConsumer<T, A> f) {
            ....
        }
    ...
    }
