Red Star Development

Thursday, March 22, 2018

XB4J: the cadastre case

It must have been a late Friday afternoon at the end of August 2017; I do not remember the exact date. I sit next to Mando, the consultant we hired to strengthen our JAXB knowledge. I watch as he skillfully steps through the JAXB source code with his debugger. We’ve been investigating all afternoon how we can work around this bug. Several ideas have emerged, but none has convinced us that it is the right one. Actually, I no longer believe that JAXB is the right solution for this project. Not under the condition we have set: no code duplication, neither in the generated Java classes nor in the Java code that performs the transformations.

My faith is in free fall this week. At the beginning of November, the Cadastre will stop delivering their SOAP message ‘Kik-inzage version 4.7’. We have only two months remaining to adapt our software to their new version 5.1. Only two more months! We have been working on this task for the past two months and are making too little progress: we made a technological choice (JAXB rather than XSLT) and started the implementation. That’s it. And now we run into that blocking bug in JAXB. If we do not succeed by early November, the Municipal Social Services, our customers, will no longer receive from us the cadastral data they need to assess applications for social security benefits. It would damage our company’s reputation.

I sigh and get another round of coffee. When everyone has their drink, I return to my own desk. I mumble to Mando that I want to explore an idea. In reality, I want to continue with the proof of concept that I started a few days ago. It is for the Kik-inzage message, but with a different technique: XB4J. It’s a small, open source library that, like JAXB, is meant to convert XML to Java and vice versa. I developed it myself a few years ago. It specializes in exactly the problem where JAXB is hitting us with a nasty bug. I know it can be done with XB4J, but I also know that the library needs to be expanded so that it supports the complexity of the Kik-inzage message. Until now, I was not ready to devote my free time to this. But I see no more alternatives.

It does not take long before I am in a flow. In Eclipse I am editing the Java class where I tell XB4J how the Java representation links to the XML representation, the so-called ‘binding model’. We do not use much of the information that we receive from the Cadastre. With the Ignore binding I tell XB4J which parts to skip. For the proof of concept I use the Ignore binding for each XML entity, in order to have the message structure modeled quickly. After that, I can refine the model and, one by one, associate the XML entities we need with their Java representations. It is donkey work, and I do understand the charm of JAXB. For the three Kik-inzage operations we use, JAXB generates within three seconds the more than 1500 classes that it thinks it needs. It generates a lot of classes multiple times, because of the way in which the Cadastre uses XML namespaces. Due to the bug in JAXB, it is not possible to reliably reduce the duplicates in all constructions used by the Cadastre. As a result, data is lost during the unmarshalling of a message, without any error message.

I think that with XB4J we only need a fistful of Java classes to record the information we need from the Cadastre. XB4J does not generate them; as a developer you have to write them yourself. For the proof of concept, I want to write the NaturalPerson class and link it to the two XML representations that occur in the first operation of the Kik-inzage message. They also occur twice in each of the two other operations, but I reuse the same NaturalPerson class. We have to tell the binding model that these Java classes must also be linked to similar or identical XML elements in other XML namespaces. It would be nice if there were tooling that supported the creation of the binding model: a tool that would read the WSDL or XML schema and offer a GUI that allows you to visually link the XML elements with the Java classes and attributes. That would make working with XB4J a lot easier. But that tooling still has to be written.
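To make the idea of one Java class for several XML representations concrete, here is a minimal sketch using only the JDK. It is not the XB4J API: the NaturalPersonBinding class, the two namespaces and the firstName and lastName child elements are all assumptions made up for illustration. It only shows the principle that the same NaturalPerson class can be linked to equivalent elements in different namespaces, while everything without a binding is skipped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hand-written domain class: nothing in it refers to XML or to a namespace. */
final class NaturalPerson {
    final String firstName;
    final String lastName;

    NaturalPerson(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

/** Hypothetical mini binding (NOT the XB4J API): several element names, one Java class. */
final class NaturalPersonBinding {

    // Assumed namespaces and element names, invented for this sketch.
    static final Set<QName> PERSON_ELEMENTS = Set.of(
            new QName("urn:example:operation-a", "NaturalPerson"),
            new QName("urn:example:operation-b", "NaturalPerson"));

    /** Collects every bound person element, at any depth, in document order. */
    static List<NaturalPerson> read(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<NaturalPerson> people = new ArrayList<>();
            collect(root, people);
            return people;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read message", e);
        }
    }

    private static void collect(Element element, List<NaturalPerson> people) {
        QName name = new QName(element.getNamespaceURI(), element.getLocalName());
        if (PERSON_ELEMENTS.contains(name)) {
            people.add(new NaturalPerson(
                    childText(element, "firstName"),
                    childText(element, "lastName")));
            return;
        }
        // Elements without a binding are skipped, but their children are still searched.
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                collect((Element) child, people);
            }
        }
    }

    /** Text of the first child element with this local name, in any namespace; null if absent. */
    private static String childText(Element parent, String localName) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element && localName.equals(child.getLocalName())) {
                return child.getTextContent();
            }
        }
        return null;
    }
}
```

Supporting a further namespace variant in this sketch means adding one QName to PERSON_ELEMENTS; the NaturalPerson class and any code that depends on it stay untouched.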

The next Monday, on the last day of the sprint, I can present the proof of concept to the team. I have worked over the weekend to adjust XB4J, so the construction with unbounded choices in the Kik-inzage message is now supported. Through an automated test I can show that an XML message is converted to a Java representation and back to XML. No data is lost, as is the case with JAXB. My two fellow developers are critical, but enthusiastic. Mando also spent the weekend in a useful way and has found a workaround for the problem with JAXB. His approach intervenes in the unmarshalling process: it changes the namespace of problematic XML elements into the namespace variant that does not pose a problem, just before JAXB converts the elements to Java instances. It requires capturing in the code all combinations of namespace and element that cause problems.
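The kind of namespace rewriting described above could look roughly like the sketch below. It is not Mando’s actual code, which is not shown here; it is an assumption of how such a workaround might be built on the JDK’s StAX StreamReaderDelegate. The class name, the example namespaces and the elementNames helper are hypothetical. A reader like this could in principle be handed to JAXB through Unmarshaller.unmarshal(XMLStreamReader); the sketch leaves JAXB out so that it runs on the JDK alone.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

/** Reports a substitute namespace for explicitly listed element names (hypothetical sketch). */
final class NamespaceRewritingReader extends StreamReaderDelegate {

    // Key: the problematic element name; value: the namespace to report instead.
    private final Map<QName, String> replacements;

    NamespaceRewritingReader(XMLStreamReader parent, Map<QName, String> replacements) {
        super(parent);
        this.replacements = replacements;
    }

    @Override
    public String getNamespaceURI() {
        if (!isStartElement() && !isEndElement()) {
            return super.getNamespaceURI();
        }
        String original = super.getNamespaceURI();
        String replacement = replacements.get(new QName(original, super.getLocalName()));
        return replacement != null ? replacement : original;
    }

    @Override
    public QName getName() {
        QName original = super.getName();
        String replacement = replacements.get(original);
        return replacement != null
                ? new QName(replacement, original.getLocalPart(), original.getPrefix())
                : original;
    }

    /** Demo helper: the element names a consumer of the rewriting reader would see. */
    static List<QName> elementNames(String xml, Map<QName, String> replacements) {
        try {
            XMLStreamReader reader = new NamespaceRewritingReader(
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml)),
                    replacements);
            List<QName> names = new ArrayList<>();
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getName());
                }
            }
            return names;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read message", e);
        }
    }
}
```

The replacements map is where the maintenance burden sits: every problematic combination of namespace and element has to be listed by hand, and a combination that is missing passes through unchanged without any warning. A complete version would probably also have to keep the reported namespace declarations consistent, which this sketch does not attempt.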

In the afternoon we discuss the pros and cons of JAXB and XB4J with the whole team, and each of us states which technique they have the most confidence in for implementing the new Kik-inzage message. After a lively discussion, everyone agrees that XB4J is the best choice. The following arguments are decisive:

  1. XB4J decouples the Java representation from the XML schema, making re-use of the Java code (and everything depending on it, such as transformations) easier to realize. It therefore seems a more robust solution, because schema changes in future versions of the Kik-inzage message can be taken care of in the binding model.

  2. The team expects that XB4J has a much gentler learning curve than JAXB.

  3. The confidence in JAXB has fallen: who knows what other surprises are awaiting us?

We are going to work with XB4J in the coming sprints. My teammates write the binding models and the Java classes, while I implement their improvement proposals in XB4J and solutions for the problems they encounter. The product backlog is re-estimated and a ‘minimum viable product’ is defined based on information from end users. Writing the binding models is extra work compared to JAXB, but they can be implemented at a predictable pace. Word comes from the Cadastre that phasing out version 4.7 is postponed, at least until December; later on they state that they will communicate a final end date at the end of January 2018. It seems that we are not the only consumer of the message that has difficulty integrating version 5.1 into its systems. At the end of September, we estimated that we would have fully adjusted our software to the new message version between mid-January and mid-February 2018. An estimate that, as it turns out, needs no adjustment. This is well before Friday, April 13, the date that the Cadastre has chosen as the final end date for version 4.7.

Epilogue

The source code of XB4J is on GitHub. The project can be included as a dependency in your project via Maven Central. The experiences in the team were mostly positive. Developers who had never worked with XB4J quickly became productive. It was important that they could follow an example. The error message that XB4J gives when the binding model does not correctly describe the message structure is cryptic. Solving these types of mistakes takes time. Logging quickly makes clear what the problem is, but the error message should be more helpful. And writing a binding model remains a painstaking job. Following the application of XB4J in this project, I have set the following priorities for its further development:

  1. Add user documentation with examples.

  2. Work on tooling to relieve the developer when writing a binding model.

  3. Improve the error message when the bindings do not match the message structure.

  4. Implement all functionality possible according to the XML-schema specifications.

  5. Improve the API, e.g. by using builders or a DSL.

Let me know what your experiences are with XB4J.