Description of problem submitted to IBM:
Configuration:
We configured our integration servers to send logging data to ElasticSearch using Logstash. We used a basic configuration based on the following article:
https://www.ibm.com/docs/en/app-connect/11.0.0?topic=pmwm-configuring-integration-servers-send-logging-data-logstash-in-elk-stack
For the Logstash input protocol we used the Beats option.
Context:
In our message flow nodes we use the monitoring feature (the Monitoring tab in the node's Properties view) to log events to ElasticSearch. Our input messages are in XML format (as attached to this error report) and contain multiple elements with the same name (e.g. Bid_TimeSeries), which is perfectly valid according to the XML specification. In fact, that is how XML represents the items of an array.
Problem:
Based on our server configuration, event logs are transported to ElasticSearch (using Logstash). We are able to find the event in the index; however, the message does not contain all of the "repeated" items/elements (e.g. Bid_TimeSeries) defined in the original XML message. We assume that the XML messages from the message flow are transformed into JSON that is not valid according to the JSON specification and is therefore not logged correctly in ElasticSearch. Please note that the message in ElasticSearch does not contain all data elements (Bid_TimeSeries nr.1 and nr.2) of the original XML message. The JSON message, as found in ElasticSearch, is also attached to this error report.
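For illustration only, the following minimal Python sketch shows how a JSON object with repeated names can silently lose data when a consumer parses it. It is an assumption that the monitoring event is serialized with duplicate names; the sample document, the mRID fields, and the helper names are hypothetical, and Python's json module stands in for whatever parser Logstash or ElasticSearch actually uses (which sibling survives there may differ).

```python
import json

# Hypothetical serialization of an XML message in which the repeated
# Bid_TimeSeries elements became repeated names in one JSON object.
DUPLICATE_KEYS_JSON = """
{
  "Bid_MarketDocument": {
    "mRID": "doc-1",
    "Bid_TimeSeries": {"mRID": "ts-1"},
    "Bid_TimeSeries": {"mRID": "ts-2"}
  }
}
"""


def parse_event(text):
    """Parse the event the way a typical consumer would (plain json.loads)."""
    return json.loads(text)


def find_duplicate_names(text):
    """Return names that occur more than once within a single JSON object."""
    duplicates = []

    def hook(pairs):
        seen = set()
        for name, _value in pairs:
            if name in seen and name not in duplicates:
                duplicates.append(name)
            seen.add(name)
        return dict(pairs)

    json.loads(text, object_pairs_hook=hook)
    return duplicates


event = parse_event(DUPLICATE_KEYS_JSON)
# Python's json module keeps only the last value for a repeated name.
print(event["Bid_MarketDocument"]["Bid_TimeSeries"])  # -> {'mRID': 'ts-2'}
print(find_duplicate_names(DUPLICATE_KEYS_JSON))      # -> ['Bid_TimeSeries']
```

A check like find_duplicate_names could be run against the raw event text to confirm whether repeated names are really the cause before any conversion logic is written.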
„1. solution IBM“
We had a further discussion with the development team regarding the process of converting an XMLNSC tree to a JSON tree. When this is performed, what happens internally is that the XMLNSC tree is copied to a generic tree, and this generic tree is then serialized as JSON using XPath.
This will not construct a tree in which repeating structures/elements are given array types; instead we get a copy from XMLNSC to JSON in which the elements are presented as standard name and nameElement pairs.
Copying an XML array to a valid JSON array would require scanning through all the children at each tree level and repeatedly checking whether child elements have identical names, which would slow down the tree copy or the JSON writer. So the solution opted for was to require that a JSON array be explicitly marked as a JSON array, and it is up to the user to do that when they create the JSON message.
So you will have to write logic to convert your messages to valid JSON messages. Can you try the JSON approach mentioned in the earlier update to construct the elements as valid JSON array elements, and confirm whether it helps to get the expected result?
You can refer to the documentation at https://www.ibm.com/docs/en/app-connect/12.0?topic=domain-creating-json-message for converting the message.
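The sibling scan described in this reply can be sketched as follows. This is a rough Python illustration of the logic only, not the ESQL or Java that would run inside a message flow and not IBM's implementation; the sample document and the function name convert_with_sibling_scan are hypothetical, and attributes, namespaces, and mixed content are ignored.

```python
import json
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample message; only Bid_TimeSeries is named in the report.
SAMPLE_XML = """
<Bid_MarketDocument>
  <mRID>doc-1</mRID>
  <Bid_TimeSeries><mRID>ts-1</mRID></Bid_TimeSeries>
  <Bid_TimeSeries><mRID>ts-2</mRID></Bid_TimeSeries>
</Bid_MarketDocument>
"""


def convert_with_sibling_scan(element):
    """Convert an element to JSON-ready data, grouping repeated siblings."""
    children = list(element)
    if not children:
        return element.text
    # The extra pass over all siblings is the cost the reply refers to.
    counts = Counter(child.tag for child in children)
    result = {}
    for child in children:
        value = convert_with_sibling_scan(child)
        if counts[child.tag] > 1:
            result.setdefault(child.tag, []).append(value)  # JSON array
        else:
            result[child.tag] = value                       # plain member
    return result


root = ET.fromstring(SAMPLE_XML)
document = {root.tag: convert_with_sibling_scan(root)}
print(json.dumps(document, indent=2))
```

Note that with this approach a message containing only one Bid_TimeSeries would produce an object rather than a one-element array, so the shape of the JSON depends on the data.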
„Escalation on our part“
While I appreciate the detailed explanation of the technical process and the challenges involved, I must emphasize that our expectation is for IBM to provide a fully functional and efficient solution.
It is concerning to learn that the current conversion process does not effectively handle repeating XML structures or element arrays (which are very common in XML messaging), resulting in an incorrect JSON representation. The suggested workaround, which requires additional logic on our end to correctly handle repeating XML elements, shifts a significant burden onto us.
Given the critical nature of this functionality in our operations, I kindly ask you to revisit this issue and provide a solution that correctly transforms XML into valid JSON arrays without extra development work on our side. A solution that both maintains performance and ensures data integrity is essential.
„2. solution IBM“
From the update, we understand that you are looking for a solution from IBM that correctly transforms XML into valid JSON arrays without requiring additional development on your side.
As discussed with our development team and as explained in the earlier update, there are a lot of technical challenges involved.
Would you please raise an RFE to get this functionality added to the product?
https://www.ibm.com/support/pages/how-create-and-manage-enhancement-requests-ibm-rfe-community
Since there is no RFE open for this yet, I wonder whether we should close this case. Have you opened an RFE yet?
Idea Review:
Thank you for taking the time to lay out in detail the use case behind the suggestion to improve the way we send JSON monitoring messages to ELK when they have embedded XML content. We appreciate the desire for an "automated" way to convert the content of repeating XML elements into a JSON array, and we know that some of the non-functional concerns have already been shared with you when considering a future behaviour change in this area.
For the benefit of other readers: an implementation in this area would need the ability to "read ahead" to the next sibling or siblings, determine their names, and then use this to decide whether the parent item should be a standard object or an array. This would be incredibly costly for scenarios such as the one proposed. Our preference would therefore be to take from the user, ahead of time, some indication (potentially built upon metadata already at the flow developer's disposal, such as a JSON schema or XML schema) of which fields (identified by name) should be treated as objects and which as arrays. This could still be applied through configuration, without the need for a flow developer to drop into "code", but it would require a detailed user interaction design. Another option to meet the use case while avoiding the non-functional issues discussed could be to provide specific expressions navigating to repeating element arrays as part of the monitoring event definition.
Having noted these implementation concerns and potential designs for how this could be improved in future, we are moving the status of the suggestion to Future Consideration, and we are keen to monitor the idea for signs of support from the community.
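The design preferred in the idea review, where the user names the array fields ahead of time, might look roughly like the Python sketch below. It is an assumption about how such a configuration could behave, not an existing product feature; ARRAY_FIELDS, the sample message, and convert_with_hints are hypothetical names.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical configuration: element names to emit as JSON arrays.
# In the idea review this would come from an XML or JSON schema.
ARRAY_FIELDS = {"Bid_TimeSeries"}

# Hypothetical message with a single Bid_TimeSeries element.
SINGLE_ITEM_XML = """
<Bid_MarketDocument>
  <mRID>doc-1</mRID>
  <Bid_TimeSeries><mRID>ts-1</mRID></Bid_TimeSeries>
</Bid_MarketDocument>
"""


def convert_with_hints(element, array_fields):
    """Convert an element to JSON-ready data without looking at siblings."""
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = convert_with_hints(child, array_fields)
        if child.tag in array_fields:
            # Decision is made from the name alone, so no read-ahead is needed.
            result.setdefault(child.tag, []).append(value)
        else:
            result[child.tag] = value
    return result


root = ET.fromstring(SINGLE_ITEM_XML)
document = {root.tag: convert_with_hints(root, ARRAY_FIELDS)}
print(json.dumps(document, indent=2))
```

Because the decision depends only on the element name, a message with a single Bid_TimeSeries still yields a one-element array, which keeps the field type stable across events.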