Working with CDATA in esql (WMB/IIB)
1. Embedding CData in an xml using esql
---------------------------------------------------
In a CData section, you can include XML markup in the value of an element.
However,
- non-valid XML characters cannot be included.
- Binary data also cannot be included in a CData section.
The most common use for CData is to embed one XML document within another.
You can even embed a badly-formed XML document in this way, because the XML parser does not attempt to parse the content of a CData section.
The following items are not valid within a CData section:
- Non-valid XML characters (see http://www.w3.org/TR/2006/REC-xml-20060816/#charsets)
- The text string ']]>' (because this terminates the CData section)
Sample input
/*
<Orders>
<Order>
<OrderDetails>
<OrderID>1234</OrderID>
<NumOfItems>2</NumOfItems>
<TotalAmount>3000</TotalAmount>
<OrderDate>07/13/2017</OrderDate>
</OrderDetails>
</Order>
</Orders>
*/
/* adding the input to Environment tree
CREATE FIELD Environment.Var;
SET Environment.Var = InputRoot.XMLNSC;
/** XML to embade under OrderDetails as ItemInformation type CData
<items>
<item>
<Name>4G Smart phone</Name>
<Brand>Motorola</Brand>
<Price>24999</Price>
<Color>black</Color>
</item>
<item>
<Name>Personal Laptop</Name>
<Brand>Lenovo</Brand>
<Price>26500</Price>
<Color>black</Color>
</item>
</items>
*/
SET OutputRoot.XMLNSC.Orders= Environment.Var.Orders;
DECLARE rOutRef REFERENCE TO OutputRoot.XMLNSC.Orders.Order;
DECLARE wholeMsgBlob BLOB ASBITSTREAM(InputRoot.XMLNSC, InputRoot.Properties.Encoding,
InputRoot.Properties.CodedCharSetId );
DECLARE wholeMsgChar CHAR CAST(wholeMsgBlob AS CHAR CCSID InputRoot.Properties.CodedCharSetId);
SET rOutRef.OrderDetails.(XMLNSC.CDataField)ItemInformation = wholeMsgChar;
Output:
<Orders>
<Order>
<OrderDetails>
<OrderID>1234</OrderID>
<NumOfItems>2</NumOfItems>
<TotalAmount>3000</TotalAmount>
<OrderDate>07/13/2017</OrderDate>
<ItemInformation><![CDATA[<items> <item> <Name>4G Smart phone</Name> <Brand>Motorola</Brand> <Price>24999</Price> <Color>black</Color> </item> <item> <Name>Personal Laptop</Name> <Brand>Lenovo</Brand> <Price>26500</Price> <Color>black</Color> </item> </items>]]></ItemInformation>
</OrderDetails>
</Order>
</Orders>
2. Extracting CData from an xml using esql
------------------------------------------------------
Input
-----
<Orders>
<Order>
<OrderDetails>
<OrderID>1234</OrderID>
<NumOfItems>2</NumOfItems>
<TotalAmount>3000</TotalAmount>
<OrderDate>07/13/2017</OrderDate>
<ItemInformation><![CDATA[<items> <item> <Name>4G Smart phone</Name> <Brand>Motorola</Brand> <Price>24999</Price> <Color>black</Color> </item> <item> <Name>Personal Laptop</Name> <Brand>Lenovo</Brand> <Price>26500</Price> <Color>black</Color> </item> </items>]]></ItemInformation>
</OrderDetails>
</Order>
</Orders>
DECLARE rOutRef REFERENCE TO OutputRoot.XMLNSC.Orders.Order.OrderDetails;
--Two ways to fetch the data-
--1.
--CREATE LASTCHILD OF Environment DOMAIN 'XMLNSC' PARSE(rOutRef.(XMLNSC.CDataField)ItemInformation,InputProperties.Encoding, InputProperties.CodedCharSetId);
--2.
CREATE LASTCHILD OF Environment DOMAIN 'XMLNSC' PARSE(rOutRef.ItemInformation,InputProperties.Encoding, InputProperties.CodedCharSetId);
SET rOutRef.ItemInformation = NULL;
SET rOutRef.ItemDetails = Environment.XMLNSC;
DELETE FIELD Environment.XMLNSC;
Output:
<Orders>
<Order>
<OrderDetails>
<OrderID>1234</OrderID>
<NumOfItems>2</NumOfItems>
<TotalAmount>3000</TotalAmount>
<OrderDate>07/13/2017</OrderDate>
<ItemDetails>
<items>
<item>
<Name>4G Smart phone</Name>
<Brand>Motorola</Brand>
<Price>24999</Price>
<Color>black</Color>
</item>
<item>
<Name>Personal Laptop</Name>
<Brand>Lenovo</Brand>
<Price>26500</Price>
<Color>black</Color>
</item>
</items>
</ItemDetails>
</OrderDetails>
</Order>
</Orders>
--------------
Worst case scenario:
if there are multiple CDATA tags appended ---
Input:
<Orders>
<Order>
<OrderDetails>
<OrderID>1234</OrderID>
<NumOfItems>2</NumOfItems>
<TotalAmount>3000</TotalAmount>
<OrderDate>07/13/2017</OrderDate>
<ItemInformation><![CDATA[<![CDATA[<items> <item> <Name>4G Smart phone</Name> <Brand>Motorola</Brand> <Price>24999</Price> <Color>black</Color> </item> <item> <Name>Personal Laptop</Name> <Brand>Lenovo</Brand> <Price>26500</Price> <Color>black</Color> </item> </items>]]>]]></ItemInformation>
</OrderDetails>
</Order>
</Orders>
DECLARE rOutRef REFERENCE TO OutputRoot.XMLNSC.Orders.Order.OrderDetails;
DECLARE CData CHARACTER CAST(rOutRef.ItemInformation AS CHARACTER);
DECLARE isCData BOOLEAN TRUE;
--/** one way
WHILE isCData DO
IF CONTAINS(CData,'<![CDATA[') THEN
SET CData = SUBSTRING(CData AFTER '<![CDATA[');
ELSEIF CONTAINS(CData,']]>') THEN
SET CData = SUBSTRING(CData BEFORE ']]>');
ELSE
SET isCData = FALSE;
END IF;
END WHILE;
--*/
--CREATE LASTCHILD OF Environment DOMAIN 'XMLNSC' PARSE(rOutRef.(XMLNSC.CDataField)ItemInformation,InputProperties.Encoding, InputProperties.CodedCharSetId);
--CREATE LASTCHILD OF Environment DOMAIN 'XMLNSC' PARSE(rOutRef.ItemInformation,InputProperties.Encoding, InputProperties.CodedCharSetId);
CREATE LASTCHILD OF Environment DOMAIN 'XMLNSC' PARSE(CData,InputProperties.Encoding, InputProperties.CodedCharSetId);
DECLARE cfieldType CHARACTER FIELDTYPE(rOutRef.ItemInformation);
Declare cDataType CHARACTER XMLNSC.CDataField;
SET rOutRef.ItemInformation = NULL;
SET rOutRef.ItemDetails = Environment.XMLNSC;
DELETE FIELD Environment.XMLNSC;
Output:
<Orders>
<Order>
<OrderDetails>
<OrderID>1234</OrderID>
<NumOfItems>2</NumOfItems>
<TotalAmount>3000</TotalAmount>
<OrderDate>07/13/2017</OrderDate>
<ItemDetails>
<items>
<item>
<Name>4G Smart phone</Name>
<Brand>Motorola</Brand>
<Price>24999</Price>
<Color>black</Color>
</item>
<item>
<Name>Personal Laptop</Name>
<Brand>Lenovo</Brand>
<Price>26500</Price>
<Color>black</Color>
</item>
</items>
</ItemDetails>
</OrderDetails>
</Order>
</Orders>