Sitecore PXM Transformation
PXM (Print Experience Manager) is a platform built by Sitecore to bridge Adobe InDesign and Sitecore. PXM can create complex, personalized PDFs on the fly from structured data in Sitecore CMS, and it uses the same rules engine as Sitecore CMS to automate the generation process. Over the past few months, we have been building a solution that lets users connect to a Sitecore instance from Adobe InDesign and generate InDesign pages from Sitecore fields. In this article, we discuss how transformation definitions are used to generate XML.
Sitecore PXM comes with a set of transformation definitions and transformation attributes. These transformations make the HTML-to-XML conversion extensible and configurable for developers.
The InDesign connector lets users connect to their Sitecore instances and create content from Sitecore fields by dragging fields onto a page.
When you drag a rich text field of an item from the Sitecore content tree in the InDesign Sitecore plugin onto an InDesign page, the code tries to find a matching definition in the transformations folder to process each HTML tag. A transformation definition lets you add a custom parser for a specific HTML tag. It also lets you set rules for transforming HTML attributes into XML attributes. For example, suppose you want PXM to convert:
<p class="some class">Some paragraph</p>
to:
<ParagraphStyle Style="MyStyle"><![CDATA[Some paragraph]]></ParagraphStyle>
You can do this without writing any code.
How to Make Your Own Transformation Set
You can also make your own custom transformation set by inserting a new transformation set and replacing the default one. (Alternatively, keep the default and add your transformations to the Default TransformationSet.)
Go to “/sitecore/Print Studio/Modules/InDesign connector/Other Settings/Default settings/Default settings” and set Default transformation to your transformation set. Make sure Type points to your parser code. If you use custom XML attributes, make sure PublishingEngine.xsd contains their definitions.
How to Make a Transformation Definition
- Create transformation definition
Create a TransformationElement under the Definitions folder. The folder path in Sitecore should be: “/sitecore/Print Studio/Modules/InDesign connector/Transformations/Definitions”.
In the Parser section, specify the parser class and DLL. You can use the PXM default HTML parser, “Sitecore.PrintStudio.PublishingEngine.Text.Parsers.Html.HtmlNodeParser, Sitecore.PrintStudio.PublishingEngine”.
In the Element Names section, specify the target XML element in “XML Name” (make sure the element is defined in PublishingEngine.xsd) and the HTML element you want to convert in “XHTML Name”.
- Create transformation attributes
Refer to PublishingEngine.xsd for the required attributes and create transformation attributes under the transformation element you just created. Four fields in the Attributes section drive the behavior of the transformation:
- XML Attribute: The name of the XML attribute.
- XML Default Value: The value of the XML attribute. If left blank, the HTML value is carried over.
- XHTML Attribute: The name of the HTML attribute you want to convert.
- XHTML Default Value: The value of the HTML attribute. If left blank, this field value is an empty string.
- Add the transformation definition to your new transformation set, which is located in the same folder.
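To make the fields above concrete, here is a minimal sketch of a transformation definition and one transformation attribute written out as plain data. In PXM these are Sitecore items, not code; the Python class names `TransformationAttribute` and `TransformationElement` are hypothetical and used only for illustration, and the field values are an assumption about how the paragraph example from the introduction could be configured.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TransformationAttribute:
    """Illustrative mirror of the four fields in the Attributes section."""
    xml_attribute: str             # "XML Attribute"
    xml_default_value: str = ""    # "XML Default Value"
    xhtml_attribute: str = ""      # "XHTML Attribute"
    xhtml_default_value: str = ""  # "XHTML Default Value"


@dataclass
class TransformationElement:
    """Illustrative mirror of a transformation definition item."""
    xml_name: str     # "XML Name", must exist in PublishingEngine.xsd
    xhtml_name: str   # "XHTML Name", the HTML tag to convert
    parser_type: str  # parser class and assembly from the Parser section
    attributes: List[TransformationAttribute] = field(default_factory=list)


# Assumed configuration for converting <p class="some class"> into
# <ParagraphStyle Style="MyStyle">.
paragraph = TransformationElement(
    xml_name="ParagraphStyle",
    xhtml_name="p",
    parser_type=(
        "Sitecore.PrintStudio.PublishingEngine.Text.Parsers.Html.HtmlNodeParser, "
        "Sitecore.PrintStudio.PublishingEngine"
    ),
    attributes=[
        TransformationAttribute(
            xml_attribute="Style",
            xml_default_value="MyStyle",
            xhtml_attribute="class",
            xhtml_default_value="some class",
        )
    ],
)
```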
When a field is pulled, the InDesign connector service calls HtmlNodeParser.ParseChildNode() recursively to parse the HTML elements. This method loads all parse definitions from the transformations folder and uses the current HTML node name as a key to find the transformation definition with a matching HTML tag name. If one is found, its transformation attributes are used to generate XML attributes according to their settings.
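The recursive lookup can be pictured with the following sketch. It is not the PXM implementation (which is .NET code inside HtmlNodeParser); it only illustrates the idea of using the node name as a key into a set of definitions. The `DEFINITIONS` mapping and the function name `parse_child_node` are hypothetical, and keeping the original tag name when no definition matches is an assumption of this sketch, not documented PXM behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the definitions folder: XHTML Name -> XML Name.
DEFINITIONS = {"p": "ParagraphStyle"}


def parse_child_node(html_node, definitions):
    """Recursively convert an XHTML node using the matching definition."""
    # Use the current node name as the lookup key. Falling back to the
    # original tag name is an assumption made for this sketch only.
    xml_name = definitions.get(html_node.tag, html_node.tag)
    xml_node = ET.Element(xml_name)
    xml_node.text = html_node.text
    for child in html_node:
        converted = parse_child_node(child, definitions)
        converted.tail = child.tail  # keep text that follows the child
        xml_node.append(converted)
    return xml_node


source = ET.fromstring("<div><p>First</p><p>Second <b>bold</b></p></div>")
result = parse_child_node(source, DEFINITIONS)
print(ET.tostring(result, encoding="unicode"))
```

Attribute handling is left out here on purpose so that the lookup and recursion stay easy to follow.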
If both the XML attribute and the XML default value are given, a transformation attribute works as follows:
During parsing, the parser looks for the configured HTML attribute on the element being parsed. If the HTML attribute is not present, or it is present and its value matches the XHTML default value, the XML attribute is written to the generated XML with the XML default value. If the HTML attribute is present but its value does not match, the HTML attribute value from the parsed text is carried over to the corresponding XML attribute.
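For example, assume the XML attribute name is “Style” and the XML default value is “xml”, while the HTML attribute is “class” and the XHTML default value is “html”. An element with no class attribute, or with class="html", produces Style="xml". An element with any other class value, such as class="other", produces Style="other".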
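The rule can also be written as a small function. This is a sketch of the behavior as described in this article, for the case where both the XML attribute and the XML default value are set; the function name `resolve_xml_attribute` and its parameters are hypothetical and do not correspond to the PXM API.

```python
def resolve_xml_attribute(html_attrs, xml_attribute, xml_default,
                          xhtml_attribute, xhtml_default=""):
    """Return the (name, value) pair to write to the generated XML."""
    html_value = html_attrs.get(xhtml_attribute)
    # Missing HTML attribute, or a value equal to the XHTML default:
    # fall back to the XML default value.
    if html_value is None or html_value == xhtml_default:
        return xml_attribute, xml_default
    # Otherwise the HTML value is carried over to the XML attribute.
    return xml_attribute, html_value


settings = dict(xml_attribute="Style", xml_default="xml",
                xhtml_attribute="class", xhtml_default="html")

no_attribute = resolve_xml_attribute({}, **settings)
matching = resolve_xml_attribute({"class": "html"}, **settings)
different = resolve_xml_attribute({"class": "other"}, **settings)
```

With these settings, the first two calls fall back to the XML default value and the third carries the HTML value over.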
PXM transformations reduce the amount of custom code you need: Sitecore content can be transformed into InDesign pages through configuration alone. With transformation definitions, you can create complex custom parsing rules to generate InDesign XML from HTML in Sitecore.