# Import XML files - advanced settings

Import XML files using some advanced settings in Productsup.

## What is an XML file?

XML is a versatile file format made up of nodes arranged in a tree structure at different depths. A node is identified by a key enclosed in angle brackets (< and >), for example:

<items>

Here is an example of a typical XML file:

<items>
    <product>
        <title>Yellow Shirt</title>
        <size>Medium</size>
        <price>20 EUR</price>
        <old_price>30 EUR</old_price>
    </product>
    <product>
        <title>Blue Shirt</title>
        <size>Large</size>
        <price>25 EUR</price>
    </product>
</items>

You can import your raw XML files via the Feed URL or the local upload data source options.

Because XML is so flexible, the platform must first parse the file to understand its structure. To make sure the platform parses your data correctly, you can specify XML settings in the data source setup.

## Root nodes on an XML File

A root node in this context is the desired starting point for importing your products. Productsup always scans for a root node automatically. If the parser doesn’t detect the right node, you need to insert it manually.

To add these settings, navigate to Content Options and XML Settings. You can then add the root node. For more information, see Import a file from a URL.

### An example of a ‘standard use case’ root node

In the standard use case, all child data belonging to the product comes under the root node.

In the following example, the root node is items. This would import everything that comes under the items node, creating one product for each product node.

<items>
    <product>
        <title>Yellow Shirt</title>
        <size>Medium</size>
        <price>20 EUR</price>
        <old_price>30 EUR</old_price>
    </product>
    <product>
        <title>Blue Shirt</title>
        <size>Large</size>
        <price>25 EUR</price>
    </product>
</items>
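The standard use case can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration of the outcome described above, not the Productsup parser; the function name rows_under_root is hypothetical.

```python
import xml.etree.ElementTree as ET

XML = """<items>
    <product>
        <title>Yellow Shirt</title>
        <size>Medium</size>
        <price>20 EUR</price>
        <old_price>30 EUR</old_price>
    </product>
    <product>
        <title>Blue Shirt</title>
        <size>Large</size>
        <price>25 EUR</price>
    </product>
</items>"""


def rows_under_root(xml_text, root_name):
    """Return one dict per direct child of the first node named root_name."""
    doc = ET.fromstring(xml_text)
    # The top-level element may itself be the requested root node.
    root = doc if doc.tag == root_name else doc.find(f".//{root_name}")
    if root is None:
        return []
    # Each child of the root becomes a row; its own children become columns.
    return [{field.tag: (field.text or "") for field in child} for child in root]


rows = rows_under_root(XML, "items")
for row in rows:
    print(row)
```

Each product node yields one row, and a field that is missing from a product (old_price for the blue shirt) is simply absent from that row.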

### An example of a ‘non-standard use case’ root node

Your XML file may contain nodes that do not directly correspond to products. The platform should skip such nodes.

<items>
    <name>Example name</name>
    <product>
        <title>Yellow Shirt</title>
        <size>Medium</size>
        <price>20 EUR</price>
        <old_price>30 EUR</old_price>
    </product>
    <product>
        <title>Blue Shirt</title>
        <size>Large</size>
        <price>25 EUR</price>
    </product>
</items>

If you set the root node as items here, you would also import the name node, which is unnecessary. You can explicitly set it so that the platform imports only the exact product node. To do this, you should set the root node as product!.

The exclamation point tells the parser to import only the exact root node you input.
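The difference between importing everything under items and importing only the exact product nodes can be illustrated with a short Python sketch. It uses xml.etree.ElementTree and only mimics the effect described above; it does not reproduce how the platform interprets product! internally.

```python
import xml.etree.ElementTree as ET

XML = """<items>
    <name>Example name</name>
    <product>
        <title>Yellow Shirt</title>
        <size>Medium</size>
    </product>
    <product>
        <title>Blue Shirt</title>
        <size>Large</size>
    </product>
</items>"""

doc = ET.fromstring(XML)

# Everything directly under <items>: the stray <name> node is included.
all_children = [child.tag for child in doc]

# Only the exact <product> nodes, comparable to setting the root node as product!.
products = [{field.tag: field.text for field in p} for p in doc.iter("product")]

print(all_children)
print(products)
```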

### An example of an ‘inheritance use case’ root node

Sometimes you may have an XML feed that has variations of the same product in it. In such a case, you may wish to assign common values to each variant.

<items>
    <product_list>
        <name>products UK</name>
        <title>Red Shirt</title>
        <price>20 GBP</price>
        <products lang="en">
            <product>
                <size>Medium</size>
            </product>
            <product>
                <size>Extra Large</size>
            </product>
        </products>
    </product_list>

    <product_list>
        <name>products US</name>
        <title>Blue Shirt</title>
        <price>30 USD</price>
        <products lang="en">
            <product>
                <size>Small</size>
            </product>
        </products>
        <products lang="es">
            <product>
                <size>Extra Small</size>
            </product>
        </products>
    </product_list>

</items>

In this example, you want to import two products (red shirt and blue shirt) in four (4) different sizes.

Here, you should enter the root node as items>product_list>products. The arrows inform the parser of the tree structure, going from least to most granular.

The resulting import data is similar to this:

| size | product_@attributes_lang | product_list_name | product_list_title | product_list_price |
| --- | --- | --- | --- | --- |
| Medium | en | products UK | Red Shirt | 20 GBP |
| Extra Large | en | products UK | Red Shirt | 20 GBP |
| Small | en | products US | Blue Shirt | 30 USD |
| Extra Small | es | products US | Blue Shirt | 30 USD |
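The inheritance behavior can be approximated in Python as follows. This is a minimal sketch under the assumption that every value directly under product_list is copied onto each variant; the function name flatten_with_inheritance is hypothetical, and the column names merely imitate the ones in the import data above.

```python
import xml.etree.ElementTree as ET

XML = """<items>
    <product_list>
        <name>products UK</name>
        <title>Red Shirt</title>
        <price>20 GBP</price>
        <products lang="en">
            <product><size>Medium</size></product>
            <product><size>Extra Large</size></product>
        </products>
    </product_list>
    <product_list>
        <name>products US</name>
        <title>Blue Shirt</title>
        <price>30 USD</price>
        <products lang="en">
            <product><size>Small</size></product>
        </products>
        <products lang="es">
            <product><size>Extra Small</size></product>
        </products>
    </product_list>
</items>"""


def flatten_with_inheritance(xml_text):
    """Return one row per <product>, enriched with its parents' values."""
    rows = []
    doc = ET.fromstring(xml_text)
    for product_list in doc.findall("product_list"):
        # Values shared by every variant inside this product_list.
        inherited = {
            f"product_list_{child.tag}": child.text
            for child in product_list
            if child.tag != "products"
        }
        for group in product_list.findall("products"):
            for product in group.findall("product"):
                row = {field.tag: field.text for field in product}
                # The lang attribute of the enclosing <products> tag.
                row["product_@attributes_lang"] = group.get("lang")
                row.update(inherited)
                rows.append(row)
    return rows


rows = flatten_with_inheritance(XML)
for row in rows:
    print(row)
```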

### Use tags in the root node

Nodes can sometimes have a value inside the node itself. These values are tag attributes.

The tag attribute in this example is en:

<products lang="en">

The platform imports tag attributes with an at-sign (@).

For example, if you want to import only the Spanish product from the inheritance example, set the root node as products lang=es.

In this case, the parser imports only those items that have this attribute inside the tag.

The resulting import data is similar to the following:

| size | product_@attributes_lang | product_list_name | product_list_title | product_list_price |
| --- | --- | --- | --- | --- |
| Extra Small | es | products US | Blue Shirt | 30 USD |
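Filtering by a tag attribute is comparable to an XPath attribute predicate. The following Python sketch shows the idea with xml.etree.ElementTree; the helper name sizes_for_language is hypothetical, and the sketch does not claim to match the platform's own root node syntax.

```python
import xml.etree.ElementTree as ET

XML = """<items>
    <product_list>
        <name>products US</name>
        <products lang="en">
            <product><size>Small</size></product>
        </products>
        <products lang="es">
            <product><size>Extra Small</size></product>
        </products>
    </product_list>
</items>"""


def sizes_for_language(xml_text, lang):
    """Return the sizes of products whose enclosing <products> tag has this lang."""
    doc = ET.fromstring(xml_text)
    # Attribute predicate: keep only <products> nodes with a matching lang value.
    groups = doc.findall(f".//products[@lang='{lang}']")
    return [p.findtext("size") for g in groups for p in g.findall("product")]


spanish_sizes = sizes_for_language(XML, "es")
print(spanish_sizes)
```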

### Use a sequence in the root node

You can import a certain instance of nodes by using a sequence.

For example, if you want to import only the US products from the inheritance example, set the root node as products #2. The parser searches for the second occurrence of the products node and imports only what it finds there.

The resulting import data is similar to the following:

| size | product_@attributes_lang | product_list_name | product_list_title | product_list_price |
| --- | --- | --- | --- | --- |
| Small | en | products US | Blue Shirt | 30 USD |
| Extra Small | es | products US | Blue Shirt | 30 USD |

## Max depth scanned for a root node

You can define how deep the parser should go when searching for the root node. The platform doesn't scan anything deeper than the maximum depth. Setting this value can reduce your processing time.

To add these settings, navigate to Content Options and XML Settings. You can then input the max depth in Max Depth.

### Note

An XML file always starts at a depth of 0.

In the following example, the comments show the depth of each node: the <products> node is at a depth of two, and the <product> nodes beneath it are at a depth of three.

<items>                                     // depth 0
    <product_list>                          // depth 1
        <name>products UK</name>            // depth 2
        <title>Red Shirt</title>
        <price>20 GBP</price>
        <products lang="en">
            <product>                       // depth 3
                <size>Medium</size>         // depth 4
            </product>
            <product>
                <size>Extra Large</size>
            </product>
        </products>
    </product_list>

...
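To see how a depth limit restricts what gets scanned, consider this Python sketch. It counts depth from 0 at the top-level node, as stated in the note, and stops descending once the limit is reached. The function name tags_by_depth is hypothetical and the code is an illustration only, not the platform's scanner.

```python
import xml.etree.ElementTree as ET

XML = """<items>
    <product_list>
        <name>products UK</name>
        <products lang="en">
            <product>
                <size>Medium</size>
            </product>
        </products>
    </product_list>
</items>"""


def tags_by_depth(element, max_depth, depth=0):
    """Yield (depth, tag) pairs, skipping everything deeper than max_depth."""
    if depth > max_depth:
        return
    yield depth, element.tag
    for child in element:
        yield from tags_by_depth(child, max_depth, depth + 1)


doc = ET.fromstring(XML)
shallow_scan = list(tags_by_depth(doc, max_depth=2))
full_scan = list(tags_by_depth(doc, max_depth=4))
print(shallow_scan)
print(full_scan)
```

With a lower limit, fewer nodes are visited, which is why a sensible max depth can shorten processing for large, deeply nested files.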

## Bundle repeating nodes into columns

If a node appears multiple times under a product, the platform imports them as separate columns. If you don't want it to, you can bundle repeating nodes into one column based on how many times they occur.

To add these settings, navigate to Content Options and XML Settings. You can then input the threshold in Bundle repeating nodes. You can also choose the delimiter in Bundle delimiter (the standard is a comma ,).

In this XML example, we have multiple color variations per size.

<items>
    <product>
        <title>T-Shirt</title>
        <size>Medium</size>
        <color>Yellow</color>
        <color>Red</color>
        <color>Blue</color>
        <color>Green</color>
    </product>
</items>

By default, the platform imports the feed similar to this:

| title | size | color_1 | color_2 | color_3 | color_4 |
| --- | --- | --- | --- | --- | --- |
| T-Shirt | Medium | Yellow | Red | Blue | Green |

If you enter 4 into Bundle repeating nodes and set : as the bundle delimiter, the platform bundles every attribute appearing at least four (4) times and separates them with a colon (:). It looks similar to this:

| title | size | color |
| --- | --- | --- |
| T-Shirt | Medium | Yellow:Red:Blue:Green |
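The bundling rule can be sketched in Python as follows. The function product_to_row and its parameters bundle_threshold and delimiter are hypothetical names chosen to mirror the two settings; the sketch assumes that a node repeated fewer times than the threshold is still split into numbered columns.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """<items>
    <product>
        <title>T-Shirt</title>
        <size>Medium</size>
        <color>Yellow</color>
        <color>Red</color>
        <color>Blue</color>
        <color>Green</color>
    </product>
</items>"""


def product_to_row(product, bundle_threshold=None, delimiter=","):
    """Turn one <product> element into a flat row of column values."""
    values = defaultdict(list)
    for field in product:
        values[field.tag].append(field.text or "")

    row = {}
    for tag, items in values.items():
        if len(items) == 1:
            row[tag] = items[0]
        elif bundle_threshold is not None and len(items) >= bundle_threshold:
            # Enough repetitions: bundle all values into a single column.
            row[tag] = delimiter.join(items)
        else:
            # Otherwise keep one numbered column per occurrence.
            for index, item in enumerate(items, start=1):
                row[f"{tag}_{index}"] = item
    return row


product = ET.fromstring(XML).find("product")
separate = product_to_row(product)
bundled = product_to_row(product, bundle_threshold=4, delimiter=":")
print(separate)
print(bundled)
```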

## XML declaration

The XML declaration is a processing instruction that identifies the document as XML. All XML documents should begin with an XML declaration, such as:

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>

If your declaration is missing, navigate to Content Options and XML Settings and enter the declaration in the Prepend Header Row field.
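Conceptually, prepending a missing declaration amounts to the following Python sketch. The function name ensure_declaration is hypothetical, and the sketch only illustrates the idea of adding the line when the file does not already start with one.

```python
import re

DECLARATION = '<?xml version="1.0" encoding="UTF-8" standalone="no" ?>'


def ensure_declaration(xml_text, declaration=DECLARATION):
    """Prepend an XML declaration unless the text already starts with one."""
    # A declaration, when present, must be the very first thing in the file.
    if re.match(r"<\?xml\s", xml_text):
        return xml_text
    return declaration + "\n" + xml_text.lstrip()


fixed = ensure_declaration("<items><product/></items>")
print(fixed)
```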

### Replacing false declarations

If your declaration is incorrect, navigate to Content Options and XML Settings and replace it in the Replace Xml Declaration field.

## Repair broken data in your XML

You can fix some issues with broken data directly in the Productsup platform. To do so, navigate to Content Options and XML Settings.

### Repair control characters

You may have broken UTF-8 control characters in your data source. If this is the case, you can tick Repair control characters to ensure the parser does not break because of this.
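As an illustration of why control characters break parsing, the following Python sketch strips them before parsing. It assumes that "broken control characters" refers to the control characters that XML 1.0 does not allow; how the platform actually repairs them is not documented here, and the function name strip_control_characters is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 forbids.
# Tab (\x09), line feed (\x0a) and carriage return (\x0d) are allowed and kept.
INVALID_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_control_characters(xml_text):
    """Remove control characters that would make an XML parser fail."""
    return INVALID_CONTROL_CHARS.sub("", xml_text)


broken = "<product><title>Yellow\x01 Shirt</title></product>"
cleaned = strip_control_characters(broken)
title = ET.fromstring(cleaned).findtext("title")
print(title)
```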

### Remove DTD

You may have a Document Type Declaration (DTD) in your feed. This is a line that normally comes directly after the XML declaration.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Note.dtd">

If this is the case, you can tick Remove DTD to ensure the parser does not break because of this.
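Removing a DTD line can be pictured as the following Python sketch. The regular expression is an assumption for illustration: it handles a simple DOCTYPE line and a basic internal subset in square brackets, but not every form the XML specification permits. The function name remove_dtd is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches <!DOCTYPE ...>, optionally with a simple [...] internal subset.
DOCTYPE = re.compile(r"<!DOCTYPE[^>\[]*(?:\[.*?\]\s*)?>\s*", re.DOTALL)


def remove_dtd(xml_text):
    """Drop the first Document Type Declaration from the text."""
    return DOCTYPE.sub("", xml_text, count=1)


feed = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE note SYSTEM "Note.dtd">\n'
    "<note>Hello</note>"
)
cleaned = remove_dtd(feed)
# Parse the cleaned feed as bytes so the encoding declaration is honored.
note_text = ET.fromstring(cleaned.encode("utf-8")).text
print(cleaned)
```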

### Repair parent node

You may have a parent node that is not closed properly or is incomplete. If this is the case, you can tick Repair parent node to ensure the parser does not break because of this.

### Allow empty node

You may have empty nodes that you still want to import as columns. If this is the case, you can tick Allow empty node. The platform then uses the node as the column name and leaves the value blank for the affected products.

<items>
    <product>
        <title>T-Shirt</title>
        <size>Medium</size>
        <color>Yellow</color>
        <placeholder></placeholder>
    </product>
</items>

Once you enable Allow empty node, the imported data for this example looks similar to this:

| title | size | color | placeholder |
| --- | --- | --- | --- |
| T-Shirt | Medium | Yellow | |
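The effect of the setting can be sketched in Python as a flag that decides whether empty nodes become blank columns or are dropped. The function product_to_row and its allow_empty parameter are hypothetical names, and the sketch assumes that empty nodes are skipped entirely when the option is off.

```python
import xml.etree.ElementTree as ET

XML = """<items>
    <product>
        <title>T-Shirt</title>
        <size>Medium</size>
        <color>Yellow</color>
        <placeholder></placeholder>
    </product>
</items>"""


def product_to_row(product, allow_empty=False):
    """Build a row, keeping empty nodes as blank columns only if allowed."""
    row = {}
    for field in product:
        text = (field.text or "").strip()
        if text or allow_empty:
            row[field.tag] = text
    return row


product = ET.fromstring(XML).find("product")
without_empty = product_to_row(product)
with_empty = product_to_row(product, allow_empty=True)
print(without_empty)
print(with_empty)
```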

## XSL Transformations

Sometimes you may need an XSLT (Extensible Stylesheet Language Transformations) to have the product data from your XML file represented in the way you wish.

Contact support@productsup.com to enquire about having an XSLT created for you.

For data sources imported via the Feed URL or the local upload data source options, you can also create an XSLT and add it yourself: