Need to parse an XML file in Go? XML (en|de)coding is built into Go’s standard library. Basic parsing of nodes and nested nodes is straightforward, but there are some interesting things to think about when you need to parse attributes or lists, or when you don’t want to create new structs all over the place.
For this exercise, let’s look at the following XML document:
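A document with the shape described in the rest of this post might look like the sketch below. The root element name person and every value in it are assumptions for illustration; only the name, addresses, address, street, city, and type names come from the text. It is kept in a []byte called document, and wellFormed is a hypothetical helper that just confirms the sample parses.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// document is an assumed sample: one <name> node and an <addresses> node
// holding two <address> nodes, each with a type attribute.
var document = []byte(`<person>
  <name>Luann Van Houten</name>
  <addresses>
    <address type="secondary">
      <street>321 MadeUp Lane</street>
      <city>Shelbyville</city>
    </address>
    <address type="primary">
      <street>123 Fake St</street>
      <city>Springfield</city>
    </address>
  </addresses>
</person>`)

// wellFormed (hypothetical helper) decodes doc into an empty struct, which
// skips every child node but still reports XML syntax errors.
func wellFormed(doc []byte) error {
	return xml.Unmarshal(doc, new(struct{}))
}

func main() {
	fmt.Println(string(document))
	fmt.Println("well-formed:", wellFormed(document) == nil)
}
```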
Defining a Data Structure
The simplest node to parse is going to be <name>. We’ll need an object to unmarshal the XML into, which we’ll call Person. The XML decoder tries to populate every exported field on Person, matching nodes to fields by name. That match is case-sensitive, though, so a field called Name won’t pick up a <name> node on its own, and our struct field names and node names rarely correspond that simply anyway. Because of this, we’ll use xml struct tags to identify how to map each node to a Go field. Here’s an example for <name>:
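A minimal sketch of that mapping, assuming a root element called person (the decoder accepts any root name when the struct has no XMLName field). The parsePerson helper and the sample value are hypothetical, added only so the snippet runs on its own.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Person maps the <name> child node onto the Name field via its xml tag.
type Person struct {
	Name string `xml:"name"`
}

// parsePerson (hypothetical helper) unmarshals doc into a Person.
func parsePerson(doc []byte) (Person, error) {
	var p Person
	err := xml.Unmarshal(doc, &p)
	return p, err
}

func main() {
	p, err := parsePerson([]byte(`<person><name>Luann Van Houten</name></person>`))
	fmt.Println(p.Name, err)
}
```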
The next node, <addresses>, contains a list of <address> nodes. For an address node, we’ll have to create a similar struct with City and Street fields:
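One way the address struct could look, assuming the child nodes are named street and city. The parseAddress helper and the sample values are hypothetical. Note that the type attribute in the sample is simply ignored here, because no field claims it yet.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Address maps the <city> and <street> child nodes of a single <address>.
type Address struct {
	City   string `xml:"city"`
	Street string `xml:"street"`
}

// parseAddress (hypothetical helper) unmarshals one <address> node.
func parseAddress(doc []byte) (Address, error) {
	var a Address
	err := xml.Unmarshal(doc, &a)
	return a, err
}

func main() {
	a, err := parseAddress([]byte(
		`<address type="primary"><street>123 Fake St</street><city>Springfield</city></address>`))
	fmt.Println(a.Street, a.City, err)
}
```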
While parsing each Address, we also want to find the type of address, which can be found as an attribute on the <address> node. By adding the attr option to our XML struct tag (written after the name, as in type,attr), we can properly parse the type field as well:
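A sketch of the struct with the attribute added. The tag `xml:"type,attr"` tells the decoder to read Type from the attribute rather than from a child node; the parseAddress helper and the sample values are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Address reads City and Street from child nodes and Type from the
// type attribute on the <address> node itself.
type Address struct {
	City   string `xml:"city"`
	Street string `xml:"street"`
	Type   string `xml:"type,attr"`
}

// parseAddress (hypothetical helper) unmarshals one <address> node.
func parseAddress(doc []byte) (Address, error) {
	var a Address
	err := xml.Unmarshal(doc, &a)
	return a, err
}

func main() {
	a, err := parseAddress([]byte(
		`<address type="primary"><street>123 Fake St</street><city>Springfield</city></address>`))
	fmt.Println(a.Type, a.Street, a.City, err)
}
```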
A Person has a list of Addresses. Since we know that each <address> is a direct descendant of <addresses>, we can use the > separator in the struct tag to describe that path and collect every <address> under <addresses> into a slice of Address structs on our Person struct.
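A sketch of how the path tag might be used: `xml:"addresses>address"` walks into <addresses> and appends one Address per <address> node to the slice. The parsePerson helper, the person root name, and the sample values are assumptions for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Address describes a single <address> node.
type Address struct {
	City   string `xml:"city"`
	Street string `xml:"street"`
	Type   string `xml:"type,attr"`
}

// Person collects every <address> found under <addresses> into a slice.
type Person struct {
	Name      string    `xml:"name"`
	Addresses []Address `xml:"addresses>address"`
}

// parsePerson (hypothetical helper) unmarshals doc into a Person.
func parsePerson(doc []byte) (Person, error) {
	var p Person
	err := xml.Unmarshal(doc, &p)
	return p, err
}

func main() {
	p, err := parsePerson([]byte(`<person>
  <name>Luann Van Houten</name>
  <addresses>
    <address type="secondary"><street>321 MadeUp Lane</street><city>Shelbyville</city></address>
    <address type="primary"><street>123 Fake St</street><city>Springfield</city></address>
  </addresses>
</person>`))
	fmt.Println(p.Name, len(p.Addresses), err)
}
```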
This code will work to parse the original document, but do we really need to define a formal struct for addresses? If there were only one address, we could put all the fields directly on the Person struct. However, since we’re dealing with a list, our best option is to use an anonymous struct:
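The anonymous-struct variant could be sketched like this. The slice’s element type is declared inline, so no separate Address type is needed; the tag on the slice field still carries the addresses>address path. As before, parsePerson and the sample data are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Person declares the address shape inline as an anonymous struct.
type Person struct {
	Name      string `xml:"name"`
	Addresses []struct {
		Street string `xml:"street"`
		City   string `xml:"city"`
		Type   string `xml:"type,attr"`
	} `xml:"addresses>address"`
}

// parsePerson (hypothetical helper) unmarshals doc into a Person.
func parsePerson(doc []byte) (Person, error) {
	var p Person
	err := xml.Unmarshal(doc, &p)
	return p, err
}

func main() {
	p, err := parsePerson([]byte(`<person>
  <name>Luann Van Houten</name>
  <addresses>
    <address type="primary"><street>123 Fake St</street><city>Springfield</city></address>
  </addresses>
</person>`))
	fmt.Println(p, err)
}
```

The trade-off is that the address type has no name, so it can’t be reused in function signatures elsewhere. That is fine when the struct exists only to receive decoded data.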
Binding the Data Structure
We can use the encoding/xml package to decode the given XML. Given that our raw XML document is stored in a []byte called document, we’ll use xml.Unmarshal to bind our data structure to the XML document:
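A sketch of the binding step, assuming a shortened document and a hypothetical bind helper. xml.Unmarshal needs a pointer to the destination (&p), and it returns an error when the input is not well-formed XML.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// document is a shortened, assumed sample of the XML being parsed.
var document = []byte(`<person>
  <name>Luann Van Houten</name>
  <addresses>
    <address type="primary"><street>123 Fake St</street><city>Springfield</city></address>
  </addresses>
</person>`)

// Person is the data structure that the document is bound to.
type Person struct {
	Name      string `xml:"name"`
	Addresses []struct {
		Street string `xml:"street"`
		City   string `xml:"city"`
		Type   string `xml:"type,attr"`
	} `xml:"addresses>address"`
}

// bind (hypothetical helper) wraps the xml.Unmarshal call.
func bind(doc []byte) (Person, error) {
	var p Person
	// Unmarshal fills p in place, so it must receive a pointer.
	err := xml.Unmarshal(doc, &p)
	return p, err
}

func main() {
	p, err := bind(document)
	fmt.Println(p.Name, err)
}
```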
Final Code
Let’s put it all together, including a main function that will use fmt.Println to print the results of binding the data structure to the XML document. Unmarshal could return an error, but we’ll ignore it in this case.
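One way the pieces might fit together is sketched below. The sample document (root element name and all values) and the parse helper are assumptions for illustration. Unlike the description above, this sketch checks the error from Unmarshal, since doing so takes only a few lines.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// document is an assumed sample with the shape described in this post.
var document = []byte(`<person>
  <name>Luann Van Houten</name>
  <addresses>
    <address type="secondary">
      <street>321 MadeUp Lane</street>
      <city>Shelbyville</city>
    </address>
    <address type="primary">
      <street>123 Fake St</street>
      <city>Springfield</city>
    </address>
  </addresses>
</person>`)

// Person holds the name and an inline (anonymous) struct per address.
type Person struct {
	Name      string `xml:"name"`
	Addresses []struct {
		Street string `xml:"street"`
		City   string `xml:"city"`
		Type   string `xml:"type,attr"`
	} `xml:"addresses>address"`
}

// parse (hypothetical helper) binds doc to a Person.
func parse(doc []byte) (Person, error) {
	var p Person
	err := xml.Unmarshal(doc, &p)
	return p, err
}

func main() {
	p, err := parse(document)
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(p)
}
```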
I’ve posted this code as a gist in the Go Playground so you can see it in action.