
How To Obtain Element Values From A KML By Using lxml

My problem is very similar to the one found here: How to pull data from KML/XML? The answer to the above question is to use Nokogiri to fix the format. I wonder if there is a way t

Solution 1:

One of the issues you're having is that when you write for x in y, you're iterating over all children of the current element, whatever their tag names are.

So when you do this:

for Folder in Document:
    ...

you're not just iterating over Folder elements; you're also iterating over name, open, Schema, Style, and StyleMap (ignoring the namespace for now).

You could still get what you want by testing the name attribute value and then printing the element's text...

for Document in root:
    for Folder in Document:
        for Placemark in Folder:
            for ExtendedData in Placemark:
                for SchemaData in ExtendedData:
                    for SimpleData in SchemaData:
                        if SimpleData.get("name") == "ID":
                            print(SimpleData.text)

but I would not recommend it.

Instead consider using XPath 1.0 with lxml's xpath() function.

This will allow you to directly target the elements you're interested in.

For this example I'm going to use the full path instead of the // abbreviated syntax. I'll also use a predicate to test the attribute value.

At first glance you would think that the XPath to all of the SimpleData elements with a name attribute value of "ID" would be:

/kml/Document/Folder/Placemark/ExtendedData/SchemaData/SimpleData[@name='ID']

but this is not the case. Notice the xmlns="http://www.opengis.net/kml/2.2" on the root (kml) element. This means that the root element and all of its descendant elements are in the default namespace http://www.opengis.net/kml/2.2 (unless declared otherwise on those elements).

To illustrate, if you added a print(f"In Folder element \"{Folder.tag}\"...") to your for Folder in Document loop, you'd see:

In Folder element "{http://www.opengis.net/kml/2.2}name"...
In Folder element "{http://www.opengis.net/kml/2.2}open"...
In Folder element "{http://www.opengis.net/kml/2.2}Schema"...
In Folder element "{http://www.opengis.net/kml/2.2}Style"...
In Folder element "{http://www.opengis.net/kml/2.2}StyleMap"...
In Folder element "{http://www.opengis.net/kml/2.2}Style"...
In Folder element "{http://www.opengis.net/kml/2.2}Folder"...
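If you do want to keep a loop, one option is to ask lxml for only the children with a given namespace-qualified tag. The following is a minimal sketch on a small, made-up KML fragment (the SAMPLE data and the variable names are assumptions, not the file from the question):

```python
from lxml import etree

KML_NS = "http://www.opengis.net/kml/2.2"

# Hypothetical sample, much smaller than the KML in the question.
SAMPLE = b"""<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <name>demo</name>
    <open>1</open>
    <Folder><name>layer</name></Folder>
  </Document>
</kml>"""

root = etree.fromstring(SAMPLE)
document = root[0]  # the Document element

# Plain iteration yields every child element, not only Folder.
all_children = [etree.QName(child).localname for child in document]

# Passing a tag in {namespace}name form limits iteration to matching children.
folders = list(document.iterchildren(f"{{{KML_NS}}}Folder"))

print(all_children)
print(len(folders))
```

etree.QName(child).localname strips the {namespace} prefix, which makes the tag names easier to read while debugging.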

There are a few ways to handle namespaces in lxml, but I prefer to declare them in a dictionary and pass them with the namespaces argument.

Here's a full example...

from lxml import etree

ns = {"kml": "http://www.opengis.net/kml/2.2"}

tree = etree.parse("test.kml")

# Full path to every SimpleData element whose name attribute is "ID".
xpath = (
    "/kml:kml/kml:Document/kml:Folder/kml:Placemark"
    "/kml:ExtendedData/kml:SchemaData/kml:SimpleData[@name='ID']"
)

for simple_data in tree.xpath(xpath, namespaces=ns):
    print(simple_data.text)

Print Output...

FM2
FM3
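For comparison, here is a sketch of the same query using the // abbreviated syntax mentioned earlier. It also shows that the unprefixed path matches nothing. The SAMPLE document is a made-up stand-in for test.kml, and the "Label" entry is hypothetical:

```python
from lxml import etree

ns = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical stand-in for test.kml; only the structure matters here.
SAMPLE = b"""<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Folder>
      <Placemark>
        <ExtendedData>
          <SchemaData>
            <SimpleData name="ID">FM2</SimpleData>
            <SimpleData name="Label">first</SimpleData>
          </SchemaData>
        </ExtendedData>
      </Placemark>
      <Placemark>
        <ExtendedData>
          <SchemaData>
            <SimpleData name="ID">FM3</SimpleData>
          </SchemaData>
        </ExtendedData>
      </Placemark>
    </Folder>
  </Document>
</kml>"""

root = etree.fromstring(SAMPLE)

# Without a prefix, XPath 1.0 looks for elements in no namespace: no matches.
unprefixed = root.xpath("//SimpleData[@name='ID']")

# With the prefix bound to the KML namespace, the elements are found.
ids = root.xpath("//kml:SimpleData[@name='ID']/text()", namespaces=ns)

print(unprefixed)
print(ids)
```

The // form is shorter, but it searches the whole tree, so the full path is the stricter choice when the same element name could appear in other places.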

Solution 2:

For some reason, I ran into problems with the XML validity of your kml_file, so I parsed it with lxml's more lenient HTML parser instead:

import lxml.html

# kml_file holds the KML document as a string
tree = lxml.html.fromstring(kml_file)
# match any element whose name attribute is "ID"
results = tree.xpath("//*[@name = 'ID']")

for i in results:
    if i.text:
        print(i.text)

I'm not sure this is what you're looking for, but the output is:

FM2
FM3
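If you would rather stay with the XML parser, another possibility is lxml's recovering parser, etree.XMLParser(recover=True). This is a different technique from the HTML parser used above, shown only as a sketch. The BROKEN input is a made-up example of a truncated file, and the actual problem with the original KML may be something else:

```python
from lxml import etree

# Hypothetical malformed input: the closing </Document> and </kml> are missing.
BROKEN = b"""<kml xmlns="http://www.opengis.net/kml/2.2"><Document>
<SimpleData name="ID">FM2</SimpleData>
<SimpleData name="ID">FM3</SimpleData>
"""

# recover=True asks the parser to keep going and build a tree despite errors.
parser = etree.XMLParser(recover=True)
root = etree.fromstring(BROKEN, parser)

# local-name() matches on the tag name alone, so no namespace map is needed.
matches = root.xpath("//*[local-name()='SimpleData'][@name='ID']")
ids = [element.text for element in matches]

print(ids)
```

Recovery is best effort, so it is worth checking the result against the source file before relying on it.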
