Skip to content Skip to sidebar Skip to footer

Xpath Does Not Work With This Site, Pls Verify

I using Python with selenium (PhantomJS webdriver) to parse websites and i have problem with it. I want to get current song from this radio site: http://www.eskago.pl/radio/eska-wa

Solution 1:

The xpath you provided is a very fragile one, now wonder you get a NoSuchElementException exception.

Instead, rely on the a tag's class name, there is a current playing song inside:

<a class="playlist_small" href="http://www.eskago.pl/radio/eska-warszawa?noreload=yes">
    <img style="width:41px;" src="http://t-eska.cdn.smcloud.net/common/l/Q/s/lQ2009158Xvbl.jpg/ru-0-ra-45,45-n-lQ2009158Xvbl_jessie_j_bang_bang.jpg" alt="">
    <strong>Jessie J, Ariana Grande, Nicki Minaj</strong>
    <span>Bang Bang</span>
</a>

Here's the sample code:

element = driver.find_element_by_xpath('//a[@class="playlist_small"]/strong')
print element.text

Well, another way to retrieve the current playing song - is to mimic the JSONP response the website is making for the playlist:

>>> import requests
>>> import json
>>> import re
>>> response = requests.get('http://static.eska.pl/m/playlist/channel-999.jsonp')
>>> json_data = re.match('jsonp\((.*?)\);', response.content).group(1)
>>> songs = json.loads(json_data)
>>> current_song = songs[0]
>>> [artist['name'] for artist in current_song['artists']]
[u'David Guetta', u'Showtek', u'Vassy']
>>> current_song['name']
u'Bad'

Solution 2:

As alecxe mentioned, that xpath is going to break if there are any changes in the structure of the page.

A much simpler xpath expression that will work is this: //li[2]/a[2]


Post a Comment for "Xpath Does Not Work With This Site, Pls Verify"