Xpath Does Not Work With This Site, Pls Verify
I am using Python with Selenium (PhantomJS webdriver) to parse websites, and I have a problem with it. I want to get the current song from this radio site: http://www.eskago.pl/radio/eska-wa
Solution 1:
The XPath you provided is very fragile, so it is no wonder you get a NoSuchElementException.
Instead, rely on the a tag's class name; the currently playing song is inside it:
<a class="playlist_small" href="http://www.eskago.pl/radio/eska-warszawa?noreload=yes">
<img style="width:41px;" src="http://t-eska.cdn.smcloud.net/common/l/Q/s/lQ2009158Xvbl.jpg/ru-0-ra-45,45-n-lQ2009158Xvbl_jessie_j_bang_bang.jpg" alt="">
<strong>Jessie J, Ariana Grande, Nicki Minaj</strong>
<span>Bang Bang</span>
</a>
Here's the sample code:
# <strong> holds the artist names; the sibling <span> holds the song title
element = driver.find_element_by_xpath('//a[@class="playlist_small"]/strong')
print(element.text)
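To check the class-based expression without starting a browser, here is a minimal sketch that runs it against a copy of the markup above using the standard library. It assumes the fragment is well-formed XML, so the unclosed img tag is left out; the helper name current_song is hypothetical, and a real page would normally need an HTML parser or the Selenium call shown above.

```python
import xml.etree.ElementTree as ET

# Copy of the markup shown above. The <img> tag is omitted because it is
# not closed in the original HTML and ElementTree only accepts well-formed XML.
SNIPPET = """
<a class="playlist_small" href="http://www.eskago.pl/radio/eska-warszawa?noreload=yes">
<strong>Jessie J, Ariana Grande, Nicki Minaj</strong>
<span>Bang Bang</span>
</a>
"""


def current_song(markup):
    """Return (artists, title) from the link with class "playlist_small".

    Hypothetical helper for illustration only.
    """
    # Wrap the fragment so there is a single root element to search under.
    root = ET.fromstring("<root>%s</root>" % markup)
    link = root.find(".//a[@class='playlist_small']")
    if link is None:
        raise LookupError("no a.playlist_small element found")
    return link.find("strong").text, link.find("span").text


artists, title = current_song(SNIPPET)
print("%s - %s" % (artists, title))
```

The strong element carries the artists and the span carries the title, so both are read from the same anchor element rather than through two separate absolute paths.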
Another way to retrieve the currently playing song is to mimic the JSONP request the website makes for the playlist:
>>> import requests
>>> import json
>>> import re
>>> response = requests.get('http://static.eska.pl/m/playlist/channel-999.jsonp')
>>> json_data = re.match(r'jsonp\((.*?)\);', response.text).group(1)
>>> songs = json.loads(json_data)
>>> current_song = songs[0]
>>> [artist['name'] for artist in current_song['artists']]
[u'David Guetta', u'Showtek', u'Vassy']
>>> current_song['name']
u'Bad'
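The JSONP unwrapping step can also be written as a small reusable function. This is only a sketch: parse_jsonp is a hypothetical helper, the sample payload is made up to match the shape of the output above, and the real response may contain more fields or a different callback name.

```python
import json
import re


def parse_jsonp(payload):
    """Strip a callback wrapper such as jsonp(...); and decode the JSON inside.

    Hypothetical helper; assumes a single callback call wraps the whole payload.
    """
    # Greedy (.*) with DOTALL keeps everything up to the last closing paren,
    # so parentheses or newlines inside the JSON data do not cut it short.
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', payload, re.DOTALL)
    if match is None:
        raise ValueError("payload does not look like JSONP")
    return json.loads(match.group(1))


# Made-up sample shaped like the playlist entries shown above.
sample = ('jsonp([{"name": "Bad", "artists": '
          '[{"name": "David Guetta"}, {"name": "Showtek"}, {"name": "Vassy"}]}]);')

songs = parse_jsonp(sample)
current = songs[0]
names = [artist["name"] for artist in current["artists"]]
print("%s - %s" % (", ".join(names), current["name"]))
```

With a live response, the text of the HTTP response would be passed in place of sample; raising ValueError on a mismatch makes a changed response format easier to notice than an AttributeError on None.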
Solution 2:
As alecxe mentioned, that xpath is going to break if there are any changes in the structure of the page.
A much simpler xpath expression that will work is this: //li[2]/a[2]
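To illustrate the point about structural changes, the sketch below compares a positional expression with a class-based one on a made-up list. The markup is hypothetical and is not the real page; it only shows how each kind of expression reacts when an extra list item is added.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration; the real site's structure may differ.
PAGE = """
<ul>
  <li><a href="#">Home</a><a href="#">Radio</a></li>
  <li><a href="#">Logo</a><a class="playlist_small" href="#"><strong>Artist</strong><span>Title</span></a></li>
</ul>
"""

POSITIONAL = ".//li[2]/a[2]"
BY_CLASS = ".//a[@class='playlist_small']"


def find_title(markup, path):
    """Return the text of the <span> under the element matched by path, or None."""
    root = ET.fromstring("<root>%s</root>" % markup)
    node = root.find(path)
    span = node.find("span") if node is not None else None
    return span.text if span is not None else None


# Simulate a layout change: a new <li> is inserted at the top of the list.
CHANGED = PAGE.replace("<ul>", "<ul><li><a href='#'>Ad</a></li>", 1)

for label, markup in (("original", PAGE), ("changed", CHANGED)):
    print(label, find_title(markup, POSITIONAL), find_title(markup, BY_CLASS))
```

On the unchanged markup both expressions reach the same element; after the extra item is inserted, the positional one points at a different link while the class-based one still matches.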