python 2.7 - Trying to extract from a deep node with Scrapy, results are bad
As a beginner I'm having a hard time, so I'm here to ask for help. I'm trying to extract prices from an HTML page where they are nested deeply:
Second price location:
from scrapy.spider import Spider
from scrapy.selector import Selector
from mymarket.items import MymarketItem


class MySpider(Spider):
    name = "mymarket"
    allowed_domains = ["url"]
    start_urls = [
        "http://url"
    ]

    def parse(self, response):
        sel = Selector(response)
        # Every tr below the product table
        titles = sel.xpath('//table[@class="tab_product_list"]//tr')
        items = []
        for t in titles:
            item = MymarketItem()
            item["price"] = t.xpath('//tr//span[2]/text()').extract()
            items.append(item)
        return items
I'm trying to export the scraped prices to CSV. The export is being populated like this:
and I want them sorted in the .csv like this:
etc.
Can anyone point out the faulty part of the XPath, or how I can make the prices sorted "properly"?
It's difficult to say what's wrong with the path. Install the FirePath extension for Firefox to test your XPath queries. One note for now:
titles = sel.xpath('//table[@class="tab_product_list"]//tr')
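In the screenshot you have nested tables, and //tr will give you the tr elements of the nested tables too. Also, an XPath that starts with // inside the loop searches the whole document rather than the current row, so every item ends up with all the prices. Use a direct-child step for the rows and a relative path starting with .// for the price: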
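def parse(self, response):
    sel = Selector(response)
    # Direct-child rows only, so rows of nested tables are skipped
    titles = sel.xpath('//table[@class="tab_product_list"]/tr')  # or /tbody/tr if the source HTML has a tbody
    items = []
    for t in titles:
        item = MymarketItem()
        # ".//" keeps the query relative to the current row
        item["price"] = t.xpath('.//span[@style="color:red;"]/text()').extract()[0]
        items.append(item)
    return items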
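To see the difference between the two row queries without running a crawl, here is a minimal sketch using only the standard library. It uses xml.etree.ElementTree and its limited XPath subset as a stand-in for Scrapy selectors, and the markup is a made-up example, not the real page from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: an outer product table whose rows each hold a nested table.
HTML = """
<html><body>
<table class="tab_product_list">
  <tr><td>
    <table><tr><td><span>old</span><span style="color:red;">10.00</span></td></tr></table>
  </td></tr>
  <tr><td>
    <table><tr><td><span>old</span><span style="color:red;">20.00</span></td></tr></table>
  </td></tr>
</table>
</body></html>
"""

root = ET.fromstring(HTML)
table = root.find(".//table[@class='tab_product_list']")

# Descendant search: also returns the rows of the nested tables (4 rows here).
all_rows = table.findall(".//tr")

# Direct-child search: only the outer product rows (2 rows here).
product_rows = table.findall("tr")

# Query relative to each row so that every row yields only its own price.
prices = [row.find(".//span[@style='color:red;']").text for row in product_rows]

print(len(all_rows), len(product_rows), prices)
```

The same idea carries over to the spider: pick the rows with a direct-child step, then query each row with a path that starts with .// so it stays inside that row. Note that in the Scrapy version, extract()[0] raises an IndexError for any row that has no matching span, such as a header row, so you may want to check that the list is non-empty first.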