HtmlUnit: collecting all links by class name
I want to scrape / collect the links on a page that have a specific class name.
E.g., the HTML for the "agriculture (92)" entry:
<a href="http://www.specificurl/page.html" class="generate">agriculture</a>
I have been toying with the following pieces of code:
List<?> links = page.getByXPath("//div[@class='generate']/@href"); or List<?> links = page.getAnchors(); System.out.println(links);
The getByXPath option returns null, and the other option grabs all anchors. Is there a way to grab just these links as a list?
This is a terrible XPath that I'm having issues narrowing down (I can write a better XPath if necessary), but this one worked:
List<?> links = page.getByXPath("/html/body/div[2]/div[2]/table/tbody/tr/td/table/tbody/tr[7]/td/table/tbody/tr/td/div/table/tbody/tr[2]/td/div/table/tbody/tr/td/table/tbody/tr/td/ul/li/a/@href");
I'm not quite sure why it wouldn't allow me to grab the links by class name.
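One possible explanation, going only by the HTML snippet quoted above: the class "generate" sits on the a element itself, while the first expression looks for a div with that class, so it would match nothing. Below is a minimal sketch of that difference. It deliberately uses the JDK's built-in javax.xml.xpath on a small, well-formed stand-in document rather than HtmlUnit, so it runs without extra dependencies; the class name LinkXPathSketch, the helper hrefsFor, the SAMPLE markup, and the second URL (other.html) are all made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LinkXPathSketch {
    // Hypothetical, well-formed stand-in for the real page:
    // one link with class "generate" and one link without it.
    static final String SAMPLE =
        "<html><body><div><ul>"
        + "<li><a href=\"http://www.specificurl/page.html\" class=\"generate\">agriculture</a></li>"
        + "<li><a href=\"http://www.specificurl/other.html\">unrelated</a></li>"
        + "</ul></div></body></html>";

    // Evaluates an XPath expression that selects attribute nodes
    // and returns the attribute values as strings.
    static List<String> hrefsFor(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SAMPLE)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> hrefs = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                hrefs.add(nodes.item(i).getNodeValue());
            }
            return hrefs;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The class is on the <a>, not on a <div>, so this matches nothing.
        System.out.println(hrefsFor("//div[@class='generate']/@href"));
        // Selecting anchors by their own class attribute finds the link.
        System.out.println(hrefsFor("//a[@class='generate']/@href"));
    }
}
```

If this guess is right, passing "//a[@class='generate']/@href" to page.getByXPath should be worth a try in HtmlUnit as well. Note that an exact match on @class only works when "generate" is the element's sole class.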
Let me know how that works when you get a chance.