regex - Querying a website with Perl LWP::Simple to Process Online Prices -

May 15, 2015

In my free time, I've been trying to improve my Perl abilities by working on a script that uses LWP::Simple to poll one specific website's product pages and check the prices of products (I'm somewhat of a Perl noob). The script also keeps a simple backlog of the last price seen for each item, since the prices change frequently.

I was wondering if there is a way to further automate the script so that I don't have to explicitly add each page's URL to the initial hash (i.e. keep an array of key terms and run a search query on Amazon to find the page or price). Is there any way to do this that doesn't involve me copying Amazon's search URL and parsing in my keywords? (I'm aware that processing HTML with regex is bad form; I used it since I only need one small piece of data.)

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::Simple;

    my %oldprice;
    my %nameurl = (
        "archer season 1" => "http://rads.stackoverflow.com/amzn/click/b00475b0g2",
        "code complete" => "http://rads.stackoverflow.com/amzn/click/0735619670",
        "intermediate perl" => "http://rads.stackoverflow.com/amzn/click/0596102062",
        "inglorious basterds (2-disc)" => "http://rads.stackoverflow.com/amzn/click/b002t9h2lk"
    );

    # Load the last seen prices ("name: price" per line) if a backlog exists.
    if (-e "backlog.txt") {
        open(my $log, '<', "backlog.txt") or die "Cannot read backlog.txt: $!";
        while (<$log>) {
            chomp;
            my @temp = split(/:\s/);
            $oldprice{$temp[0]} = $temp[1];
        }
        close($log);
    }

    print "\nChecking daily Amazon prices:\n";
    open(my $log, '>', "backlog.txt") or die "Cannot write backlog.txt: $!";
    foreach my $key (sort keys %nameurl) {
        my $content = get($nameurl{$key}) or die "Could not fetch $nameurl{$key}";
        # Take the first dollar amount on the page as the price.
        my ($price) = $content =~ m{\$(\d+\.\d+)} or die "No price found for $key";
        if (exists $oldprice{$key} && $oldprice{$key} != $price) {
            print "$key: \$$price (was $oldprice{$key})\n";
        }
        else {
            print "\n$key: $price\n";
        }
        print {$log} "$key: $price\n";
    }
    close($log);
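The two pieces of logic in the script, pulling a price out of the page text and comparing it with the backlog, can be tried without any network access. The following is only a minimal sketch: `extract_price` and `parse_backlog` are hypothetical helper names that are not in the original script, and the HTML snippet and backlog lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first dollar amount found in the text,
# or undef when there is none. Note the escaped dot in the pattern.
sub extract_price {
    my ($content) = @_;
    return $content =~ m{\$(\d+\.\d{2})} ? $1 : undef;
}

# Hypothetical helper: turn "name: price" backlog lines into a hash.
sub parse_backlog {
    my (@lines) = @_;
    my %old;
    for my $line (@lines) {
        chomp $line;
        my ($name, $price) = split /:\s/, $line, 2;
        $old{$name} = $price if defined $price;
    }
    return %old;
}

# Made-up sample data standing in for backlog.txt and a fetched page.
my %old   = parse_backlog("code complete: 31.49\n", "intermediate perl: 25.00\n");
my $price = extract_price('<b class="price">$29.99</b>');

if (defined $price && exists $old{"code complete"} && $old{"code complete"} != $price) {
    print "code complete: \$$price (was $old{'code complete'})\n";
}
```

Keeping these steps in small subs makes it easier to check the regex against saved page content before pointing the script at the live site.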

I made a simple script to demonstrate Amazon search automation. It takes the search URL for all departments and appends the escaped search term. The rest of the code is simple parsing with HTML::TreeBuilder. The structure of the HTML in question can be examined with the dump method (see the commented-out line).

    use strict;
    use warnings;

    use LWP::Simple;
    use URI::Escape;
    use HTML::TreeBuilder;
    use Try::Tiny;

    my $look_for = "archer season 1";

    # Search URL for all departments, with the escaped search term appended.
    my $url = "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3daps&field-keywords="
            . uri_escape($look_for);
    my $contents = get($url) or die "Could not fetch $url";

    my $html = HTML::TreeBuilder->new_from_content($contents);
    for my $item ($html->look_down(id => qr/result_\d+/)) {
        # $item->dump;      # find out structure of html
        # try {} yields undef when an element is missing from a result.
        my $title = try { $item->look_down(class => 'producttitle')->as_trimmed_text };
        my $price = try { $item->look_down(class => 'newprice')->find('span')->as_text };
        next unless defined $title && defined $price;

        print "$title\n$price\n\n";
    }
    $html->delete;
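To get from here to the "array of key terms" idea in the question, the same escaping step can be applied to each term to build the name-to-URL hash, rather than typing the URLs by hand. This is a rough sketch, not a tested scraper: `search_urls` is a hypothetical helper, it only builds the URLs and fetches nothing, and the base URL is copied from the script above, so it assumes Amazon still accepts that form.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape);

# Base taken from the search URL used above; Amazon may have changed it since.
my $search_base
    = "http://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3daps&field-keywords=";

# Hypothetical helper: map each search term to its search URL.
sub search_urls {
    my (@terms) = @_;
    return map { ($_ => $search_base . uri_escape($_)) } @terms;
}

my @terms   = ("archer season 1", "code complete", "inglorious basterds (2-disc)");
my %nameurl = search_urls(@terms);

print "$_ => $nameurl{$_}\n" for sort keys %nameurl;
```

Each URL in `%nameurl` would then be fetched and parsed as in the script above. A search page can list several results, so the first match is not guaranteed to be the intended product.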
