sybperl-l Archive

From: "David LANDGREN" <dlandgre at bpinet dot com>
Subject: Re: Reading Web Pages
Date: Jun 1 2001 12:46PM

|I'm wanting to read some Web page content and stuff it into a server
|prices/yields etc) for longer term analysis.
|Does anyone have any code samples doing something similar?

Are you getting stuck on something specifically Sybase, or on something
upstream? If the latter, the following should get you started.

use LWP::UserAgent;
use HTTP::Request;

my $url = '';   # set this to the address of the page you want to fetch
my $ua  = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => $url);
my $res = $ua->request($req);

die "Error: $url: @{[$res->status_line]}\n" unless $res->is_success;

From here, the contents of the page are available in $res->content.

The canonical way of proceeding from here is to use either HTML::Parser or
HTML::TokeParser. Do *NOT* attempt to hack the result with regular
expressions; you will only drive yourself crazy and people will make fun of
you afterwards. I have a preference for HTML::Parser. Learn it, and it will
serve you well.
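
To give an idea of what the HTML::Parser route looks like, here is a minimal
sketch using its version 3 handler interface. It assumes the figures you want
sit in ordinary <td> cells inside table rows, which may not match your page.
The function name extract_rows and the sample HTML are made up for
illustration.

```perl
use strict;
use warnings;
use HTML::Parser;

# Collect the text of every <td> cell, grouped by table row.
# Returns a list of array refs, one per <tr>.
sub extract_rows {
    my ($html) = @_;
    my (@rows, $in_cell);

    my $p = HTML::Parser->new(
        api_version => 3,
        # Start tags: open a new row on <tr>, a new cell on <td>.
        start_h => [ sub {
            my ($tag) = @_;
            if ($tag eq 'tr') {
                push @rows, [];
            }
            elsif ($tag eq 'td') {
                push @rows, [] unless @rows;   # tolerate a <td> with no <tr>
                push @{ $rows[-1] }, '';
                $in_cell = 1;
            }
        }, 'tagname' ],
        # End tags: stop collecting text when the cell closes.
        end_h  => [ sub { $in_cell = 0 if $_[0] eq 'td' }, 'tagname' ],
        # Text: append entity-decoded text to the current cell.
        text_h => [ sub { $rows[-1][-1] .= $_[0] if $in_cell }, 'dtext' ],
    );

    $p->parse($html);
    $p->eof;
    return @rows;
}

# Hypothetical sample input; in practice pass $res->content instead.
my $html = '<table><tr><td>ACME</td><td>12.50</td></tr>'
         . '<tr><td>Initech</td><td>7.25</td></tr></table>';
print join("\t", @$_), "\n" for extract_rows($html);
```

Text can arrive in several chunks, which is why the text handler appends to
the current cell rather than assigning to it.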

Once you have extracted the data you're interested in, it is a trivial
matter to insert it into a database.
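
For the insert itself, one possible shape is sketched below using DBI with
placeholders. It is only an illustration: the table "prices", its columns
"symbol" and "price", the function store_rows and the PRICES_DSN, PRICES_USER
and PRICES_PASS environment variables are all assumptions of mine, and the
same loop could just as well be written against Sybase::DBlib or Sybase::CTlib.

```perl
use strict;
use warnings;
use DBI;

# Insert each [symbol, price] pair through one prepared statement.
# Placeholders leave the quoting to the driver.
# Table and column names are hypothetical.
sub store_rows {
    my ($dbh, $rows) = @_;
    my $sth = $dbh->prepare(
        'INSERT INTO prices (symbol, price) VALUES (?, ?)'
    );
    my $count = 0;
    for my $row (@$rows) {
        $sth->execute(@$row);
        $count++;
    }
    return $count;
}

# Connect only when a DSN is supplied in the (hypothetical) PRICES_DSN
# variable, for example 'dbi:Sybase:server=MYSERVER' with DBD::Sybase.
if (my $dsn = $ENV{PRICES_DSN}) {
    my $dbh = DBI->connect($dsn, $ENV{PRICES_USER}, $ENV{PRICES_PASS},
                           { RaiseError => 1, AutoCommit => 1 });
    store_rows($dbh, [ ['ACME', 12.50], ['Initech', 7.25] ]);
    $dbh->disconnect;
}
```

With RaiseError set, a failed prepare or execute dies with the server's
message, so the loop needs no explicit error checks.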