sybperl-l Archive

From: "David LANDGREN" <dlandgre at bpinet dot com>
Subject: Re: Reading Web Pages
Date: Jun 1 2001 12:46PM

|I'm wanting to read some Web page content and stuff it into a server
|prices/yields etc) for longer term analysis.
|Does anyone have any code samples doing something similar?

Are you getting stuck on something specifically Sybase, or on something
upstream? If the latter, the following should get you started.

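use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $url = '';                               # address of the page to fetch
my $ua  = LWP::UserAgent->new;              # the HTTP client
my $req = HTTP::Request->new(GET => $url);  # build a GET request
my $res = $ua->request($req);               # send it and collect the response

# Stop with the HTTP status line if the fetch failed.
die "Error: $url: @{[$res->status_line]}\n" unless $res->is_success;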

From here, the contents of the page are available in $res->content.

The canonical way of proceeding from here is to use either HTML::Parser or
HTML::TokeParser. Do *NOT* attempt to hack the result with regular
expressions; you will only drive yourself crazy and people will make fun of
you afterwards. I have a preference for HTML::Parser. Learn it, and it will
serve you well.
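
To give an idea of what that looks like, here is a minimal sketch using the
version 3 HTML::Parser interface. It assumes the figures you want sit in the
<td> cells of a table; the sample markup, the sub name extract_rows and the
symbols in it are made up for illustration, so adapt the handlers to the page
you are actually reading.

```perl
use strict;
use warnings;
use HTML::Parser;

# Collect the text of every <td> cell, grouped by <tr> row.
# Returns a reference to an array of rows, each row an array of cell strings.
sub extract_rows {
    my ($html) = @_;
    my @rows;
    my $in_cell = 0;

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Start tags: open a new row on <tr>, a new cell on <td>.
        start_h => [ sub {
            my ($tag) = @_;
            if ($tag eq 'tr') {
                push @rows, [];
            }
            elsif ($tag eq 'td' && @rows) {
                $in_cell = 1;
                push @{ $rows[-1] }, '';
            }
        }, 'tagname' ],
        # End tags: leaving a cell stops text collection.
        end_h => [ sub {
            my ($tag) = @_;
            $in_cell = 0 if $tag eq 'td';
        }, 'tagname' ],
        # Text may arrive in several chunks, so append to the current cell.
        # 'dtext' means entities such as &amp; are already decoded.
        text_h => [ sub {
            my ($text) = @_;
            $rows[-1][-1] .= $text if $in_cell;
        }, 'dtext' ],
    );

    $parser->parse($html);
    $parser->eof;
    return \@rows;
}

# Hypothetical sample page; in practice pass $res->content instead.
my $html = '<table>'
         . '<tr><td>ACME</td><td>12.50</td></tr>'
         . '<tr><td>Initech</td><td>3.75</td></tr>'
         . '</table>';

for my $row (@{ extract_rows($html) }) {
    print join("\t", @$row), "\n";
}
```

The handlers only track whether the parser is inside a cell, which is usually
enough for a simple table. Nested tables or cells spread over <th> tags would
need a little more state.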

Once you have extracted the data you're interested in, it is a trivial
matter to insert it into a database.
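
For the insertion step, one possible sketch uses DBI with placeholders. This
is an assumption on my part rather than the only way to do it (Sybase::CTlib
or Sybase::DBlib would work too). The table name prices, its columns symbol
and price, the sub name insert_rows and the connection details in the
comments are all hypothetical.

```perl
use strict;
use warnings;
use DBI;

# Insert each [symbol, price] pair through one prepared statement.
# The table and column names are hypothetical.
# Returns the number of rows handed to the server.
sub insert_rows {
    my ($dbh, $rows) = @_;

    my $sth = $dbh->prepare(
        'INSERT INTO prices (symbol, price) VALUES (?, ?)'
    );

    my $count = 0;
    for my $row (@$rows) {
        $sth->execute(@$row);   # placeholders avoid quoting problems
        $count++;
    }
    return $count;
}

# Usage, with placeholder connection details:
#
#   my $dbh = DBI->connect('dbi:Sybase:server=MYSERVER', $user, $password,
#                          { RaiseError => 1 });
#   insert_rows($dbh, [ ['ACME', 12.50], ['Initech', 3.75] ]);
#   $dbh->disconnect;
```

With RaiseError set on the handle, a failed prepare or execute dies with the
server's message, so the loop needs no error checking of its own.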