sybperl-l Archive
From: "Scott Zetlan" <scottzetlan at aol dot com>
Subject: Re: Switch SEPARATOR between different BCP runs
Date: Oct 31 2003 1:32PM
When I was porting BCP.pm to BLK.pm, I experimented with removing that
/o optimisation. I discovered that on my architecture (puny Sun Ultra
5) it made no difference whatsoever. Since it caused no harm, I left it
in, following the example in BCP.pm (which uses DB-Library calls instead
of CT-Library calls).
Anyone have any empirical evidence of an advantage to leaving the /o in
place? If not, I suggest it be removed from the module entirely.
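One way to gather that evidence is a quick comparison with the core Benchmark module. This is only a sketch under stated assumptions: the 20-field tab-separated line and the labels with_o and without_o are made up for illustration, and real BLK runs spend most of their time in I/O and CT-Library calls, so the split itself may not matter much either way.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data: one tab-separated line with 20 fields.
my $sep = '\t';
my $ln  = join("\t", map { "field$_" } 1 .. 20);

my (@with_o, @without_o);

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    with_o    => sub { @with_o    = split(/$sep/o, $ln, -1) },
    without_o => sub { @without_o = split(/$sep/,  $ln, -1) },
});
```

Per perlop, perl does not recompile an interpolated pattern unless the interpolated variable has changed, so a small or unmeasurable difference would not be surprising.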
Scott
Michael Peppler wrote on 10/30/2003, 7:52 PM:
> On Thu, 2003-10-30 at 13:29, Michael Peppler wrote:
> > On Thu, 2003-10-30 at 11:33, Lin, Arthur wrote:
> > >
> > > For the second BCP run I set
> > >
> > > $GP_BLK->config(
> > >     FIELDS => 3,
> > >     BATCH_SIZE => 6000,
> > >     SEPARATOR => '\t');
> > >
> > > die "\tBCP in $ListTable failed\n"
> > >     unless ( $GP_BLK->run == $RowCount );
> > > unlink $ListFile, "$ListFile.err";
> > >
> > > But it fails because it does not take '\t' as a separator.
> > >
> > > Am I doing something wrong when resetting the separator?
> >
> > No - I think that you've hit a bug in the BLK module. I have to admit
> > that I don't use it myself, and I wrote the original code many years
> > ago...
>
> Right - here's the problem:
>
> The _readln() and _readln_meta() subroutines in BLK.pm, which read a
> line of data from the bcp file and split it on the separator, use a
> regular expression with the /o switch. That switch tells perl to
> compile the regular expression only once, an optimization that works
> well as long as a single program uses only one separator. But it also
> means that when the separator changes, perl never sees the new value
> and keeps splitting on the old one, so things break.
>
> For now you can fix the problem by changing the line
>     @d = split(/$sep/o, $ln, -1);
> to
>     @d = split(/$sep/, $ln, -1);
> which makes perl re-interpolate $sep for each line. I haven't
> benchmarked this, so I don't know what it costs in performance.
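
To make the failure mode concrete, here is a minimal standalone sketch. It is not code from BLK.pm: the helpers split_with_o and split_without_o are hypothetical names, and only the two split() calls mirror the lines quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: same split as the original line, with /o.
# The pattern is built from the first $sep it sees and never rebuilt.
sub split_with_o {
    my ($sep, $ln) = @_;
    my @d = split(/$sep/o, $ln, -1);
    return @d;
}

# Hypothetical helper: same split without /o.
# $sep is interpolated again on every call.
sub split_without_o {
    my ($sep, $ln) = @_;
    my @d = split(/$sep/, $ln, -1);
    return @d;
}

# First "run" uses a comma, second "run" switches to a tab.
my @o_first     = split_with_o(',',  "a,b,c");
my @o_second    = split_with_o('\t', "a\tb\tc");
my @no_o_first  = split_without_o(',',  "a,b,c");
my @no_o_second = split_without_o('\t', "a\tb\tc");

printf "with /o:    %d fields, then %d\n", scalar(@o_first), scalar(@o_second);
printf "without /o: %d fields, then %d\n", scalar(@no_o_first), scalar(@no_o_second);
```

With /o the second call still splits on the comma, so the tab-separated line comes back as a single field, which matches the symptom reported for the second BCP run.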
>
> BTW - this is the same problem as bug id 410 in the sybperl bug database
> (http://www.peppler.org/cgi-bin/bug.cgi?__state=2&id=410)
>
> Michael
> --
> Michael Peppler Data Migrations, Inc.
> mpeppler@peppler.org http://www.mbay.net/~mpeppler
> Sybase T-SQL/OpenClient/OpenServer/C/Perl developer available for
> short or long term contract positions -
> http://www.mbay.net/~mpeppler/resume.html
>