sybperl-l Archive
From: Michael Peppler <mpeppler at peppler dot org>
Subject: Re: Switch SEPARATOR between different BCP runs
Date: Oct 31 2003 12:52AM
On Thu, 2003-10-30 at 13:29, Michael Peppler wrote:
> On Thu, 2003-10-30 at 11:33, Lin, Arthur wrote:
> >
> > For the second BCP run I set
> >
> > $GP_BLK->config(
> >     FIELDS     => 3,
> >     BATCH_SIZE => 6000,
> >     SEPARATOR  => '\t');
> >
> > die "\tBCP in $ListTable failed\n" unless ( $GP_BLK->run == $RowCount );
> > unlink $ListFile, "$ListFile.err";
> >
> > But it fails because it does not take '\t' as a separator.
> >
> > Am I doing something wrong here to reset the separator ?
>
> No - I think that you've hit a bug in the BLK module. I have to admit
> that I don't use it myself, and I wrote the original code many years
> ago...
Right - here's the problem:

The _readln() and _readln_meta() subroutines in BLK.pm, which read a
line of data from the bcp file and split it on the separator, use a
regular expression with the /o switch. The /o switch tells perl to
interpolate and compile the regular expression only once. That
optimization works really well as long as a single program only uses
one separator. But when the separator changes, perl keeps using the
pattern it compiled for the first one, and of course things break.
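
A minimal sketch of the effect, outside of BLK.pm. The subroutine names split_cached() and split_fresh() are made up for this illustration; they only mimic the split step, not the file handling that the real _readln() does.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the split in BLK.pm's _readln():
# with /o, $sep is interpolated only the first time this split runs.
sub split_cached {
    my ($sep, $ln) = @_;
    return split(/$sep/o, $ln, -1);
}

# Same split without /o: $sep is interpolated on every call.
sub split_fresh {
    my ($sep, $ln) = @_;
    return split(/$sep/, $ln, -1);
}

# First "run" uses '|' as the separator, the second uses a tab.
my @first_cached  = split_cached('\|', "a|b|c");
my @second_cached = split_cached('\t', "a\tb\tc");

my @first_fresh   = split_fresh('\|', "a|b|c");
my @second_fresh  = split_fresh('\t', "a\tb\tc");

printf "with /o:    %d fields, then %d\n",
    scalar @first_cached, scalar @second_cached;
printf "without /o: %d fields, then %d\n",
    scalar @first_fresh, scalar @second_fresh;
```

With /o the second call still splits on '|', so the tab-separated line comes back as a single field.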

For now you can fix the problem by changing the

    @d = split(/$sep/o, $ln, -1);

line to

    @d = split(/$sep/, $ln, -1);

which will force the regular expression to be re-evaluated for each
line. I haven't benchmarked this to see what it costs in terms of
performance.

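
One way to get a rough idea of the cost is the core Benchmark module. This is only a sketch: the sample line is made up, and the qr_object variant (compiling the separator once with qr// whenever it is set) is an alternative that is not part of BLK.pm or of the fix above. Numbers will vary with the perl version and the data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $sep = '\t';                  # separator as it would be passed to config()
my $ln  = join("\t", 1 .. 20);   # made-up sample line with 20 fields
my $re  = qr/$sep/;              # assumption: compiled once, up front

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    with_o    => sub { my @d = split(/$sep/o, $ln, -1) },
    without_o => sub { my @d = split(/$sep/,  $ln, -1) },
    qr_object => sub { my @d = split($re,     $ln, -1) },
});
```

If the difference turns out to matter, the qr// form would let the pattern be rebuilt only when the separator actually changes, instead of once per program (as with /o) or once per line.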
BTW - this is the same problem as bug id 410 in the sybperl bug database
(http://www.peppler.org/cgi-bin/bug.cgi?__state=2&id=410)
Michael
--
Michael Peppler Data Migrations, Inc.
mpeppler@peppler.org http://www.mbay.net/~mpeppler
Sybase T-SQL/OpenClient/OpenServer/C/Perl developer available for short or
long term contract positions - http://www.mbay.net/~mpeppler/resume.html