PEPPLER.ORG
Michael Peppler
Sybase Consulting

sybperl-l Archive


From: mpeppler at itf dot ch (Michael Peppler)
Subject: Re: BCP module
Date: Feb 19 1996 3:47PM

> From: soup@ampersand.com (Doug Campbell)
> ] Date: Fri, 16 Feb 96 15:48:00 +0100
> ] From: mpeppler@itf.CH (Michael Peppler)
> ] 
> ] One of the problems is speed. My initial implementation is approx 4x
> ] slower than pure bcp (though I was running perl and the Sybase server
> ] on the same machine, so cpu contention was high). But if you need to
> ] munge the data before sending it into the dataserver then the speed
> ] issue is less important - you still have to process the datafile, which
> ] will take time.
> 
> I've written and been maintaining for 4 years some sybperl software
> that parses ASCII formatted input files and does BCP loads from them.
> I've found that Sybase can always suck in BCP much faster than I can
> parse the input and feed it.  In fact, I can't seem to get the perl
> parsing to go any better than about 4-6 times slower than the Sybase
> BCP.  The parsing involves field rearranging, field validity checking,
> and callbacks for computed values, and other stuff.  Sound familiar?

I've run some tests, and I can now confirm that the slowdown is in the
Sybase bcp_sendrow() routine. The Perl parsing isn't so bad at all:
replacing the call to bcp_sendrow() with a print to /dev/null improves
the speed by a factor of 2.5.
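
A minimal sketch of that kind of measurement, for anyone who wants to
repeat it on their own data. This is not the Sybase::BCP code: the names
parse_line and time_load are hypothetical, the parser only splits on '|',
and the input is synthetic. It times the parse loop with a sink that
prints to the null device, so the cost of sending rows to the server is
left out entirely.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use Time::HiRes ();

# Hypothetical stand-in for the per-row work: split a '|'-separated line
# into fields. The real module does more (reordering, date conversion).
sub parse_line {
    my ($line) = @_;
    chomp $line;
    return [ split /\|/, $line, -1 ];
}

# Run every line through the parser, hand each parsed row to $sink, and
# return the elapsed wall-clock time in seconds.
sub time_load {
    my ($lines, $sink) = @_;
    my $start = Time::HiRes::time();
    $sink->(parse_line($_)) for @$lines;
    return Time::HiRes::time() - $start;
}

# Synthetic input: 10000 rows of 11 columns, the same shape as the test file.
my @lines = map { join('|', "CIS$_", 1 .. 10) . "\n" } 1 .. 10_000;

# Null sink: parsed rows are printed to the null device instead of being
# sent to the dataserver.
my $devnull = File::Spec->devnull;
open my $null, '>', $devnull or die "cannot open $devnull: $!";
my $elapsed = time_load(\@lines, sub { print {$null} join('|', @{ $_[0] }), "\n" });
close $null;

printf "parse + print to null device: %.3f s for %d rows\n",
    $elapsed, scalar @lines;
```

Swapping the sink for one that really sends the row and comparing the two
timings gives the share of time spent outside the Perl parsing.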

I've made a small change to bcp_sendrow in DBlib which should improve
the speed a little bit, and I'm now down to a slowdown factor of approx
3.3 compared to pure bcp when processing a 10000 row / 11 column file.

Callbacks are processed very efficiently, and I was not able to measure
a *real* difference with or without them...


The current implementation looks like this:

#!/usr/local/bin/perl

use Sybase::BCP;
require 'sybutil.pl';   # for the standard sybperl error handlers

$bcp = new Sybase::BCP 'sa', undef, 'TROLL';	# user, password, server

$bcp->config(INPUT => '../../Sybperl/xab',
	     OUTPUT => 'excalibur.dbo.t3',
	     BATCH_SIZE => 200,			# default 100
	     FIELDS => 4,			# default: number of fields in
	     					# first line of input file.
	     REORDER => {1 => 'account',	# Reorder input columns.
			 3 => 'date',		# we specify table cols by name
			 2 => 'seq_no',		# Any columns not mentioned
			 11 => 'broker'},	# are skipped.
	     CALLBACK => \&check_line,		# This will be called after
	     					# reordering the columns.
	     DATE => 'CTIME',			# Dates are to be converted from
	     					# unix localtime format.
	     					# We look up the column data
	     					# types in the system tables to 
	     					# figure out which columns 
	     					# need to get converted.
	     SEPARATOR => '|');

$bcp->run;	# do it :-)

# This is the callback that can veto the insertion of a line from the
# input file.
sub check_line {
    my $d_ref = shift;

    # I only want to insert records where the first field starts with CIS
    return undef if($$d_ref[0] !~ /^CIS/);

    1;
}
	
__END__
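
To make the REORDER and CALLBACK comments above concrete, here is a
stand-alone sketch in plain Perl, with no Sybase calls. It is an
illustration of the behaviour described in the comments, not the module's
internals: reorder_row and filter_lines are hypothetical names, the table
column order (account, seq_no, date, broker) is an assumption, and the
DATE conversion is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build a row in table-column order from one input
# line. %$reorder maps 1-based input field numbers to table column names;
# input fields that are not mentioned are dropped.
sub reorder_row {
    my ($fields, $reorder, $table_cols) = @_;
    my %by_name = map { $reorder->{$_} => $fields->[$_ - 1] } keys %$reorder;
    return [ map { $by_name{$_} } @$table_cols ];
}

# Same veto rule as in the script above: only keep rows whose first
# (reordered) column starts with CIS.
sub check_line {
    my $d_ref = shift;
    return undef if $d_ref->[0] !~ /^CIS/;
    return 1;
}

# Hypothetical driver: split each line, reorder it, then let the callback
# veto the row. Returns a reference to the list of accepted rows.
sub filter_lines {
    my ($lines, $sep, $reorder, $table_cols, $callback) = @_;
    my @rows;
    for my $line (@$lines) {
        chomp(my $copy = $line);        # leave the caller's data untouched
        my @fields = split /\Q$sep\E/, $copy, -1;
        my $row = reorder_row(\@fields, $reorder, $table_cols);
        next unless defined $callback->($row);
        push @rows, $row;
    }
    return \@rows;
}

my %reorder    = (1 => 'account', 3 => 'date', 2 => 'seq_no', 11 => 'broker');
my @table_cols = qw(account seq_no date broker);   # assumed order in t3

# Two made-up 11-field input lines; only the first passes check_line.
my @lines = (
    join('|', 'CIS001', 7, 'Mon Feb 19 15:47:00 1996', ('x') x 7, 'ACME') . "\n",
    join('|', 'XYZ002', 8, 'Mon Feb 19 15:48:00 1996', ('x') x 7, 'ACME') . "\n",
);

my $rows = filter_lines(\@lines, '|', \%reorder, \@table_cols, \&check_line);
print join('|', @$_), "\n" for @$rows;
```

The callback sees the row after reordering, so index 0 is the account
column here only because of the assumed table layout.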

You can also specify a callback for each field.

Note that at the moment, Sybase::BCP requires sybperl 2.04 which you
guys ain't got yet... (but I can probably make it work with 2.03 and
perl 5.001m as well, in case there are some of you who wish to try it
out...)

Comments are welcome!

Michael