Michael Peppler
Sybase Consulting
Sybase on Linux
Install Guide for Sybase on Linux
General Sybase Resources
General Perl Resources
BCP Tool
Bug Tracker
Mailing List Archive
Downloads Directory
Sybase on Linux FAQ
Sybperl FAQ
Michael Peppler's resume

sybperl-l Archive

Up    Prev    Next    

From: Jamie Thingelstad <jjt at mail dot ConcertoTech dot com>
Subject: Re: Performance Question
Date: Feb 18 1998 11:05PM

Michael Peppler stated:
# Jamie Thingelstad wrote:
# > 
# > I've just completed writing a data loader using Sybase::DBlib and want
# > to get more performance out of it. It does an INSERT, UPDATE then a
# > stored procedure for each line, there are >25,000 lines. This
# > currently takes 40mins to run and I believe the bottleneck is
# > Sybase::DBlib or the method I am calling it with since CPU on the
# > client is low, and the server doesn't have much load at all while this
# > runs.
# Do you mean that the CPU usage on the client is high?

CPU usage is low on both systems. Database hovers one processor around
10% and client is around 10% as well. (This entire thing is done in
one $dbh to alleviate database congestion and to avoid the Sybperl
memory leak :P ).

# Are you retrieving a lot of rows from the server when you do this, or
# is this purely an insert operation?

The loader handles all of that ahead of time. It loads a little over
40MB of data into memory to use while processing. This process takes
about 2 minutes. While it loops through the file it does the INSERT,
UPDATE and the stored procedure (a extrardanarily small UPDATE) for
each row. For less than 1% of the rows there is some more complex
processing done, but this is all if'd out for the other 99%.

# Do you do a lot of processing of the data before you send it
# to Sybase?

I do some lc(), uc() and a quick RTrim on it... the data is read using
a while loop with a split (this is definitely row at a time, but I
don't suspect I/O as being a problem here, besides, doesn't Perl do
the large reads for you?)

# Have you checked the CPU usage on the server?
# Are the indexes, etc. correct on the tables that you load?

Indexes are happy, the tables are all freshly defragmented with brand
new indexes. Each table that gets the insert has about 1.4M rows in
it. CPU usage on server is light. I/O is light as well (RAID system
with RAM cache, I/O waits are essentially zilch).

# > The way I am currently doing this is building each statment in a
# > scalar then using dbcmd, dbsqlexec, dbresults, dbcancel.
# > 
# > Couple of questions:
# > 
# > * Is there a significant performance improvement if I put all commands
# > in the scalar then loop through Sybase::DBlib executing each one?
# Probably not - you will gain a little by calling fewer subs in the
# Sybase::DBlib module, but I doubt that that is where you are spending
# your time.

What I'm trying to figure out is that when I do just the INSERT and
the SP call, the process takes 25mins, after adding the update it
takes 40min. The code is no more/less complex, just added another
database call. When looking at Sybase' performance issuing the UPDATE,
it is deferred and uses the index, thus very very fast individually.

So I was left thinking that it is likely the transmission of the
commands, etc, which I don't know if that could be sped up?

# > 
# > * Is there a performance advantage to using Sybase::CTlib over
# > Sybase::DBlib?
# Not that I know of - they are approximately identical, at least
# for OC 11.x.

Is there any way to call and forget the database call? In effect, make
it asynchronous. It would seem to me that what may be happening is the
program is spending a lot of time waiting for the OK from the DB
server? I'm grasping... :)

Jamie Thingelstad, Sr. VP & CIO (Fax) 612.338.0069 (Voice) 612.338.0049x24
Concerto Technologies   123 North 3rd St. Suite 300, Minneapolis, MN 55401