sybperl-l Archive
From: Michael Peppler <mpeppler at peppler dot org>
Subject: RE: out of memory -- CTlib
Date: Feb 8 2001 10:18PM
Cox, Mark writes:
> Thanks for the help and suggestions.
>
> --- The scalar(localtime) change made a surprising difference in processing
> time.
fork/exec is expensive, as you noticed.
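For reference, a minimal sketch of the progress message without the external date command. The loop is only a stand-in for the real fetch loop, and the helper name progress_line is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: builds the progress line without spawning `date`.
sub progress_line {
    my ($count) = @_;
    # In scalar context localtime returns a ctime-style string,
    # so no child process is started.
    return "$count Records processed at " . scalar(localtime) . "\n";
}

# Stand-in for the real fetch loop; $y counts the records processed.
for my $y (1 .. 30000) {
    print progress_line($y) if $y % 10000 == 0;
}
```

Unlike `date`, scalar(localtime) has no trailing newline, which is why the helper appends one.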
>
> --- I ended up increasing the swap size on the Unix box and this solved the
> memory problem for the moment.
>
> --- I am looking at writing it out to a file instead of keeping it all in
> memory but I need to see if this is any better than just reading it directly
> from the database for each record. The main advantage to keeping all of the
> info in memory is the enormous decrease in processing time. Typically 2hrs
> becomes 10 min. I will let you know how it goes.
Consider using a dbm file (or something similar) to manage the
size. This will let you use a hash (although the "value" part of the
hash should probably be a string) and will use a constant amount of
memory.
You could do something like this (off the top of my head - not
guaranteed to be correct!):
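    use Fcntl;        # supplies the O_CREAT and O_RDWR constants
    use NDBM_File;
    my %hash;
    tie(%hash, 'NDBM_File', '/tmp/my_dbm_file', O_CREAT|O_RDWR, 0666)
        or die "Cannot tie /tmp/my_dbm_file: $!";
    ....
    while(@data = $dbh->ct_fetch) {
        # key on the first column, store the whole row as one string
        $hash{$data[0]} = join('|', @data);
    }
    ....
    untie(%hash);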
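The lookup side might look roughly like the sketch below. It is self-contained, so the rows are hard-coded stand-ins for what ct_fetch would return, and the file name and the lookup() helper are made up. It uses SDBM_File only because that module ships with every Perl; NDBM_File should work the same way where it is available. Note that SDBM limits each key/value pair to roughly 1 KB, and that this approach breaks if a field itself contains a '|'.

```perl
use strict;
use warnings;
use Fcntl;                      # O_CREAT, O_RDWR
use SDBM_File;                  # assumption: swapped in for NDBM_File
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/lookup";       # hypothetical file name

my %hash;
tie(%hash, 'SDBM_File', $file, O_CREAT|O_RDWR, 0666)
    or die "Cannot tie $file: $!";

# Stand-in for the rows that ct_fetch would return.
my @rows = ( [ 1001, 'Smith', 'NY' ], [ 1002, 'Jones', '' ] );
for my $row (@rows) {
    $hash{ $row->[0] } = join('|', @$row);
}

# Lookup side: '|' is a regex metacharacter, so it must be escaped
# in split. The -1 limit keeps trailing empty fields.
sub lookup {
    my ($key) = @_;
    return unless exists $hash{$key};
    return split(/\|/, $hash{$key}, -1);
}

my @rec = lookup(1002);
print join(',', @rec), "\n";

untie(%hash);
```

Only the record being looked up is held in memory at any time; the rest stays on disk in the dbm file.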
You could also use the Storable module (get it from CPAN) to
"serialize" the data in @data to store/retrieve it from the dbm file.
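A rough sketch of the Storable variant, under the same assumptions as before: the row is a hard-coded stand-in for ct_fetch output, the file name is made up, and SDBM_File is used only because it is always installed (its limit of roughly 1 KB per key/value pair applies to the frozen string too).

```perl
use strict;
use warnings;
use Fcntl;                      # O_CREAT, O_RDWR
use SDBM_File;                  # assumption: swapped in for NDBM_File
use Storable qw(freeze thaw);
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my %hash;
tie(%hash, 'SDBM_File', "$dir/lookup", O_CREAT|O_RDWR, 0666)
    or die "Cannot tie dbm file: $!";

# Stand-in for one row from ct_fetch; note the '|' inside a field
# and the undef (NULL) column, both of which join('|', ...) would mangle.
my @data = (1001, 'Smith|Jones', undef);

# freeze() takes a reference and returns a binary string.
$hash{ $data[0] } = freeze(\@data);

# thaw() turns the stored string back into an array reference.
my $row = thaw($hash{1001});
print "name: $row->[1]\n";

untie(%hash);
```

If the dbm file has to be read on a machine with a different architecture, Storable's nfreeze() is the portable counterpart of freeze().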
Michael
> -----Original Message-----
> From: Michael Peppler [mailto:mpeppler@peppler.org]
> Sent: Thursday, February 08, 2001 11:15 AM
> To: SybPerl Discussion List
> Subject: Re: out of memory -- CTlib
>
>
> Cox, Mark writes:
> >
> > Any suggestions or help would be welcome.
> >
> > I am using ct_lib to select large look-up tables from the data base for
> > feed processing. I tend to assign all of the info in the data base into
> > a hash keyed on a specific value in the database and then read the file
> > line by line using the key as a quick lookup. What I am running into
> > however is that if I try to read in more than 100,000 records or so I
> > get an 'Out of Memory!' error. Is there a more efficient way to read in
> > a large number of records into a hash table? Any help or suggestions
> > would be most welcome.
>
> 100,000 records in a hash table is quite a lot. Have you checked with
> ps or top to see how much memory you are using? Do you have
> limit/ulimit set?
>
> I don't see any obvious problems with your code.
> > if (!($y % 10000) && ($y !=0)) {
> > print "$y Records processed at " , `date`;
> > }
>
> You can use scalar(localtime) instead of `date` which will avoid a
> fork()/exec() and should speed things up.
>
> Michael
> --
> Michael Peppler - Data Migrations Inc. - mpeppler@peppler.org
> http://www.mbay.net/~mpeppler - mpeppler@mbay.net
> International Sybase User Group - http://www.isug.com
> Sybase on Linux mailing list: ase-linux-list@isug.com
>
--
Michael Peppler - Data Migrations Inc. - mpeppler@peppler.org
http://www.mbay.net/~mpeppler - mpeppler@mbay.net
International Sybase User Group - http://www.isug.com
Sybase on Linux mailing list: ase-linux-list@isug.com