the large file challenge
Yennie at aol.com
Sun Nov 10 19:33:01 EST 2002
All right... I tweaked a little more outside of email.
For accuracy in the case where "mystic_mouse" occurs multiple times on one
line, uncomment the line:
"add offset(return, thisChunk, theOffset) to theOffset"
That line just skips ahead to the next line whenever a match is found, so each line is counted at most once.
This should run faster than my previous attempts:
on startup
  ## initialize variables: try adjusting chunkSize (passed in as $1)
  put "/gig/tmp/log/access_log" into the_file
  put ($1*1024*1024) into chunkSize ## $1 is the chunk size in MB
  put 0 into counter
  put FALSE into isEOF
  open file the_file for read
  repeat until (isEOF = TRUE)
    ## read the next chunkSize characters, check if we are at the end of the file
    ## (a match split across two chunks is not counted)
    read from file the_file for chunkSize
    put it into thisChunk
    put (the result = "eof") into isEOF
    ## count the number of matches in this chunk
    put 0 into theOffset
    repeat
      ## offset() returns a position relative to the chars skipped, 0 if not found
      get offset("mystic_mouse", thisChunk, theOffset)
      if (it = 0) then exit repeat
      add 1 to counter
      ## skip to the last char of this 12-char match
      put theOffset + it + 11 into theOffset
      ## add offset(return, thisChunk, theOffset) to theOffset
    end repeat
  end repeat
  close file the_file
  put counter
end startup
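
In case the skip arithmetic is not obvious: when offset() is given a third parameter, the value it returns is relative to the characters skipped, so it has to be added back to get an absolute position. Here is a minimal sketch of the line-skipping variant as a standalone function. The handler name countMatchingLines and its parameter and variable names are made up for illustration; they are not part of the script above.

```livecode
function countMatchingLines pNeedle, pText
  ## assumes pNeedle does not itself contain a return
  put 0 into tCount
  put 0 into tSkip ## number of chars already examined
  repeat
    ## position of the next match, relative to tSkip (0 if none)
    get offset(pNeedle, pText, tSkip)
    if it = 0 then exit repeat
    add 1 to tCount
    add it to tSkip ## tSkip is now the absolute start of the match
    ## jump to the return that ends this line, if there is one
    get offset(return, pText, tSkip)
    if it = 0 then exit repeat ## last line, nothing left to search
    add it to tSkip ## tSkip is now the position of that return
  end repeat
  return tCount
end countMatchingLines
```

For example, countMatchingLines("mystic_mouse", "a mystic_mouse b mystic_mouse" & return & "no match" & return & "mystic_mouse") should give 2: the first line counts once even though it has two matches, and the second line has none.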
HTH.
Brian