the large file challenge

Sadhunathan Nadesan sadhu at castandcrew.com
Sun Nov 10 16:27:01 EST 2002


OK, here are the results so far. Each entry shows the start time, the
count reported, and the end time:


bash
Sun Nov 10 13:01:59 PST 2002
17333
Sun Nov 10 13:03:43 PST 2002

pascal
Sun Nov 10 13:03:43 PST 2002
17333
Sun Nov 10 13:05:47 PST 2002

andu's metacard
Sun Nov 10 13:05:47 PST 2002
29623
Sun Nov 10 13:08:10 PST 2002

pierre's metacard
Sun Nov 10 13:08:10 PST 2002
17338
Sun Nov 10 13:10:21 PST 2002

bruce's metacard
Sun Nov 10 13:10:21 PST 2002
33351
Sun Nov 10 13:14:59 PST 2002


That works out to the following elapsed times (min:sec):

bash	1:44
pascal	2:04
Andu	2:23
Pierre	2:11
Bruce	4:38
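
For anyone who wants to check the arithmetic, the elapsed times follow
directly from the start and end stamps listed above. A small Python
sketch (the `runs` table only restates the timestamps from this message;
the helper name `elapsed` is an illustrative choice):

```python
from datetime import datetime

# Start and end times copied from the results above
# (all on Sun Nov 10 2002, PST).
runs = {
    "bash":   ("13:01:59", "13:03:43"),
    "pascal": ("13:03:43", "13:05:47"),
    "Andu":   ("13:05:47", "13:08:10"),
    "Pierre": ("13:08:10", "13:10:21"),
    "Bruce":  ("13:10:21", "13:14:59"),
}


def elapsed(start, end):
    """Return the time between two HH:MM:SS stamps as 'M:SS'."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    seconds = int(delta.total_seconds())
    return f"{seconds // 60}:{seconds % 60:02d}"


for name, (start, end) in runs.items():
    print(f"{name}\t{elapsed(start, end)}")
```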

It is quite possible that I have mixed up exactly what came from whom;
sorry about that!  My apologies if your name is not attached to your
own contribution.

Now, why did we get different counts?  I believe the count of 17333,
which both the bash and pascal versions produced, is correct.  Maybe
someone can debug the MetaCard scripts.
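
One way to get a reference figure to debug against is to stream the file
one line at a time and count the lines that contain the pattern, which
avoids chunk boundaries altogether. A minimal Python sketch, assuming
the goal is a count of matching lines rather than of total occurrences
(the function name `count_matching_lines` is hypothetical; the path and
pattern are the ones used in the scripts in this message):

```python
import os


def count_matching_lines(path, pattern):
    """Count the lines in `path` that contain `pattern` at least once."""
    needle = pattern.encode()
    count = 0
    # Binary mode avoids decoding errors on odd bytes in a web log.
    with open(path, "rb") as log:
        for line in log:  # a file object yields one line at a time
            if needle in line:
                count += 1
    return count


if __name__ == "__main__":
    path = "/gig/tmp/log/access_log"
    if os.path.exists(path):
        print(count_matching_lines(path, "mystic_mouse"))
```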



Here's the code:

Andu
----
#!/usr/local/bin/mc

on startup
  put 0 into the_counter
  put 1 into the_offset
  put 333491183 into file_size
  put   30000 into the_increment
  put "/gig/tmp/log/access_log" into the_file
  put "mystic_mouse" into pattern

  open file the_file for read

  repeat until (the_offset >= file_size)
    read from file the_file at the_offset for the_increment
    put it into the_text
    repeat for each line this_line in the_text
      get offset(pattern, this_line)
      if (it is not 0) then add 1 to the_counter
    end repeat
    add the_increment to the_offset
  end repeat

  put the_counter
end startup
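
Reading in fixed 30000-byte chunks, as above, means a chunk can end in
the middle of a line, or even in the middle of the pattern, so a
per-chunk line count can drift from the true one. Whether that accounts
for the numbers reported here has not been verified. A Python sketch of
one common workaround, carrying the trailing partial line over into the
next chunk (the function name and the default chunk size are
illustrative assumptions, not part of the original script):

```python
def count_lines_chunked(path, pattern, chunk_size=30000):
    """Count lines containing `pattern`, reading `chunk_size` bytes at a time."""
    needle = pattern.encode()
    count = 0
    leftover = b""  # partial line carried over from the previous chunk
    with open(path, "rb") as log:
        while True:
            chunk = log.read(chunk_size)
            if not chunk:
                break
            lines = (leftover + chunk).split(b"\n")
            # The last piece may be an incomplete line; hold it back
            # until the rest of it arrives in the next chunk.
            leftover = lines.pop()
            count += sum(1 for line in lines if needle in line)
    # A final line without a trailing newline still has to be checked.
    if needle in leftover:
        count += 1
    return count
```

With this approach each line is examined exactly once and in full, so
the result should not depend on the chunk size chosen.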


Pierre
------
#!/usr/local/bin/mc

on startup
  put 0 into the_counter
  put 1 into the_offset
  put 333491183 into file_size
  put   30000 into the_increment
  put "/gig/tmp/log/access_log" into the_file
  put "mystic_mouse" into pattern

  open file the_file for read

  repeat until (the_offset >= file_size)
    read from file the_file at the_offset for the_increment
    put it into the_text

     repeat until lineoffset("mystic_mouse", the_text) = 0
       if (lineoffset("mystic_mouse", the_text) is not "0") then
         add 1 to the_counter
         delete line 1 to lineoffset("mystic_mouse", the_text) of the_text
       end if
     end repeat

    add the_increment to the_offset
  end repeat

  put the_counter
end startup


Bruce
-----
#!/usr/local/bin/mc
on startup
  ## initialize variables: try adjusting numLines
  put "/gig/tmp/log/access_log" into the_file
  put $1 into numLines  -- called with 10000 as parameter
  put 0 into counter

  open file the_file

  repeat until (isEOF = TRUE)
     ## read the specified number of lines, check if we are at the end of the file
     read from file the_file for numLines lines
     put it into thisChunk
     put (the result = "eof") into isEOF

     ## count the number of matches in this chunk
     put offset("mystic_mouse", thisChunk) into theOffset
     repeat until (theOffset = 0)
        add 1 to counter
        put offset("mystic_mouse", thisChunk, theOffset) into tempOffset
        if (tempOffset > 0) then add tempOffset to theOffset
        else put 0 into theOffset
     end repeat

  end repeat

  close file the_file

  put counter
end startup
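
Note that the inner loop above steps through the chunk with `offset` and
adds one per hit, so it counts every occurrence of the pattern, whereas
a per-line test such as `offset(pattern, this_line)` counts each line at
most once. If some log lines mention the pattern more than once (say, in
both the request and the referrer), the two tallies will differ. A small
Python illustration on made-up data (the sample lines are invented for
the example and are not taken from the real log):

```python
# Three invented log lines; the first mentions the pattern twice.
sample = (
    "GET /mystic_mouse/index.html referer=/mystic_mouse/home\n"
    "GET /other/page.html referer=-\n"
    "GET /mystic_mouse/about.html referer=-\n"
)
pattern = "mystic_mouse"

# Lines that contain the pattern at least once.
matching_lines = sum(1 for line in sample.splitlines() if pattern in line)

# Every non-overlapping occurrence of the pattern in the whole text.
occurrences = sample.count(pattern)

print(matching_lines, occurrences)  # prints: 2 3
```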




