\ {{{1 GNU General Public License
{
Program Tops - a stack-based computing environment
Copyright (C) 1999-2005  Dale R. Williamson

Author: Dale R. Williamson <dale.williamson@prodigy.net>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
1}}} 
}

{  File hash.v  June 2001

   Copyright (c) 2001  D. R. Williamson

   Words for making hash, and words for working with hash
 
\-----------------------------------------------------------------------

\  Top level words for hash.

      hash_make (hKeys hValue nbins qHash --- ) \ making word Hash
         - Creates a word for a hash of given paired keys and values

         - The handle to the hash will be the given quoted name Hash

         - Keys is a volume (type VOL) with no null bytes

         - Values are either matrices or volumes (MAT or VOL)

         - The hash algorithm converts the bit pattern in each row 
           of Keys to a hash index in the range 0 to nbins-1 
 
         - Memory required is proportional to the rows in Keys and
           Value, not the range nbins

      hash_lookup (hHash hKeys --- hV1 hV2 ... hVn n) \ n key vals
         - Returns values that match each key in the rows of the given 
           list of Keys; for n rows in Keys, returns n value items on 
           the stack, with count n topmost (ready for a DO loop or a
           word like pilen)

         - Keys is a volume (type VOL) with no null bytes

         - Vk returned on the stack are either all matrices (MAT) or 
           all volumes (VOL)

         - If row Keys(k) is not found, Vk will have zero rows (it will
           be purged, so beware: if pilen is used, those Vk with zero 
           rows will not appear in the packaged result, which will have
           fewer rows than incoming Keys)

      hash_add (hKeys hVals hHash --- ) \ add Keys:Vals to Hash
         - Adds paired keys and values to existing hash, Hash

      hash? (hHash --- f) \ true if stack item is a hash handle
         - Places flag on stack, true if stack item is a hash handle

\  -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -

   The words in this file are listed below.

   syspath "hash.v" cat asciiload this " inline:" grepr reach dot

   inline: hash (hT bins --- hB) \ hash indices B for rows of T
   inline: hash? (hHash --- f) \ true if stack item is a hash handle
   inline: hash_add (hKeys hVals hHash --- ) \ add Keys:Vals to Hash
   inline: hash_bin_fetch (hHash b --- hA) \ key row numbers in bin b
   inline: hash_bin_keys (hHash b --- hKeys) \ keys that occupy bin b
   inline: hash_bin_vals (hHash b --- hVals) \ values that occupy bin b
   inline: hash_bins (hHash --- nbins) \ number of bins in Hash
   inline: hash_close (hHash --- ) \ free its arrays and close Hash
   inline: hash_Keys (hHash --- hKeys) \ all keys present in Hash
   inline: hash_lookup (hHash hKeys --- hV1 hV2 ... hVn n) \ n key vals
   inline: hash_make (hKeys hValues nbins qHash --- ) \ making word Hash
   inline: hash_not (qS --- ) \ reporting not a hash
   inline: hash_ptr ( --- ptr) \ ptr being used by hash algorithm
   inline: hash_ptr_set (ptr --- ) \ set hash algorthm to use ptr
   inline: hash_set (qWord --- ) \ set hash algorithm to use ptr of Word
   inline: hash_store (hVals hHash hKeys --- ) \ replace Vals in hash
   inline: hash_Vals (hHash --- hVals) \ all values present in Hash
   inline: hash_words ( --- hNames) \ the list of all hash word names
}
\  -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -

   inline: hash (hT bins --- hB) \ hash indices B for rows of T
{     This word starts out using the program's hash algorithm in word 
      _hash.  But that can be overridden by banking here a ptr to 
      another word in place of the ptr to _hash.  Word hash_set was
      written to do this, and is used as: "myhash" hash_set.
}     [ "_hash" ptr "hash_ptr" book ] hash_ptr exe ;

   inline: hash? (hHash --- f) \ true if stack item is a hash handle
      no STR stkok not IF drop no return THEN
      this "__BinsActive" localref exists?
      IF "__nBins" extract any ELSE drop no THEN
   end

   -stkbal
   inline: hash_add (hKeys hVals hHash --- ) \ add Keys:Vals to Hash
{     Adding Keys and Vals to existing Hash.

      If Vals types old and new do not agree, or if matrix columns of
      Vals old and new do not agree, they are both made into type VOL.

      Making MAT into VOL does not change bytes (but piling appends
      blanks--20h bytes--to the smaller in the case of columns not 
      agreeing), but arrays returned from hash_lookup will be type 
      VOL, not MAT.
}
      [ defname is N ] 
      this hash? not IF N hash_not return THEN (hHash) push 

      (hKeys hVals) these hand rows them hand rows <> 
      IF " " N ": rows of Keys and Vals do not match" 
         cat cat ersys pull return 
      THEN

      over (hKeys) no that type MAT = or, swap type NUM = or
      IF " " N ": Keys must be type VOL" cat cat ersys pull return 
      THEN

      (hKeys hVals) swap (hKeys)

      (hKeys) hand peek "__Keys" extract swap (hKold hKnew) pile swap

      (hVals) hand (hVnew) peek "__Vals" extract (hVold) swap 
      (hVold hVnew) that type VOL =
      IF (hVnew) this type MAT = IF bend mat2vol THEN
      ELSE (hVold hVnew) this type MAT =
         IF these cols them cols <>
            IF swap bend mat2vol, swap bend mat2vol THEN \ both into VOL
         ELSE swap bend mat2vol swap \ making old into VOL
         THEN
      THEN (hVold hVnew) pile

      (hKeys hVals) peek "__nBins" extract
      peek hash_close
      pull hash_make
   end
   stkbal

   inline: hash_bin_fetch (hHash b --- hA) \ key row numbers in bin b
{     Returned A is a list of the 0-based row numbers of Keys that 
      occupy bin b.

      Note that incoming bin b is a 0-based index, always less than 
      the number nbins returned by word hash_bins.

      A is purged if bin b is empty.

      Matrix BinsActive in Hash has two columns:
         col 1: sorted bin numbers, 0-based
         col 2: memptr to list of 0-based key row numbers

      Matrix BinsActive serves a purpose similar to linked lists in the 
      program's use of hash for matching text to catalog names.

      In BinsActive the memptr in row k, column 2, points to a list of 
      hash table Keys rows that are common to the bin number given in 
      row k, column 1.
}
      [ defname is N ] that hash? not IF N hash_not return THEN

      this other "__nBins" extract nit no within not
      IF " " N ": bin number out of range" cat cat ersys return THEN

      swap "__BinsActive" extract swap (hB b) hash_bin_fetch1 (hA)
   end

   inline: hash_bin_keys (hHash b --- hKeys) \ keys that occupy bin b
      those hash_Keys rev hash_bin_fetch xbase +d reach ;

   inline: hash_bin_vals (hHash b --- hVals) \ values that occupy bin b
      those hash_Vals rev hash_bin_fetch xbase +d reach ;

   inline: hash_bins (hHash --- nbins) \ number of bins in Hash
      this hash?
      IF "__nBins" extract ELSE [ defname is N ] N hash_not THEN ;

   inline: hash_close (hHash --- ) \ free its arrays and close Hash
      this hash? not IF N hash_not return THEN [ defname is N ]
      this "__BinsActive" localref exists?
      IF (hHash) this "__BinsActive" extract (hA) these rows 1st
         DO (hA) this I 2nd fetch (memptr) memfree LOOP drop
         zero those "__nBins" bank
         purged those "__Vals" bank
         purged those "__Keys" bank
         purged those "__BinsActive" bank
      THEN
    \ Removing this hash name from list kept by word hash_make--yank
    \ and bank:
      (hHash) "hash_make" "HashNames" yank
         (HashNames) these rot (HashNames hHash) grepe (hRow)
         (HashNames hRow) those rows teeth rake lop (HashNames1)
      (HashNames1) "hash_make" "HashNames" bank
   end

   inline: hash_Keys (hHash --- hKeys) \ all keys present in Hash
\     Returned Keys contains all the Keys input to word hash_make.
      [ defname is N ] 
      this hash? IF "__Keys" extract ELSE N hash_not THEN ;

   inline: hash_lookup (hHash hKeys --- hV1 hV2 ... hVn n) \ n key vals
{     For n rows in Keys, look up n sets of corresponding values and
      place each one on the stack.

      Vk will have more than one row if Keys(k) is paired with more than
      one value.

      If Keys(k) is paired with no values, Vk will have zero rows.
}
      [ " " defname cat is N ] that hash? not IF N hash_not return THEN

      (hKeys) no that type MAT = or, that type NUM = or
      IF N ": Keys must be type VOL" cat ersys return THEN

      [ 128 is pad, "number of items will exceed stack depth" is msg ]
      (hKeys) these rows depth plus pad plus, depthSTK >
      IF N ": " msg cat cat ersys return THEN

      (hKeys) hand swap
      (hHash) "Hash" book

      (hKeys) this Hash, hash_bins hash (hKeys hBins)

      Hash "__Keys" extract named ptr
      Hash "__Vals" extract named ptr
      Hash "__BinsActive" extract named ptr

      hash_lookup1
   end

   inline: hash_make (hKeys hValues nbins qHash --- ) \ making word Hash
{     This word creates a hash word of the name given in string Hash.  
      A list of all hash word names is kept here in HashNames.

      A hash word puts a handle on the stack when it runs:
         Hash ( --- hHash)
 
      The library of a hash word contains the following names **:
         __Name - the hash word name
         __Keys - a volume of keys 
         __Vals - a volume or matrix of values, one row for each row 
                  of __Keys
         __nBins - the number of hash bins used for the hash algorithm
         __BinsActive - a two-column matrix needed to access the hash

         ** Normally, a hash word is a standalone word.  But if a hash 
         word is forced into the library of another word (see example 
         in man localref), then beware of conflicts of these names with
         other names the word's library; the leading double underscore 
         should help.  Without special tags on these names, however,
         only one hash word can be forced into another word's library 
         since a second hash word will definitely conflict with these 
         names given to the first one.
}
      [ no no blockofblanks "HashNames" book

      \ This is the text to make a hash word.  The word's name goes 
      \ into the variable called __Name, and whenever the hash word 
      \ runs, it places __Name on the stack; string __Name is con-
      \ sidered to be the hash handle:
           "[ defname '__Name' book ] __Name (qHash)" "hashText" book
      ]
    \ Verifying stack and sizes:
         depth three >, one STR stkok and, two NUM stkok and, not
         IF "hash_make" stknot return THEN

         three pick (hKeys) hand type VOL <>
         IF " hash_make: Keys must be type VOL or STR" ersys return 
         THEN

         three pick hand (hKeys) rows, three pick hand (hVals) rows <>
         IF " hash_make: rows of Keys and Vals do not match" 
            ersys return
         THEN

      (qHash) "Name" book
      (nbins) one max "nBins" book 
      (hKeys) hand "Vals" book 
      (hValues) hand "Keys" book

    \ Making the hash word (with catalog status message off):
         Name ctype INLI = not
         IF CATMSG (f) no catmsg
            hashText Name inlinex
            (f) catmsg
         ELSE Name hash? IF Name hash_close THEN
         THEN

       \ Adding Name to the list HashNames kept here in local lib:
         HashNames this Name grepe reach rows any not
         IF HashNames Name pile onto HashNames THEN

    \ Making hash:
         Keys nBins hash (hBins)

    \ Making Tab: column 1 holds hash bin numbers in ascending order
    \ and column 2 holds corresponding 0-based row numbers of Keys 
    \ (and Vals):
         (hBins) zero those rows items park (hTab) yes sort (hTab)

    \ Matrix Active bins created below by word hash_ActiveBins has two 
    \ columns:
    \    row k, col 1: sorted bin number (unique)
    \    row k, col 2: a memptr to a list of Keys row numbers for bin
         (hTab) hash_ActiveBins "ActiveBins" book

    \ Banking items from here into the library of the hash word and
    \ purging the ones here:
         nBins Name "__nBins" bank
         Vals Name "__Vals" bank, purged are Vals
         Keys Name "__Keys" bank, purged are Keys
         ActiveBins Name "__BinsActive" bank, purged are ActiveBins

{     Why ActiveBins here and BinsActive in the hash word? 

      Because word hash? looks to see if the word in question has an
      array BinsActive in its local library, which all hash words have.

      If this word also had an array called BinsActive, it would be
      mistaken for a hash word in the phrase: "hash_make" hash?.
}  
   end

   inline: hash_not (qS --- ) \ reporting not a hash
      strchop " " nose ": not a handle to hash" cat ersys ;

   inline: hash_ptr ( --- ptr) \ ptr being used by hash algorithm
      "hash" "hash_ptr" yank (ptr) ;

   inline: hash_ptr_set (ptr --- ) \ set hash algorthm to use ptr
      (ptr) "hash" "hash_ptr" bank ;

   inline: hash_set (qWord --- ) \ set hash algorithm to use ptr of Word
\     Word must operate with this stack diagram, same as word hash: 
\        Word (hT bins --- hB) 
      no STR stkok
      IF this exists?  
         IF \ never allow word hash; in such a case, assume _hash:
            (qWord) "hash" that alike IF drop "_hash" THEN

            (qWord) ptr hash_ptr_set return 
         THEN
      THEN
      "hash_set" stknot
   end

   inline: hash_store (hVals hHash hKeys --- ) \ replace Vals in hash
\     For Keys(k), replacing its current Vals in hash with Vals(k).
\     If Keys(k) has multiple entries, the first one receives Vals(k).
\     If Keys(k) is not found in hash, nothing is done with its Vals(k).

      [ defname is N ] that hash? not IF N hash_not return THEN

      (hKeys) no that type MAT = or, that type NUM = or
      IF " " N ": Keys must be type VOL" cat cat ersys return THEN

      them rows (Vals_rows), those rows (Keys_rows) <>
      IF " " N ": rows of Keys and Vals do not match" cat cat 
         ersys return 
      THEN

      them cols (Vals_cols) them hash_Vals cols <>
      IF " " N ": width of hash_Vals and new values do not match" 
         cat cat ersys return 
      THEN

      (hKeys) hand "Keys" book
      (hHash) "Hash" book
      (hVals) "Vals" book

    \ Making list of valid row offsets from hash_Keys to use for a rake:
      list: Keys rows 1st
         DO Hash hash_Keys Keys I quote (hKeys Key.i) grepe any?
            IF ontop \ taking the first one matched
            ELSE -1  \ Key.i not found 
            THEN
         LOOP
      end (hList) dup 0< (hR) this push \ -1 at rows where no key match

    \ Raking out rows that do not match any hash_Keys (-1 in List):
      (hList hR) rake drop (hRows)
      Vals pull (hR) rake drop (hVals) swap

    \ Ramming new values into rows of this hash's hash_Vals:
      (hVals hRows) Hash hash_Vals ram

      purged purged are Keys, are Vals
   end

   inline: hash_Vals (hHash --- hVals) \ all values present in Hash
\     Returned Vals contains all the Vals input to word hash_make.
      [ defname is N ]
      this hash? IF "__Vals" extract ELSE N hash_not THEN ;

   inline: hash_words ( --- hNames) \ the list of all hash word names
\     Names will be an empty list if there are no hash handles.
      "hash_make" "HashNames" yank any?
      IF cop ELSE no no blockofblanks THEN "_HashNames" naming ;

\-----------------------------------------------------------------------

\  Utilities that use hash.
   private, end of words for hash

\-----------------------------------------------------------------------
 
   Examples of hash 

   To run example 1, use:

      "hash.v" "Example 1" msource

   Or copy and drop line by line the text that runs an example.

   Or to study example 3: 

      - Put the following line topmost in file work.v:

           "hash.v" "Example 3" msource halt

      - Temporarily insert halt at the place in example 3 where you 
        want to stop.

      - Save this changed file hash.v (:w in vi).

      - Key ww at the ready prompt and the Example 3 text will run 
        until reaching inserted word halt.  

      - Then see the items on the stack, and move one by one through 
        the words that follow word halt.

   Also see Demonstrations below; they are concerned with workings of 
   the hash mechanism, rather than examples of using hash words.

\  -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -

   Example 1
\  Making hash and adding to it

\  Run this using:
\      "hash.v" "Example 1" msource

   "Example 1: Making hash and adding to it" nl . nl nl

\  Adapted from "Programming Perl," 3rd edition, p.11.

\  Temporarily insert halt at the place you want to stop.

\  Some key, value pattern pairs:
   depth push -strict

      Sun, Sunday
      Mon, Monday
      Tue, Tuesday
      Wed  Wednesday,
      Thu, Thursday
      Fri, Friday
      Sat, Saturday

   strict depth pull less (n) 
   (n) listn words dice (hKeys hVals)

\  Making the hash:
   (hKeys hVals) these rows (bins) "longday" hash_make

\  Making word show_em to display hash lookup results:
\     show_em (A1 A2 ... An n) showing A1, A2, ... An
   CATMSG (f) no catmsg
   inline: show_em this push, revn pull 1st nl DO .m nl LOOP ;
   (f) catmsg

\  Some lookups:
   longday "Tue" hash_lookup show_em
   longday "Fri Wed Sat Mon" words hash_lookup show_em nl

\  More pattern pairs of keys and values:
   depth push

      "April 1st" "April Fools'"
      "July 4th"  "Independence Day"
      "May 5th"   "Cinco de Mayo"
      "April 1st" "April Fools' Day"

   depth pull less (n), nit 1st DO pile LOOP dice (hKeys hVals)

\  More patterns added to hash:
   (hKeys hVals) longday hash_add

\  More lookups:
   longday 
      "Fri" "July 4th" "Wed" "April 1st" pile pile pile 
   hash_lookup 
   show_em

   end of Example 1

\  -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
 
   Example 2
\  Hash of arrays

\  Run this using:
\      "hash.v" "Example 2" msource

   "Example 2: Hash of arrays" nl . nl nl
  
\  Adapted from "Programming Perl," 3rd edition, p.276.

\  Making some utilities:
      "hash.v" "Utilities for Example 2" msource
   
\  Making some arrays:
      "fred barney wilma dino" words "flintstones" book
      "george jane elroy"      words "jetsons"     book
      "homer marge bart"       words "simpsons"    book

\  Let the hash keys be the names of the arrays without catalog tags:
      flintstones named notag
      jetsons     named notag pile
      simpsons    named notag pile (hKeys)

\  Let the hash values be the ptrs to the arrays:
      list: "flintstones" ptr, "jetsons" ptr, "simpsons" ptr ; (hVals)

\  Making the hash:
      (hKeys hVals) 16 "$HoA" hash_make

\  Accessing the hash and displaying some arrays in the hash:
      $HoA "jetsons newmans flintstones simpsons" words (hT) groups 
      nl dot nl

\  Showing a member:
      $HoA "jetsons" dup nl spaced dot 3rd member "member: " dot dot nl

\  Adding new members to an array that also happens to be in the hash:
      flintstones, "betty pebbles" words pile, "flintstones" book

\  Verifying the hash of arrays sees the new members in flintstones,
\  without ever remaking it:
      $HoA "flintstones" groups "revised " nl dot dot nl

   end of Example 2

   Utilities for Example 2

\  A word to collect arrays, called groups, from Hash:
   inline: groups (hHash hT --- hGroups)
      hand (hT) these reversed (rows are reversed)
      right justify ": " tail "Group" book

      no no blockofblanks push

      (hHash hT) hash_lookup (hVn hVn-1 ... hV1 n) 1st 
      DO Group I quote swap any? 
         IF ontop exe (hT) vol2str (qS)
         ELSE "<<<<< group not found"
         THEN cat pull, pile push
      LOOP pull
   end

\  A word to show the mth member of a group:
   inline: member (hHash qGroup m --- qMem) 
      rev (hHash qGroup) hash_lookup drop any? 
      IF ontop exe (hT) vol2str swap (m) word (0 or qMem -1) not
         IF "" THEN \ returning empty string if Group has no mth word
      ELSE drop ""  \ returning empty string if Group not found 
      THEN (qMem) ;

   end Utilities for Example 2

\  -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -

   Example 3
\  Hash of words 

\  A hash is a word, created by word hash_make, so a hash of hash is 
\  just a subset of a hash of words made here.

\  And since words can be made into ptrs, and ptrs are just numbers 
\  that can go into arrays, an array of hashes is just an array of 
\  words, also made here.

\  Run this using:
\      "hash.v" "Example 3" msource

   "Example 3: Making hash of words" nl . nl nl

\  Adapted from "Programming Perl," 3rd edition, p.242.

\  Temporarily insert halt at the place you want to stop this example
\  and do copy-and-drop.

\  Making some utilities:
   "hash.v" "Utilities for Example 3" msource

\  Making some words.  Each person in this example is defined by a
\  word containing individual vital statistics.  When one of these 
\  person-words runs, it puts its name on the stack (word makeme is
\  the word below that makes person-words--see Utilities for Example
\  3):
      "John" "47  brown  186" makeme
      "Mary" "23, hazel, 128" makeme
      "Bill" "35  blue   157" makeme

\  Note that these words are not the hash--they are simply words that 
\  go into the hash of words to be made.  The hash word will be word 
\  vitals.

\  Picking some hash keys for the words above (keys can be anything
\  unique without embedded nulls, even just the names of the words):
      "JLee MJones WSmith" words (hKeys) \ keys

\  Letting the catalog ptrs of the words be the hash values, by making
\  an array of word ptrs:
      list: "John" ptr "Mary" ptr "Bill" ptr ; (hVals) \ ptr list array

\  Making a hash of these keys and values, a word called vitals:
      (hKeys hVals) 16 (bins) "vitals" hash_make 

\  Fetching some arrays from the hash:
      vitals "WSmith RBrown MJones JLee" words hash_lookup say_who
   
\  Changing some arrays (see this word in Utilities for Example 3):
      Bill_and_Mary_go_out 

\  Viewed from the hash, seeing how Bill and Mary have changed:
      vitals "WSmith MJones" words hash_lookup say_who

   " Wow, these words can really get mean!" nl dot nl

   end of Example 3

   Utilities for Example 3 

   "makeme" exists? IF halt THEN

   inline: makeme (qName qMe --- ) \ creating a word to contain data
\     The purpose of the word that this word creates is to hold data in
\     its local library, available for other words to extract.

\     When run, the word that this word creates simply puts its name on
\     the stack.

    \ Making new word called Name, which will put Iam (its defname) on 
    \ the stack when it runs:
      (qMe) push "[ defname 'Iam' book ] Iam" that (qName) inlinex

    \ Done.  Now banking words from string Me into the new word:
      (qName) peek (qMe) 1st word drop, over "age" bank
      (qName) peek (qMe) 2nd word drop, over "eyes" bank
      (qName) pull (qMe) 3rd word drop, swap "weight" bank
   end

   inline: Hi! (ptr --- ) \ run a ptr to a word created by word makeme
\     Displaying data banked in the word whose name appears on the 
\     stack when ptr is fired.
      [ 68 is wid, 78e7 seedset ]
      flip IF "Hi! " ELSE "Hello," THEN " I'm" spaced cat (qS) swap 
      (qS ptr) exe?
      IF (hName) this push (qS hName) cat 
         ".  My vitals are: " cat
         "age "      peek (hName) "age"    yank cat cat
         ", weight " peek (hName) "weight" yank cat cat
         ", eyes "   pull (hName) "eyes"   yank cat cat "." cat (qS)
      ELSE (qS) "not in your database." cat (qS)
      THEN (qS) sp wid .out
   end

   inline: say_who (Ptr1 Ptr2 ... PtrN N --- )
\     Displaying list of ptrs from hash_lookup.  For Keys(k) not found 
\     in hash_lookup, Ptrk on stack will have zero rows and word any?
\     will return false.

      this push, revn \ putting Ptr1 on top and PtrN on bottom
      pull 1st nl
      DO any? 
         IF 1st pry Hi!
         ELSE false Hi! "  << HEY YOU, GET OUTA HERE!" dot
         THEN nl
      LOOP
   end

   define: Bill_and_Mary_go_out ( --- ) \ revising some data
      " Bill and Mary go out." nl dot nl

    \ Banking revised data in words Mary and Bill:
      "Mary" "age" 2dup yank 
      " (uh huh, yeah sure Mayree, an' dat kid smokin' at the mall" 
      " didn't yell 'hi ya mom')" cat cat rev bank

      "Bill" "eyes" 2dup yank " and bloodshot" cat rev bank 

      "Bill" "age" 2dup yank " (right Bill, and nothin' ta show" 
      " for it--whatta loser)" cat cat rev bank
   end 

   end of Utilities for Example 3
 
\  -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -

   Example 4
{"
 Running the hash problem posed in:
    The Great Computer Language Shootout (2001): 
       http://www.bagley.org/~doug/shootout/
    Shootout tests were run on a dual 450Mhz Pentium-II server with 
    1GB of RAM and 4x9GB SCSI disks. 
"} nl . nl nl
{
   Here is the pseudo code for this hash problem:
      Input is a positive integer, N
      Step 1:
       loop i = 1 to N
         make a string, k, which is the hexadecimal form if i
         insert (k, i) as a key, value) pair into the hash
      Step 2:
       loop i = N to 1 (step of -1)
         make a string, ki, from the integer i
         test: does the hash contain key ki?  if yes, increment counter
      Finish: print the counter

      For N = 200 the code will print 99
}
   memprobe nl
   time push

   80000 is N
   " For N =" . N .i "," .

   1 N items ints "%X" format (key)
   1 N items integer (value) N "%HASH" hash_make
   %HASH 1 N items reversed integer "%0.0f" format (key1)
   (hHash hKey) hash_lookup pilen rows 

   .i " matched" . nl
   time pull " Elapsed seconds:" . less .i nl

   " Memory delta: " . memprobe

   end of Example 4

\-----------------------------------------------------------------------
 
   Demonstrations

   This demonstration plots the distributions in hashes with various
   numbers of bins.  Also shown is the distribution of the words in
   the hash bins of this program's catalog.

   If the hashing algorithm is working well, the distribution of stings
   in the bins should be fairly uniform.

   This demo is run with the phrase:

      "hash.v" "Hash bin distribution demo" msource

   Hash bin distribution demo

   "sticks" missing IF math.v source THEN

   inline: hashx (hT bins --- hB) \ hash indexed to current index base
\     0-based indices from the word hash are adjusted to index base.
      hash xbase one = IF one +d THEN ;

   inline: random_text (r c --- hT) \ block of random text bytes, r-by-c
      [ "32 127" "char_range" inlinex ]
      char_range three roll three roll brandom (hT)
   end

   inline: showbins (hT n_bins --- ) \ show bins of hash
\     Showing the distribution of text T in n_bins using word hash.

      [ {" \ Computing standard deviation of values in vector C:
           (hC --- d) \ d=sqrt{sum[(Ci-mean)^2]/(n-1)}
           (hC) this mean ontop, those rows push 
           -d dup *by totals ontop pull nit slash sqrt (d)
        "} "deviation" inlinex
      ]
      "Bins" book, these rows "Items" book
      nl " Case of " dot, Bins int$ dot, " bins" dot nl 

      " Distribution of" dot, these chars .i, "-byte items in hash bins"
      dot nl

      (hT) Bins hashx (hNums)
      (hNums) 1 ndx, Bins ndx, Bins bins (hBinCounts) "Counts" book

      Items .i, " items in" dot, Bins .i " bins" dot,
      " (expecting " dot, Items Bins slash rounded 1 max .u, " per bin)"
      dot nl

      Counts these rake bob (hC hE)
      " Empty bins: " . (hE) rows int$ .i nl

      " Counts for " dot, (hC) these rows int$ dot
      " active bins (min, mean, max):" dot
      (hC) dup bend stats 3 1st DO dup I pry rounded .i LOOP nl drop
      (hC) deviation " Std deviation:" . rounded .i nl

      Counts 1st those rows items sticks plot
   end

   " Place cursor on point and click mouse to display x, y location" 
   nl dot nl nl 

\  Running some cases:

   1000 40 random_text,  2 showbins pause nl
   1000 40 random_text, 16 showbins pause nl
   1000 40 random_text, 64 showbins pause nl
   10000 40 random_text, 640 showbins pause nl 

   " Distribution of the words in the catalog:" nl dot nl
   catnames notrailing 4096 showbins

   " plotclose closes the plot window" nl dot

   end of hash bin distribution demo 

\  -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -

   Showing all the catalog names and the hash bins where they are 
   located.

   This demo is run with the phrase:

      "hash.v" "Catalog hash bins" msource

   Catalog hash bins

      catnames dup 4096 hash "%6.0f" format cat eview

   end of Hash bin catalog names demo
 
\-----------------------------------------------------------------------
;  Appendix

   Obsolete versions of words.  Some of these have been replaced by
   C functions in hash.c.

  _inline: hash_bin_fetch (hHash b --- hA) \ key row numbers in bin b
{     Returned A is a list of the hash Keys rows from hash_make that
      occupy bin b.

      Note that incoming bin b is a 0-based index, always less than
      nbins returned by word hash_bins.

      A is purged if bin b is empty.

      Matrix BinsActive in Hash has two columns:
         col 1: sorted bin numbers, 0-based
         col 2: memptr to list of key rows

      Matrix BinsActive serves a purpose similar to linked lists in the
      program's use of hash for matching text to catalog names.

      In BinsActive the memptr in row k, column 2, points to a list of
      hash table Keys rows that are common to the bin number given in
      row k, column 1.
}
      [ defname is N ] that hash? not IF N hash_not return THEN

      this other "nBins" extract nit no within not
      IF " " N ": bin number out of range" cat cat ersys return THEN

      push "BinsActive" extract (hB)
      (hB) this pull (hBinNums b) bsearch (hB r f)
      IF (hB r) reach 2nd pry (memptr) mempeek (hA)
      ELSE 2drop purged \ b is empty: not an active bin
      THEN
   end

  _inline: hash_lookup (hHash hKeys --- hV1 hV2 ... hVn n) \ n key vals
\     For n rows in Keys, look up n sets of corresponding values and
\     place each one on the stack.
\     If row k Keys is not found, Vk has no rows (purged).

      [ defname is N ] that hash? not IF N hash_not return THEN

      (hKeys) no that type MAT = or, that type NUM = or
      IF " " N ": Keys must be type VOL" cat cat ersys return THEN

      (hKeys) hand "Keys" book
      (hHash) "Hash" book

      Keys Hash, hash_bins hash, "Bins" book

      depth push, Keys rows 1st
      DO Hash Bins I pry (hHash b) 2dup
         (hHash b) hash_bin_vals (hVals) rev
         (hHash b) hash_bin_keys (hKeys)
         Keys I quote (hKeys Key.i) grepe (hR)
         (hVals hR) reach notrailing (hVal)
         "_V" I suffix naming (hV.i)
      LOOP
      depth pull less
   end

  _inline: hash_makE (hKeys hValues nbins qHash --- ) \ making word Hash
{     This word creates a hash word matching the name given in string
      Hash.  A list of all hash word names is kept here in HashNames.

      A hash word puts a handle on the stack when it runs:
         Hash ( --- hHash)

      The library of a hash word contains the following:
         Name - the hash word name
         Keys - a volume of keys
         Vals - a volume or matrix of values, one row for each Keys row
         nBins - the number of hash bins
         BinsActive - a two-column matrix needed to access the hash
}
      [ no no blockofblanks "HashNames" book

      \ This is the text to make a hash word.  The word's name goes
      \ into the variable called Name, and whenever the hash word runs,
      \ it places Name on the stack; string Name is considered to be
      \ the hash handle:
           "[ defname 'Name' book ] Name (qHash)" "hashText" book

      \ This is a local word to free the memory at the memptrs listed
      \ in col 2 of matrix BinsActive of a hash word being redefined:
        {" (qName --- )
           this hash? not IF drop return THEN
           this "BinsActive" localref exists?
           IF (qName) "BinsActive" extract (hA) these rows 1st
              DO (hA) this I 2nd fetch (memptr) memfree LOOP (hA)
           THEN (hA or qName) drop
        "} "freeActive" inlinex
      ]
      depth three >, one STR stkok and, two NUM stkok and, not
      IF "hash_make" stknot return THEN

      three pick (hKeys) no that type MAT = or, swap type NUM = or
      IF " hash_make: Keys must be type VOL" ersys return THEN

      three pick (hKeys) rows, three pick (hVals) rows <>
      IF " hash_make: rows of Keys and Vals do not match"
         ersys return
      THEN

      "Name" book, "nBins" book, hand "Vals" book, hand "Keys" book

{     Making the active bins matrix:
         Matrix ActiveBins has two columns
            col 1: sorted list of bin numbers (0-based)
            col 2: memptr to list of key rows
}
       \ Freeing mem addr in the former active bins matrix:
         Name freeActive

       \ Making hash:
         Keys nBins hash (hBins)

       \ Making Tab: column 1 holds hash bin numbers and column 2
       \ holds corresponding row numbers of Keys:
         (hBins) 1st those rows items park (hTab) yes sort (hTab)

       \ Initializing matrix ActiveBins (column 1 holds one of each
       \ different bin number from column 1 of Tab, and column 2 is
       \ null):
         (hTab) this 1st catch again sling1 rake lop (bins)
         these rows one null (bins 0) park "ActiveBins" book

       \ Building column 2 memptr list of matrix ActiveBins:
         (hTab) ActiveBins rows 1st
         DO (hTabRem) this 1st catch (hBins)
            ActiveBins I pry, (binI) those rows one fill
            (hBins hBinI) = (hR)
            (hTab hR) rake (hTabRem hTabI)
            (hTabI) 2nd catch (hRowsKeys)
            (hRowsKeys) memput (memptr)
            (memptr) ActiveBins I 2nd store
         LOOP (hTabEmpty) drop

       \ Making the hash word:
            hashText Name inlinex
            HashNames this Name grepe reach rows any not
            IF HashNames Name pile onto HashNames THEN

       \ Banking items from here into the library of the hash word:
            nBins Name "nBins" bank
            Vals Name "Vals" bank, purged are Vals
            Keys Name "Keys" bank, purged are Keys
            ActiveBins Name "BinsActive" bank, purged are ActiveBins

{     Why ActiveBins here and BinsActive in the hash word?

      Because word hash? looks to see if the word in question has an
      array BinsActive in its local library, which all hash words have.

      If this word also had an array called BinsActive, it could be
      mistaken for a hash word in the phrase: "hash_make" hash?.
}
   end

\-----------------------------------------------------------------------;
   Appendix

   Obsolete, replaced by version without hash:
  _inline: dup_sum (hA --- hA1) \ sum column 2 values at column 1 dups
{     Where values in column 1 of A are duplicates, sum the correspond-
      ing values in column 2.

      Example
         list: 3 3 4 3 5 6 4 10 10 4 ;
         list: 1 1 1 1 1 1 1 1  1  1 ; park dup_sum .m

         Since col 2 values are all 1, the col 2 sums show that there
         are: 
            3 threes, one five, one six, 3 fours, and two tens:
               Row 1:        3        3
               Row 2:        5        1
               Row 3:        6        1
               Row 4:        4        3
               Row 5:       10        2

}     [ \ Using localref makes word HASH, created below by hash_make, 
        \ go into this word's local library instead of into main:
          defname "HASH" localref is hashName 

          "%9.3f" is form, 1000 into bins 
      ]
      this 1st catch
      dup form format into Keys
      nodupes (hA1) dup form format into Kred
      (hA hA1)

      swap (hA) 2nd catch into Vals
      Keys Vals bins hashName hash_make

      hashName Kred hash_lookup dup nit push 1st
      DO peek roll totals (hBI) LOOP 
      pull tic (hB1 hB2 hB2 ... BN N) pilen
      (hA1 hB) park
   
      hashName hash_close
   end

