{  File mfil.v  December 1999 

   Copyright (c) 1999-2012  Dale. R. Williamson

   Working with market files.

   Word data and the words beginning with "y", which process daily
   market data, are some of the earliest words written with tops.  They
   were written in 1999 while the earliest capabilities of tops were
   being developed.  Some were updated in 2008 for real time electronic
   data, but most retain their original form because tops still works
   the same way.

   Words were written in 2008 to process minute by minute real time 
   data (some with names beginning with "rt") and to work with the 
   real time electronic market console.  The electronic market console
   is defined by words in file mobius.n, and they must be present for 
   real time words in this file to work.

   Sometimes this file is sourced without mobius.n, but words in this 
   file that require words of mobius.n will still be present and they
   will cause errors if they are run.  However, there is no reason to 
   run them in such a case, so just let sleeping dogs lie.

   This shows the hierarchy of words in this file as they are said to 
   fetch and process incoming data and produce real time matrix P(t):  

      daysget (hFiles --- hP ht) \ P(t) from all the day Files

         dayget (qFile --- hPurged or hA ht) \ fetch day's market File

            rtget (qFile --- hPurged or hA ht) \ A(t) from File

               HGET (nSStart qMKT qFile --- hD) \ MKT data from hist File

                  hget1 (qFILE --- hD) \ matrix D from history FILE

                  hget_fix (hD qMKT --- hD1) \ apply fixes to O H L C Chg

               rtscrub (hD --- hA ht) \ clean up real time data in D

            (end rtget)
         
            dayget.extend (hA1 ht1 --- hA ht) \ A(t) for an entire session

               rdecimate (hP ht --- hY hx) \ decimate real time data

               dayadd (hP ht --- hP1) \ add columns of data to P

                  function (T) = daymodel(Pt, t) // daily model from rea

         (end dayget)

         daysget.add_model ( --- hP1) \ add future rows and model column

            function (Pt) = daysfill(Pt1, t) // fill cols that use all d

            function () = sigmodel(t) // model signals for signal()

         (hP1) t

      (end daysget)

   Note: functions daymodel(), daysfill() and sigmodel() are in file
   mobius.n.

------------------------------------------------------------------------

   Contents:

   "mfil.v" asciiload this " inline:" grepr reach dot

   inline: rdech (nHOURS --- nSTEPS) \ number of steps for hours
   inline: rdecimate (hP ht --- hY hx) \ decimate real time data

   inline: binview (qFile --- hA) \ view data from collected binary file
   inline: data (hYYY id --- hData qS qU) \ for years, return id's data
   inline: dayadd (hP ht --- hP1) \ add columns of data to P
   inline: dayget (qFile --- hPurged or hA ht) \ fetch day's market File
   inline: daysget (hFiles --- hP ht) \ P(t) from all the day Files
   inline: dV_store (hD1 --- hD) \ returned D contains traffic dV 
   inline: elec_close (qMkt -- nDT) \ close DT sec after session start
   inline: elec_open (qMkt -- nDT) \ open DT sec after session start
   inline: fix_dC (hP nBYTE--- hP1) \ fix signed numbers in P.dC
   inline: fbookX (hA qWord --- ) \ book A out of core, not into lib
   inline: G-SCALE (hA --- hA1) \ add extra scaling for graph
   inline: G-UNSCALE (hA1 --- hA) \ remove extra scaling for graph
   inline: hdchg ( --- f) \ true if a history directory has changed
   inline: HGET (nSStart qMKT qFile --- hD) \ MKT data from history File
   inline: hget_fix (hD qMKT --- hD1) \ apply fixes to O H L C Chg
   inline: hget1 (qFILE --- hD) \ matrix D from history FILE
   inline: hist_fname (hMKT --- qFname) \ name of history files for MKTs
   inline: hupdated ( --- hMKT) \ MKTs that have updated history files
   inline: pit_close (qMkt -- nDT) \ pit closes DT sec after sess start
   inline: pit_indices (nYYYmmdd --- r k) \ indices in once-a-day data
   inline: pit_open (qMkt -- nDT) \ pit opens DT sec after session start
   inline: qfile (yyy q --- qS) \ file name for quarter q in year yyy
   inline: rediff (hx hY hxprev hYprev qS --- ) \ debug price flickering
   inline: rowsto (hBig hLittle --- hRake) \ make vector for work dates
   inline: rtdates (hFiles --- hYYYMMDD) \ real time dates of files
   inline: rtfiles (qMKT nYYYMMDD n --- hFiles) \ real time file names
   inline: rtget (qFile --- hPurged or hA ht) \ A(t) from File
   inline: rtscrub (hD --- hA ht) \ clean up real time data in D
   inline: rtsetup (qMKT nYYYMMDD n --- hFiles) \ get MKT files to use
   inline: time_vec (ht --- ht1) \ bump equal values in sorted vector
   inline: tSESS (tg --- tsess) \ session start time that precedes tg
   inline: ycreate (YYY --- ) \ for YYY, create file for all mKey ids
   inline: ydata (id --- hY) \ data for id
   inline: yget (id YYY --- hY qS qU) \ fetch data written by ycreate
   inline: yload (YYY --- ) \ load quarterly files for year YYY
   inline: yname (n --- qS) \ binary file name for year n, 1900-based
   inline: yput (hFile id --- ) \ putting data for id onto file
 
   Words for animation.
   inline: r ( --- ) REPLAY ; \ shortcut to use at the console % prompt
   inline: REPLAY ( --- ) \ running replay
   inline: replay_end ( --- ) \ restore the system to real time
   inline: replay_hours (n --- ) \ run replay for the last n hours
   inline: replay_init (nt1 --- ) \ initialize for replay at start t1
   inline: replay_reset ( --- ) \ reset real time matrices 
   inline: replay_set_date (tm --- ) \ set dayget.Date for replay
   inline: replay_start_time ( --- t) \ use mouse arrow to get start t
   inline: replay_step (n --- ) \ graph P(t) on step n
   inline: replay_tvec ( --- ht) \ vector of all graph times

   Appendix.
   Run time study.  June 2008.
   Words not used.
   Work weeks per year.

------------------------------------------------------------------------

   Warning:

      Word daysget() gets very slow when there are a lot of real time 
      files in directories epath0, epath1 and epath2 (those are program
      names; directory names are probably /mdat/edat0, /mdat/edat1, 
      /mdat/edat2).  There is an archive subdirectory in each of these
      directories where older files can be moved.  

      With a little work, there is no reason for this to be so.  Each 
      console knows how many days it is graphing, so it knows all the 
      dates and could create a map of just the names of the files it 
      needs.
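
      A minimal Python sketch of such a map, assuming only the file
      name form YYYMMDD_MKT.bin with a 1900-based year.  The function
      names rt_file_name and files_needed are hypothetical and are not
      words of this file:

```python
from datetime import date

def rt_file_name(day, mkt):
    """File name for the session ending on day, like 1091231_GC.bin."""
    yyymmdd = (day.year - 1900) * 10000 + day.month * 100 + day.day
    return "%d_%s.bin" % (yyymmdd, mkt)

def files_needed(days, mkt, present):
    """Keep only the names for the graphed days that are present on disk."""
    wanted = [rt_file_name(d, mkt) for d in days]
    have = set(present)
    return [name for name in wanted if name in have]

# Illustrative inputs: two graphed days and a directory listing.
days = [date(2009, 12, 30), date(2009, 12, 31)]
present = ["1090102_GC.bin", "1091230_GC.bin", "1091231_GC.bin"]
needed = files_needed(days, "GC", present)
```

      With a map like this, the cost depends on the number of days
      graphed rather than on the number of files in the directory.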

   Revisions and additions:
      Wed Feb 29 04:54:15 PST 2012.  Remove volume rate dV to allow
      electronic open interest again.  Volume rate had replaced open
      interest; it can be computed in daymodel() if needed.

      January 2012.  Adjust volume rate, dV, to the average rate over 
      periods of data outage, when volume remains constant and dV is 
      zero, to avoid a large and erroneous spike at the end of the out-
      age.  High dV is a central element in the model (file mobius.n),
      so erroneously high values should be avoided.  This adjustment
      was inspired by a 16 minute outage on 1-18-2011 that polluted
      nearly every market with incorrectly high dV.

      January 2012.  Compute volume rate, dV, in macro dayget.extend().
      Volume rate has units of contracts/minute and replaces the col-
      lected electronic open interest.

      November 2011.  Read collected binary data that has three added
      columns, for volume and open interest (electronic), and pit set-
      tle.

      November 2011.  Compute positions and include them in the funda-
      mental early columns made in macros dayget.extend() and daysget-
      .add_model().

      November 2011.  Eliminate coladd().  Good riddance.

      November 2011.  NADD has been moved to daysget().

      July 2010.  Word daysget() saves tSESS vector of session start
      times, which is used by words tSESS() and replay_init().

      July 2010.  New words for replaying market data on the market
      console; words of mobius.n are required.

      July 2010.  NADD has been moved to coladd(), mfil.v.

      March 2010.  Added variable NADD to mktinit() to set the number
      of future days; NADD is used by daysget(). 

      October 2009.  Second and third extra (future) days have been 
      added.  Search on string "October 2009" in daysget().  Also 
      revised is mktinit() in mobius.n.

------------------------------------------------------------------------

/* Sat Feb 11 14:35:08 PST 2012.  Exploring the effects of smoothing
   dV.  Should the effect of spikes in dV be diminished with smoothing?

   Results: Filtered dV jumps around less, as it must.  But since
   we are only keying on higher peaks, less oscillation versus more
   oscillation below the latest peak is no benefit.  When the next
   higher peak happens, only it is counted and what happened in the
   interim does not matter.  

   Conclusion: Don't do any filtering; use dV as it comes.

   Observation: dV does not just rise to a peak and then drop off 
   toward the end of session.  That was the picture I had, since dV is
   the derivative of volume, and volume is like the ever-increasing 
   logistic curve (S-curve, sigmoid curve), starting off slowly, 
   rising in the middle (when dV peaks), and then flattening off at 
   the end (when dV drops to zero).  The demo in "man logistic" shows
   logistic curves (growth curves) like a volume curve.

   But filtering shows distinct peaks at fairly consistent times (only
   looking at GC), kind of like distinct rush hours, where higher 
   trading occurs and then drops off before rising to the next peaks.

   Peaks (GC) were seen before midsession, around pit open after mid-
   session, around pit close, and near electronic close.
*/
   if(missing("mktdata")) psource("mrtsig.n");

   n = rdech(24);
   MKT = "GC";
   mktlib(MKT); // MKT wholib will show the library for MKT
   t = MKT.tm;

   dV = MKT.P[*, .dV];
   dV1 = ma(dV, rdech(0.5));
   dV2 = ma(dV, rdech(0.25));

   plot([dV1, dV2, dV], t);
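
   The window lengths above come from rdech(), which converts hours to
   3 minute steps.  A minimal Python sketch of that conversion and of a
   moving average follows.  It assumes ma() is a trailing mean; the
   actual definition of ma() is not shown in this file, so that choice
   is an assumption:

```python
SEC = 180  # seconds per step, as booked in word rdech

def rdech(hours, sec=SEC):
    """Number of time steps equivalent to hours."""
    return int(hours * 3600 / sec)

def moving_average(v, n):
    """Trailing mean over the last n points (fewer at the start)."""
    out = []
    total = 0.0
    for i, val in enumerate(v):
        total += val
        if i >= n:
            total -= v[i - n]  # drop the point that left the window
        out.append(total / min(i + 1, n))
    return out

# Windows used above: half an hour and a quarter of an hour.
windows = [rdech(0.5), rdech(0.25)]
```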

------------------------------------------------------------------------

/* Thu Feb  9 11:23:00 PST 2012.  Verifying that volume rate dV inte-
   grates to volume VO.

   Results show a mean error of -1.4% for 40 sessions of GC, and a
   mean plus three sigma error of 12.3%.

   Errors appear to be negatively biased, meaning dV is being under-
   predicted.  This is what data fixups are aimed at doing, to avoid
   over prediction of dV.

   To find where dV is computed in this file (file mfil.v), search for
   this string:
      From decimated Vol in column 5 of A1, compute rate dV
*/

/* These expressions were placed at the top of work.n, and were run
   with word nn. */

   if(missing("mktdata")) psource("mrtsig.n");
   if(missing("compare")) source("mat.v");

   n = rdech(24);
   MKT = "GC";
   mktlib(MKT); // MKT wholib will show the library for MKT
   t = MKT.tg;

// Volume rate computed from collected volume, subject to fixups and
// normalized to contracts per minute:
   dV = foldr(MKT.P[*, .dV], n);
   dV = claw(dV, nullc(integer(dV))); // remove null future columns

// Collected volume:
   V = foldr(MKT.P[*, .VO], n)[*, 1:cols(dV)];

// Proving that integration of dV produces collected volume V:
   V0 = endmost(V, 1)'; // session end collected
   V1 = endmost(partials(dV)*rdech.SEC/60, 1)'; // session end calc

   S = stats1(compare(V1, V0)');
   <<
      "          Mean error % = " S 2 pry "%8.4f" format + .  nl
      "           Min error % = " S 1 pry "%8.4f" format + .  nl
      "           Max error % = " S 3 pry "%8.4f" format + .  nl
      " Mean + 1sigma error % = "
           S 2 pry S 4 pry sqrt 1 * + "%8.4f" format + .  nl
      " Mean + 3sigma error % = "
           S 2 pry S 4 pry sqrt 3 * + "%8.4f" format + .  nl
   >>
/*
Here is the result for 40 sessions collected to the date below (the 
40th session is still open):

Thu Feb  9 12:08:04 PST 2012

[tops@plunger] ready > nn
          Mean error % =  -1.4412
           Min error % = -28.3130
           Max error % =   0.0058
 Mean + 1sigma error % =   3.1248
 Mean + 3sigma error % =  12.2568
*/
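
   The check above can be sketched in Python with made-up numbers.  It
   assumes that compare() returns a percent difference and that sigma
   is a population standard deviation; neither definition is shown in
   this file, and the session data below is hypothetical:

```python
from statistics import mean, pstdev

SEC = 180  # seconds per step, as booked in word rdech

def integrate_rate(rate_per_min, sec=SEC):
    """Sum a contracts/minute rate over uniform steps of sec seconds."""
    return sum(rate_per_min) * sec / 60.0

def percent_error(calc, ref):
    """Assumed form of the comparison: percent difference from ref."""
    return 100.0 * (calc - ref) / ref

# Hypothetical sessions: (rate samples, collected session-end volume).
sessions = [
    ([10.0, 20.0, 30.0], 180.0),  # integrates to 180 contracts
    ([10.0, 20.0, 27.0], 180.0),  # integrates to 171, an under-prediction
]
errors = [percent_error(integrate_rate(r), v) for r, v in sessions]
mean_err = mean(errors)
three_sigma = mean_err + 3 * pstdev(errors)
```

   A negative mean error, as in the results above, means the integrated
   rate falls short of collected volume.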
------------------------------------------------------------------------

   How to fix bad real time data:

      First, be sure the bad data is not from end of day processing
      where a bad pit high or low may have been obtained.  Check file
      /mdat/latest.dat and if it has bad data, fix it in the quarterly 
      file, like /mdat/jul-sep.10.  Copying the electronic high or low 
      to the pit high or low is probably ok to do.  Then exit the elec-
      tronic console window and restart it.  On start up, the changed 
      quarterly file will be sensed and newday() will be rerun to make 
      a new bin file, like mdata10.bin.

      The procedure described below adds lines to word hget_fix() to 
      make changes to the data after it has been read from disk, for 
      specific markets and dates that are bad.  Note that it does not 
      update the disk--the bad data remains on the disk--so the changes
      to word hget_fix() must remain.

      This shows the hierarchy of words in this file to fetch and pre-
      process incoming data to produce real time matrix A(t):  

         rtget (qFile --- hPurged or hA ht) \ A(t) from File

            HGET (nSStart qMKT qFile --- hD) \ MKT data from hist File

               hget1 (qFILE --- hD) \ matrix D from history FILE

               hget_fix (hD qMKT --- hD1) \ apply fixes to O H L C Chg

            rtscrub (hD --- hA ht) \ clean up real time data in D

      Word hget_fix contains lines to correct gross errors in files for
      specific markets and specific sessions before the data reaches 
      word rtscrub.  Word rtscrub clips latest C to keep it within high
      H and low L.
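
      The clipping can be pictured with a short Python sketch.  The
      body of rtscrub is not shown here, so the function clip_last is
      hypothetical and only illustrates keeping C within L and H:

```python
def clip_last(c, high, low):
    """Keep the latest price c within the session low and high."""
    return max(low, min(high, c))

# A price above the high is pulled down to the high, and one below the
# low is raised to the low; a price in range is unchanged.
clipped = [clip_last(c, 11040, 10931) for c in (11100, 11030, 10)]
```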

      NOTE, November 2011.  New word binview() in this file lets you 
      view a binary data file, like "1091231_GC.bin" shown in the de-
      tailed example below, with a simple phrase at the ready prompt:

         [tops@plunger] ready > epath0 "1091231_GC.bin" + binview (hA)

      Here is an example of fixing bad data for GC for the session from
      Dec 30 to Dec 31, 2009.  

      The file name contains the date when the session ends, which might
      be tomorrow.  A surefire way to check is to look at the latest 
      file name in /mdat/edat; this shows 1091231_GC.bin:
         [dale@plunger] /home/dale > cd /mdat/edat
         [dale@plunger] /mdat/edat > ll *GC*
         -rw-rw-rw-  1 dale   comm      6648 Jan  2  2009 1090102_GC.bin
         -rw-rw-rw-  1 dale   comm      7536 Jan  5  2009 1090105_GC.bin
         ...
         -rw-rw-rw-  1 dale   comm     14448 Dec 30 14:40 1091230_GC.bin
         -rw-rw-rw-  1 dale   comm      9624 Dec 31 06:24 1091231_GC.bin
         [dale@plunger] /mdat/edat > 

         Insert this temporary phrase in word hget_fix where processing
         of FILE is done:

            (hD) FILE "1091231_GC.bin" = IF "HALTING" . nl HALT THEN
      
         Run tops for GC as the electronic market console would, and 
         when it halts run iview to examine the top of stack matrix 
         that contains the bad data: 

            [dale@plunger] /home/dale > tops
                     Tops 3.1.0
            Thu Dec 31 05:55:41 PST 2009
            [tops@plunger] ready > 'mobius.n' psource gc
            GC real time
            Analyzing the last 40 days ...
            HALTING

             stack elements:
                   0 matrix: _hget1  638 by 6
             [1] ok!
            [tops@plunger] ready > iview

         This shows the bad data values of 10 for O, H, L, C and Chg:
                      O        H        L        C     Chg Time
            Row 394:  10941    11040    10931    11030 105 1262247120
            Row 395:  10941    11040    10931    11030 105 1262247120
            Row 396:     10       10       10       10  10 1262247240
            Row 397:     10       10       10       10  10 1262247240
            Row 398:     10       10       10       10  10 1262247240
            Row 399:  10941    11040    10931    11024  99 1262247300
            Row 400:  10941    11040    10931    11025 100 1262247360
 
         This shows the lines added to hget_fix to fix this problem; 
         they are similar to other lines already in hget_fix, and in
         this case simply discard the times of bad data: 
            FILE "1091231_GC.bin" =
            IF "D" book
             \ Out-of-bounds price values are 10.
             \ Rake out anything less than 1000:
               D 1st catch 1000 < \ O
               D 2nd catch 1000 < \ H
               D 3rd catch 1000 < \ L
               D 4th catch 1000 < \ C
               or or or
               D swap rake drop (hD1)
               purged "D" book return
            THEN
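
         For readers unfamiliar with catch and rake, the effect of the
         fixup can be sketched in Python on the rows shown earlier.
         This is an illustration only; rake_bad_rows is a hypothetical
         name and the tops phrase above remains the actual fix:

```python
def rake_bad_rows(rows, floor=1000):
    """Drop rows where any of O, H, L, C is below floor."""
    return [r for r in rows if all(p >= floor for p in r[:4])]

# Rows 394 to 400 from the iview listing: O, H, L, C, Chg, Time.
rows = [
    (10941, 11040, 10931, 11030, 105, 1262247120),
    (10941, 11040, 10931, 11030, 105, 1262247120),
    (10, 10, 10, 10, 10, 1262247240),
    (10, 10, 10, 10, 10, 1262247240),
    (10, 10, 10, 10, 10, 1262247240),
    (10941, 11040, 10931, 11024, 99, 1262247300),
    (10941, 11040, 10931, 11025, 100, 1262247360),
]
kept = rake_bad_rows(rows)
```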

         Save updated mfil.v (without the temporary debug phrase to 
         halt and view data, but with the fixup phrase like the one 
         added above), and in the running window for GC, source mfil.v 
         to make the change take effect: 
            GCG10 11076 10931 11039 114 08:03 CST Thu Dec 31, 2009 (06:1
            GCG10 11076 10931 11041 116 08:03 CST Thu Dec 31, 2009 (06:1
            Automatic update is off; enter ... to reconnect and resume 
            % 'mfil.v' source
             word data.sizeof,0:CODE__ is redefined
             word data.date,0:CODE__ is redefined
             ...
             word yload is redefined
             word yname is redefined
             word yput is redefined
            % ...
            Automatic update is on; press Esc to disconnect and return t
            GCG10 11076 10931 11040 115 08:09 CST Thu Dec 31, 2009 (06:1

         The bad data should be gone in the new graph displayed if it
         is the current session.  Otherwise, it may be necessary to
         exit and restart the console.

------------------------------------------------------------------------

   These notes were made when words for replay animation were written 
   in July 2010.

   Why it takes so long to do something.

   Writing the words for replay animation started on Tuesday and 
   animation was working by Saturday.  A debugged version for pro-
   duction was ready by Sunday.  But it really wasn't ready, and most
   of Monday was spent debugging.  So we're coming up on a week for 
   this effort, which was supposed to take two or three days.

   But not evident in these words is the reworking of words like day-
   get(), daysget(), pGRID2() and new words like KB_LOCKED(), proc-
   ess_key() and rchop() that allow the system to work the same as be-
   fore and also make these new words (especially replay_step()) much
   easier to write.  

   Effort reworking is often hidden when simply viewing the products of
   a finished task, and accounts for much of the time by which we over-
   run our estimates of how long a task will take.  Coupled with the
   added surprises during debugging that gobble up a lot more time, it 
   is no wonder that work like this takes a lot longer than estimated.

------------------------------------------------------------------------
}
   "ydays" missing IF cal.v source THEN

   "mpath" missing IF " Require path definition: mpath" . halt THEN

   "mKey" missing "soonest_end" missing or
   IF " mfil.v: require mrc.v to be sourced first" . nl halt THEN
{
   After sourcing mfil.v, use this phrase to remake all mdataXX.bin 
   files: 

      allyears these rows 1st DO this I pry ycreate LOOP drop

   Remaking a single year: years are 1900-based.  To remake the
   binary file for 2001 (remaking mdata01.bin) run:

      "mrc.v" source
      101 ycreate # 2001

   For this example of 2001, ycreate will use these ascii files: 
      jan-mar.01, apr-jun.01, jul-sep.01, oct-dec.01
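
   The naming can be sketched in Python as follows, assuming the two
   digit suffix is the 1900-based year modulo 100, as the examples
   mdata01.bin and jul-sep.10 suggest.  The function year_files is
   hypothetical and is not a word of this file:

```python
QUARTERS = ["jan-mar", "apr-jun", "jul-sep", "oct-dec"]

def year_files(yyy):
    """Names used for 1900-based year yyy, like 101 for 2001."""
    yy = "%02d" % (yyy % 100)
    binary = "mdata" + yy + ".bin"
    ascii_files = [q + "." + yy for q in QUARTERS]
    return binary, ascii_files

binary, ascii_files = year_files(101)
```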
}
   1based private

\-----------------------------------------------------------------------

\  Structure of matrix columns returned by word data() (revised January
\  2008 for electronic market quotes):
   "data" list:
      "date"                          \ YYYMMDD

    \ These 12 items are read in yget():

      "delmo"                         \ del month

      "Chg"                           \ change (used when roll del mo)

      "Open"  "High"  "Low"  "Close"  \ scaled pit quotes for del mo
      "eOpen" "eHigh" "eLow" "eClose" \ scaled elec quotes for del mo

      "vol" "int" (added May 2011)    \ volume and open interest

    \ These perpetual quotes and rollover are appended in word data():
      "open"  "high"  "low"  "close"  \ perpetual pit for del mo
      "eopen" "ehigh" "elow" "eclose" \ perpetual electronic for del mo
      "roll"                          \ rollover add on

   end struct

\-----------------------------------------------------------------------

\  Structure of ascii data files read by yload() (revised January 2008):
   "mdata" "kom date open high low close chg eopen ehigh elow eclose "

\  Pit volume and open interest added Sat May 21 15:57:34 PDT 2011
\  for January 2011 and later:
   "vol int" + 

    struct

\-----------------------------------------------------------------------

\  This is the former ascii mdata structure of 9 columns, replaced in 
\  January 2008 by the one above; it is still used for older files:
\     "mdata" "kom date open high low close vol int chg" struct
   9 "old_mdata.sizeof" book \ reqd to read ascii files before Jan 2008

\-----------------------------------------------------------------------

\  Structure of the columns in matrix P of word daysget.
\  The struct has been moved to file mobius.n.  Look for the following:
\     "" (no name) 
\     list: \ columns of daysget.P

\-----------------------------------------------------------------------

\  Note: To work with real time data, some of the following words 
\  require electronic console words from mobius.n to be present.

\ Lines "#def rdech" and "#end rdech" surround this word so it can be
\ sourced by any file using word msource1.

#def rdech
   inline: rdech (nHOURS --- nSTEPS) \ number of steps for hours
    \ Return the number of time STEPS equivalent to incoming HOURS.
      [ 180 "SEC" book ] \ 3 minute step size (sample rate)
      3600 * SEC /
   end
#end rdech

   inline: rdecimate (hP ht --- hY hx) \ decimate real time data
{     Decimate real time data to a uniform time step.  Time points P
      are moved to the nearest-below time in uniform sequence x (made
      below), so that P(t) is transformed into Y(x).

      Incoming t is positive, single-valued and in ascending order for 
      word look.

      If more than one point in P(t) falls at the same t time, the last
      one survives in Y(x) (which is the last one collected, but not
      necessarily the latest since there are several sources).

      Decimation means that some key data can be missed.  This shows 
      how decimation missed the low 13511 at row 620:

         Incoming data in the six columns of P is as follows (note that
         machine times contain a small fraction not shown, making them 
         unequal: the time at row 619 is greater than the one at row
         618):
                     Open     High     Low      Last  Chg  Machine time
         Row 617:    13581    13743    13376    13564  32  1216308300
         Row 618:    13581    13743    13376    13569  37  1216308360
         Row 619:    13581    13743    13376    13569  37  1216308360
         Row 620:    13581    13743    13376    13511 -21  1216308480
         Row 621:    13581    13743    13376    13561  29  1216308540
         Row 622:    13581    13743    13376    13621  89  1216308660
         Row 623:    13581    13743    13376    13603  71  1216308720
         Row 624:    13581    13743    13376    13578  46  1216308840

         These are machine times in column 6 converted to local time
         (again, the fractional second is not shown):
            Row Local time
            617 Thu Jul 17 08:25:00 PDT 2008
            618 Thu Jul 17 08:26:00 PDT 2008
            619 Thu Jul 17 08:26:00 PDT 2008
            620 Thu Jul 17 08:28:00 PDT 2008
            621 Thu Jul 17 08:29:00 PDT 2008
            622 Thu Jul 17 08:31:00 PDT 2008
            623 Thu Jul 17 08:32:00 PDT 2008
            624 Thu Jul 17 08:34:00 PDT 2008

         These are decimation times at a 3 minute step defined by word
         rdecimate:
            Thu Jul 17 08:27:00 PDT 2008
            Thu Jul 17 08:30:00 PDT 2008
            Thu Jul 17 08:33:00 PDT 2008

         For the decimation times, these Last values from column 4 at 
         nearest-below times above will be returned:
            Decimation time                 Last
            Thu Jul 17 08:27:00 PDT 2008    13569 (from row 619)
            Thu Jul 17 08:30:00 PDT 2008    13561 (from row 621)
            Thu Jul 17 08:33:00 PDT 2008    13603 (from row 623)

         The low of 13511 at row 620 is not among them, and will not
         become part of the time record.
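
      The worked example can be reproduced with a minimal Python
      sketch of nearest-below lookup on a uniform grid.  It is not the
      implementation of word look; clamping grid points that precede
      the first sample to the first row is an assumption made here:

```python
from bisect import bisect_right

SEC = 180  # 3 minute step, as booked in word rdech

def decimate(times, values, sec=SEC):
    """Return (x, y): y[i] is the value at the latest time <= x[i]."""
    t1 = (times[0] // sec) * sec      # truncate first time to the grid
    tmax = (times[-1] // sec) * sec   # truncate last time to the grid
    x = list(range(t1, tmax + 1, sec))
    y = []
    for xi in x:
        k = bisect_right(times, xi) - 1  # last sample at or below xi
        y.append(values[max(k, 0)])      # assumption: clamp at first row
    return x, y

# Machine times and Last values for rows 617 to 624 above.
times = [1216308300, 1216308360, 1216308360, 1216308480,
         1216308540, 1216308660, 1216308720, 1216308840]
last = [13564, 13569, 13569, 13511, 13561, 13621, 13603, 13578]
x, y = decimate(times, last)
```

      The last three entries of y correspond to 08:27, 08:30 and 08:33,
      and the low of 13511 does not appear in y.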
}
      [ "rdech" "SEC" yank "SEC" book ] \ step size
{
      Uniform times, x, cover the range t(1) to t(1) + DT, every SEC
      seconds, where DT = t(max) - t(1).  To make x, times t(1) and
      t(max) are truncated to integers divisible by SEC so the same 
      uniform time sequence is always obtained.

      "Flickering," where a previously graphed point appears to jump 
      higher or lower, may be seen when points in successive updates 
      are split up (binned) differently between neighboring times that 
      differ slightly from one update to the next.

      Always using the same time sequence avoids this type of flicker-
      ing.

      Graphed points may still appear to jump higher or lower on suc-
      cessive updates, but that is believed to be an inherent graphing
      problem due to plotting a greater number of points (from one up-
      date to the next) within the same number of screen pixels.  

      [July 2008: the reason above, a shoot-from-the-hip guess, is just
      plain wrong; the phenomenon arises from the sorting algorithm as 
      explained and fixed in word hget1.] 

      The commented-out portions of this word, and word rediff, were 
      used to isolate a sorting phenomenon that produced different 
      previous prices when a graph was updated with later prices; see 
      word hget1.
}
{
      [ 0 "t" book 0 "P" book 0 "x" book 0 "Y" book 0 "SStart" book ]

      "rtget" "SStart" yank dup SStart =
      IF drop yes ELSE "SStart" book no THEN "check" book

      check 
      IF t "tprev" book P "Pprev" book
         dup "t" book over 4 catch "P" book
         t rows 1 > 
         IF t P tprev Pprev " input" rediff THEN
      THEN
}
      (hP ht) dup (ht) 1st pry (t1)   
      (t1) SEC / integer SEC * 0 max (t1)

      (hP ht t1) over (ht) 1 endmost @ (tmax)
      (tmax) SEC / integer SEC * (tmax)

      (hP ht t1 t2) over (t2 t1) - (DT) 
      (hP ht t1 DT) SEC / 1+ (n)                  \ number of points
      (t1 n) SEC swap uniform (t1 hN) + (hx) push \ uniform points x

    \ Word look places P(t) at nearest-below uniform time x, to make
    \ Y(x):
      (hP ht) swap park (hXY) peek (hx) over cols 1- clone look (hY)
      (hY) pull (hx)
{
      check 
      IF x "xprev" book Y "Yprev" book
         dup "x" book over 4 catch "Y" book
         t rows 1 > 
         IF x Y xprev Yprev " output" rediff THEN
      THEN
}
   end

\-----------------------------------------------------------------------

   inline: binview (qFile --- hA) \ view data from collected binary file
{     Tue Nov 29 17:27:56 PST 2011

      This is a utility to provide a view of the data in a market .bin
      file, and is useful for finding erroneous rows in the data.

      The market .bin files are written by word hist_add() in file
      mget.v, and read by word hget1() in file mfil.v.

      Incoming File name includes path, date and market, like
         /mdat/edat0/1110912_GC.bin

      Notes:
         Before November 30, 2011 files have six columns of data:
            Open, High, Low, Last, Change, GMT 

         November 30, 2011 and after, files have nine columns of data:
            Open, High, Low, Last, Change, Vol, OpInt, Settle, GMT 

         After December 7, 2011 the files have a 4-byte trailer that
            gives the size of data to read.  This was added so that the
            file does not have to be deleted before the new one is writ-
            ten in hist_add() (file mget.v).

            Deleting the file just before the new one was written oc-
            casionally caused the electronic consoles to not find the
            file when they were updating at the same time. 

            It is believed that the problem was not seen before because
            three directories of data from collectors were used
            (epath0, epath1, epath2) and there was negligible chance
            that files would be missing in all three locations at the
            same time.

            Presently, only one directory of files (epath0) is being
            used because only it contains real time volume data.  When
            it is missing there are no others and the electronic
            console fails.

      Examples:
         Run this for six column data:
            epath0 "1110912_GC.bin" + binview (hA) iview
         Run this for nine column data:
            epath0 "1111130_GC.bin" + binview (hA) iview
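
      For reading these files outside tops, here is a minimal Python
      sketch of the layout rules.  It follows the date comparisons in
      the word body below, and it assumes 4-byte little-endian integers
      stored column by column.  The word body uses PDP_ENDIAN import4,
      whose exact byte order is not reproduced here, so treat the byte
      order as an assumption.  The names layout_for and parse_bin are
      hypothetical:

```python
import os
import struct

def layout_for(fname):
    """Return (ncols, has_trailer) from a name like 1111130_GC.bin."""
    ymd = int(os.path.basename(fname).split("_")[0])
    ncols = 6 if ymd < 1111130 else 9
    return ncols, ymd > 1111207

def parse_bin(fname, raw):
    """Return a list of columns from the raw bytes of a market file."""
    ncols, has_trailer = layout_for(fname)
    if has_trailer:
        (size,) = struct.unpack("<i", raw[-4:])  # bytes of data to read
        raw = raw[:size]
    count = len(raw) // 4
    vals = struct.unpack("<%di" % count, raw[:4 * count])
    nrows = count // ncols
    # Assumption: values are stored one column after another.
    return [list(vals[c * nrows:(c + 1) * nrows]) for c in range(ncols)]
```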
}
      [ no "BIN" book ]

      BIN filetrue IF BIN fclose THEN

      "FILE" book

      FILE file?
      IF FILE (qFile) old binary "BIN" file

       \ After November 29, 2011, files have 9 columns instead of 6:
         FILE -path "_" chblank 1st string drop number drop "M" book
         M 1111130 < IF 6 ELSE 9 THEN "Dcols" book

       \ After December 7, 2011, files have a 4-byte trailer:
         M 1111207 > \ newer type with trailer?
         IF \ fetch size from trailer:
            BIN dup dup fsize 4 - fseek
            (nBIN) 4 fget PDP_ENDIAN import4 @ (nSize)

            (nSize) BIN rewind BIN swap (nSize) fget (hT)
         ELSE \ trust fsize to give the proper size:
            BIN dup fsize fget (hT)
         THEN

         BIN fclose (hT)
         (hT) PDP_ENDIAN import4 (hA)

         (hA) dup rows Dcols / matrix (hA) \ a matrix with Dcols columns

      ELSE
         " binview: " . FILE . " not found" . nl VOL tpurged (hA)
      THEN
   end

   inline: data (hYYY id --- hData qS qU) \ for years, return id's data
{     Incoming list YYY contains 1900-based years like 99, 100, 101.

      Quotes returned in hData are scaled.  Returned strings qS and qU 
      are the names of functions used to scale and unscale the quotes 
      for this id.

      In addition to scaled original quotes, returned data contains 
      adjusted quotes for a perpetual series that accounts for roll-
      over of delivery month.  These are columns open, high, low, 
      close and eopen, ehigh, elow, eclose in the struct called data
      in mrc.v.

      Update Sat May 21 17:57:41 PDT 2011.  Columns vol and int in
      returned data contain pit volume and open interest.
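
      How word rolldelta computes the rollover add on is not shown
      here.  The Python sketch below is one common way to build such a
      series and is an assumption, not a description of rolldelta: on
      the day the delivery month changes, the gap between the new
      month's prior close (C - Chg) and the old month's last close is
      cancelled by a cumulative add on.  The function name perpetual
      and the data are hypothetical:

```python
def perpetual(delmo, close, chg):
    """Return (roll, perp): add on per day and adjusted closes."""
    roll = []
    add_on = 0
    for i in range(len(close)):
        if i > 0 and delmo[i] != delmo[i - 1]:
            prev_close_new = close[i] - chg[i]       # new month's prior close
            add_on -= prev_close_new - close[i - 1]  # cancel the roll gap
        roll.append(add_on)
    perp = [c + r for c, r in zip(close, roll)]
    return roll, perp

# Delivery month rolls from 3 to 6 on the third day.
roll, perp = perpetual([3, 3, 6, 6], [100, 102, 110, 111], [1, 2, 3, 1])
```

      In this sketch the day to day differences of perp equal Chg, so
      the roll does not show up as a price jump.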
}
      [ list: data.Open  data.High  data.Low  data.Close 
              data.eOpen data.eHigh data.eLow data.eClose 
        end makes Quotes ]

      true one NUM stkok and two MAT stkok three NUM stkok or and not
      IF "data" stknot nl return THEN

      "id" book, hand yearfix "Y" book

    \ Gather dates for the years listed in Y:
      no one null, Y rows 1st
      DO Y I pry dup ydays swap 10000 * +d pile LOOP (hDays)

      "word data loop" ERRset 
      depth push
      Y rows 1st
      DO 
         "data calling yget for I=" I intstr + ERRset
         id Y I pry yget "U" book, "S" book 

       \ The 12 columns of Dat on the stack returned by yget contain: 
       \    delmo Chg O H L C eO eH eL eC V I

         (hDat) these rows Y I pry weekdays <> \ partial year?
         IF Y I pry weekdays those rows less, 
            those cows null swap pile 
         THEN 
         ERR
      LOOP (hDat)
      depth pull less pilen (hDat)
      (hDays hDat) park (hData)
      ERR

    \ The matrix now on the stack contains these columns matching 
    \ the order of the first 13 columns specified by the struct 
    \ called data:
    \    date delmo Chg O H L C eO eH eL eC V I

    \ Use days when close, C, is zero to rake out inactive ones:
      (hData) this data.Close catch rake lop (hData)

      (hData) any?
      IF this again push

       \ Making perpetual quotes:
         (hData) Quotes catch (hOHLC)
         peek data.delmo catch (hDelmo)
         peek data.Close catch (hC)
         pull data.Chg catch (hChg) rolldelta (hDel) dup push

         (hOHLC hDel) those cows clone plus (hPerp)
         (hData hPerp) park (hData) \ append cols of perpetual data
         pull (hDel) park \ last column is roll delta

      ELSE no data.sizeof null (hData) \ no data
      THEN
      (hData) S U

   end

   inline: dayadd (hP ht --- hP1) \ add columns of data to P
      "dayadd" ERRset

      (ht) push (hP) 

\     Adding this day's columns from model function:
      (hP) dup peek daymodel (hT) 

      (hP hT) park

      pull (ht) drop (hP1)

      ERR
   end

   inline: dayget (qFile --- hPurged or hA ht) \ fetch day's market File
{     Fetch the day's time history from electronic market File name.

      Returned A contains price columns in the order of the struct given
      at the top of this file.  Prices in A are perpetual.

      Returned t is the time in seconds since session start, and
      encompasses 24 hours.

      Day is a misnomer.  The data is really for a session, which lasts
      up to 24 hours and spans from one day to the next; see notes in
      the Appendix of file mobius.n that describe what a session is.

      This is the form of File name: YYYMMDD_MKT.bin.  Date, YYYMMDD,
      and market name, MKT, are part of File name.  No path accompanies
      incoming File name, and history files at epath0, epath1 and epath2
      are read and data combined and sifted for only unique records.

      Date YYYMMDD is the day the session ended, the last time the file
      was written upon.  The session started on the previous day.

      Some markets, like CT and SB, open 8.5 hours after session start,
      at 01:30 AM Central.  Before a market opens, ending values of the
      previous session are contained in A.
}
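{     Illustration only (Python, not tops).  A minimal sketch of how a
      File name of the form YYYMMDD_MKT.bin might be split into its
      date and market parts.  It assumes YYY is a 1900-based year (as
      described for word data) and that the date is always 7 digits.
      The function name parse_day_file and the sample file name
      1120118_CT.bin are hypothetical, not part of this file.

```python
import datetime
import os

def parse_day_file(name):
    """Split 'YYYMMDD_MKT.bin' into (session end date, market name)."""
    base = os.path.basename(name)          # drop any path
    stem, ext = os.path.splitext(base)
    if ext != ".bin":
        raise ValueError("expected a .bin file: " + name)
    stamp, market = stem.split("_", 1)
    if len(stamp) != 7 or not stamp.isdigit():
        raise ValueError("expected a 7-digit YYYMMDD date: " + name)
    year = 1900 + int(stamp[:3])           # assumed 1900-based year
    month = int(stamp[3:5])
    day = int(stamp[5:7])
    return datetime.date(year, month, day), market

end_date, market = parse_day_file("1120118_CT.bin")
# The session is taken to start on the previous calendar day.
start_date = end_date - datetime.timedelta(days=1)
```

      Under these assumptions the date in the name is the day the
      session ended, and the session start date is one day earlier.
}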
      [ 0 "timenow" book \ latest time in data from rtget()

      \ After 11-29-2011:
      \    list: 2 3 4 5 6 7 8 ; \ H, L, C, Chg, Vol, OpInt, Settle
      \ After 01-30-2012:
           list: 2 3 4 5 6 7 8 ; \ H, L, C, Chg, Vol, dV, Settle
           "RTnew" book

      \ Before 11-29-2011:
           list: 2 3 4 5 ;       \ H, L, C, Chg 
           "RTold" book
      ]
      "dayget" ERRset

      (qFile) -path "FILE" book

      FILE rtget (hA ht) any?
      IF "daysget" "LIB" yank "LIB" book
         "rtget" "Date" yank "Date" book

         (hA ht) swap (hA) 

         "hget1" "Dcols" yank 6 = \ old format, 6 columns? 
         IF (hA) RTold (hA hR) catch (hA) 
          \ Null Vol, OpInt and Settle for earlier files:
            dup rows 3 null park (hA) 
         ELSE (hA) RTnew catch (hA)
         THEN

         (ht hA) swap (hA ht)

         (ht) dup 1 endmost @ "timenow" book
{
         June 2009: return a full session of data, with constant pre-
         vious session data from session start 0 to market open, and 
         market end data from market close to end of session (86400 
         sec).

         June 2009: run rdecimate here instead of in rtget(), and always
         return a uniform set of points from 0 to 86400 - SEC.  

         Example: if SEC = 180 (data every 3 minutes), then 480 points
         from 0 to 86220 are always returned.  The very next point, at
         86400, is the 0 point for the next session.
 
   Thu Jul 15 06:26:33 PDT 2010.  Make the following region into a macro
   so it can be run from outside; word replay_step will run the macro
   on each step during a replay.

   Example:
      [tops@plunger] ready > list: 100, 200 ; (A1) list: 360, 720 ; (t1)

       stack elements:
             0 matrix: _list  2 by 1
             1 matrix: _list  2 by 1
       [2] ok!
      [tops@plunger] ready > (hA1 ht1) 'dayget' 'extend' yank (hA ht)

       stack elements:
             0 matrix: _plusd  480 by 1
             1 matrix: _lerp  480 by 1
       [2] ok!
      [tops@plunger] ready > (hA ht) park iview

      This shows the two elements created in the lists above are 
      expanded by this macro in steps of 180 seconds to fill a 
      session from t=0 to t=86220:
                      A        t
         Row 1:      100        0
         Row 2:      100      180
         Row 3:      100      360
         Row 4:      100      540
         Row 5:      200      720
         Row 6:      200      900
         Row 7:      200     1080
         ...
       Row 475:      200    85320
       Row 476:      200    85500
       Row 477:      200    85680
       Row 478:      200    85860
       Row 479:      200    86040
       Row 480:      200    86220
}
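{
   Illustration only (Python, not tops).  A rough sketch of the idea
   behind the example above: fill a whole session on a uniform grid by
   taking, at each step, the nearest sample at or below that time, and
   the first sample for steps before any data exists.  The function
   name extend_session is hypothetical, the 180 second default is an
   assumption, and this does not reproduce words rdecimate or look.

```python
import bisect

def extend_session(a, t, sec=180, session=86400):
    """Fill a session from 0 to session - sec using nearest-below values.

    a and t are equal-length lists, with t ascending in seconds since
    session start.  Steps before the first sample take the first value.
    """
    grid = list(range(0, session, sec))
    out = []
    for x in grid:
        i = bisect.bisect_right(t, x) - 1   # last sample at or before x
        out.append(a[max(i, 0)])            # before first sample: first row
    return out, grid

A, T = extend_session([100, 200], [360, 720])
```

   With the two-point input of the example, the sketch gives 480 rows,
   holding 100 through t=540 and 200 from t=720 onward.
}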
[
  {" extend (hA1 ht1 --- hA ht) \ make A and t for an entire session

       \ These phrases have been made into macro "extend" so they can
       \ be run by word replay_step.

       \ Extend A and t both directions, from first time in t back to 
       \ session start (where t = 0) and from last time in t ahead to 
       \ the step just before the next session begins (where t = 86400).

       \ If the first time in t is not zero, values from 0 to the first
       \ time are taken to match those in the first row in A:
         (ht) dup 1st pry 0> 
         IF (ht) 0 swap pile (hA ht)
            (hA ht) over 1st (row of A) reach (hA ht Ainit) rot pile
         ELSE (hA ht) swap
         THEN (ht hA)
         (hA) dup 1 endmost pile swap (hA ht)
         (ht) 86400 "rdech" "SEC" yank - (tmax)
         (ht tmax) pile (ht)
{
         Tue Jan 31 05:48:54 PST 2012.  Decimation to 3 minute steps
         is done below in rdecimate().

         Originally, data was being sampled every 7 minutes, or so.  

         Now that the sample rate is every 1 minute (with CME E-quotes 
         subscription), decimation to 3 minute steps is kind of strange.

         Upon some study and reflection, this decimation to a larger
         step is believed to be ok.

         Three minute steps from rdecimate() do not interpolate sampled
         data, but instead take the nearest-below value in time using 
         word look().  This is adequate for prices and volume being 
         followed, and always presenting original data values rather
         than interpolated values that may have never occurred is good.

         These are the seven columns of A at this point.  
            1  2  3  4    5    6      7
            H, L, C, Chg, Vol, OpInt, Settle
}
\ Use something like the following to halt while testing in this region:
\ "HGET" "DATE" yank 1120118 = IF "HALTING" . nl HALT THEN

       \ Decimate real time data to uniform time steps from 0 (session
       \ start) to the step just before the next session (86400 minus
       \ SEC):
         (hA ht) rdecimate (hA1 ht1) \ uniform time steps

{ --- Feb 29 04:54:15 PST 2012.  Skip this override of OpInt with dV.

      Sat Mar  3 09:04:00 PST 2012.  These expressions are run in word
      dV_make() by word daysmodel().  Both words are in file mobius.n.

{        In the work below, column 6 (OpInt) will be replaced by traf-
         fic dV computed from decimated Vol.  Since original Vol num-
         bers come out of decimation, dV represents the difference be-
         tween actual volumes from the exchange.  The only question is
         how near to actual sample times are the decimated times, so 
         that rates (contracts per decimated step) are consistent.
}
         "dayget volume rate dV" ERRset

       \ From decimated Vol in column 5 of A1, compute rate dV and place
       \ it in column 6 of A1 (overwriting OpInt):
         (hA1 ht1) over 5 catch (hV) delta 0 max (hdV)

       \ Jan 18 10:22:14 PST 2012.  Create divisor D for dV/D to account
       \ for data outages when V remains constant for a number of steps
       \ and dV is zero.  This gives dV that is the average over the
       \ period of the outage rather than a big spike on the step when
       \ the outage ends:
         (hdV) 1st over rows items over 0> looking delta 1 max (hD)
         (hdV hD) /by (hdV) \ get average rate when divide by D>1

       \ Tue Jan 31 09:37:06 PST 2012.  Normalize rate dV to contracts
       \ per minute and round to the nearest integer:
         (hdV) "rdecimate" "SEC" yank 60 / (hdV n) / (hdV)
         (hdV) 0.5 + integer (hdV)
{
         Tue Feb  7 20:30:26 PST 2012.  Ensure no duplicates in dV to
         avoid a sorting discrepancy when values are equal.  In a real
         time system, the discrepancy arises if a new term matches an 
         earlier one, and the new one is picked, then after more rows
         arrive the old one is picked instead, and so on.  The choice
         of which equal value to take can flip-flop depending upon the 
         outcome of the sort--which rows are swapped--when there are 
         duplicates and rows are added all the time.  See discussion
         in word hget1(), "July 2008" (search for "As the length of 
         the time vector grows.")
}        (hdV) nonesame (hdV) \ adds small number so none is the same

       \ Store dV into column 6 of A1, replacing OpInt:
         (hA1 ht1 hdV) 6 3 pick (hdV 6 hA1) cram (hA1 ht1)

         ERR \ end "dayget volume rate dV"
--- }

         (hA ht) push
{
         Thu Feb  9 20:36:41 PST 2012.  Reinstate positions for 3 ses-
         sions.  Make up your mind.

         Sat Feb  4 11:24:29 PST 2012.  Remove positions for 3, 4 and
         5 sessions.

         Sun Jan 29 11:20:28 PST 2012.  Add positions for 5 sessions
         ago, .P5.

         Wed Jan 25 03:10:39 PST 2012.  Add positions for 3 and 4 ses-
         sions ago, .P3 and .P4.

         Mon Nov  7 10:27:46 PST 2011.  Positions for two prior ses-
         sions are stored in columns .P1 and .P2 (using temp names like
         R1 here, instead of P1, because P is used a lot).  

         This session's positions are simply last session's prices.

         There are no positions in the first session, only last (prior)
         session positions in the second session, and last and next-
         to-last positions in the third and subsequent sessions.

}        [ 0 "R1" book, 0 "R2" book , 0 "R3" book ]
         (hA)

         R3 "R4" book                      \ 3 sessions ago
         R2 "R3" book                      \ next-to-last session
         R1 "R2" book                      \ last session
         (hA) dup 3rd catch (hC) "R1" book \ current prices

         R2 dup type NUM = IF drop R1 THEN \ last session posn, P1
         R3 dup type NUM = IF drop R1 THEN \ session before last, P2
         R4 dup type NUM = IF drop R1 THEN \ 3 sessions ago, P3

         (hA hR1 hR2 hR3)
         4 parkn \ columns are: .H .L .C .dC .VO .dV .SE .P1 .P2 .P3
         (hA) 

       \ Wed Jan 26 19:44:40 PST 2011.  The following lines which had
       \ followed this macro are now part of it:
         (hA) 

         "dayget combining daily and real time" ERRset

         (hA) dup rows "Arows" book
         Date pit_indices (r k) 
         "k" book \ step for lagged pit data
         "r" book \ step for pit data on Date

       \ Mon Nov  7 10:51:18 PST 2011.  Stop using previous high and 
       \ low and just use C.  Discontinue the following list and fetch
       \ C by itself:
       \ list: \ pit daily: "H" "L" "C" 
       \    LIB "H" yank k pry \ previous high
       \    LIB "L" yank k pry \ previous low
       \    LIB "C" yank k pry \ previous close (also called settle)
       \ end (hP)
       \ (hP) bend (hP1) \ column into row
         LIB "C" yank k pry \ previous close (also called settle)
         (hP1) Arows repeat (hHLC)

       \ Use this for open interest constant slope at ending value:
       \ LIB "OI" yank r pry (hOI) Arows repeat
{
         Sun Oct  9 19:43:26 PDT 2011.  Uncomment this section to re-
         instate open interest.  Also uncomment a line in add_model;
         search for Sun Oct  9 19:58:26 PDT 2011.
    
       \ Use these for open interest sloping previous to ending value:
         0 Arows 1- pile (ht) 
         LIB "OI" yank dup r 1- 1st max pry (OI[r-1])
         swap              r            pry (OI[r]) pile (hOI)
         (ht hOI) park (hXY) 
         1 Arows uniform (hx) (hXY hx) lerp (hOI)

         (hA hHLC hOI) 3 parkn (hA)
}  
         (hA hHLC) park (hA)

      \  Columns of A are: .H .L .C .dC .VO .dV .SE .P1 .P2 .P3

       \ Matrix A on the stack contains price data up to last pit model
       \ column in the struct of daysget.P columns.  Add additional 
       \ columns based upon this day's data (runs daymodel() in file
       \ mobius.n):
         (hA) peek (ht) dayadd (hA)

       \ Data for the remaining columns will be added to A in daysget 
       \ after all days have been read.
         (hA) pull (hA ht)

         ERR \ end "dayget combining daily and real time"

   "} "extend" macro
]
         (hA1 ht1) extend (hA ht) "t" book

         FILE "daysget" "LATEST_FILE" yank = (f)
         IF (hA)
{
            March 2009
            After previous session pit data is loaded, remake the col-
            lected price change, column .dC, because tch site goofs it 
            up for gold (and maybe others) after the pits close (note
            that dC is not used for anything except the display line
            in market console):
}           
            (hA) tsession 7200 > \ in two hours, after pit eod is done
            IF (hA) push
               peek .C catch (hC) \ electronic latest
               peek .S catch (hS) \ pit settle
               (hC hS) - (hdC)    \ price change reported by media
               (hdC) .dC peek (hdC .dC hA) cram
               pull (hA)
            THEN (hA)
         THEN
         (hA) t

\" dayget: " FILE + " rows: " + over rows intstr + . nl

         (hA ht)

         purged "D" book
         purged "t" book

      ELSE purged
      THEN

      ERR \ end "dayget"
   end

   inline: daysget (hFiles --- hP ht) \ P(t) from all the day Files
{     This is an important word for getting yearly and real time data
      for the period determined by the real time files listed in Files.
      Returned prices in P are perpetual.

      This is the form of names in Files: YYYMMDD_MKT.bin.  

      Date, YYYMMDD, and market name, MKT, are part of File name.  No 
      path accompanies incoming File name, and history files at epath0,
      epath1 and epath2 are read and data combined and sifted for only 
      unique records.

      Arrays P and t remain booked locally in this word for other words
      to use (P is booked out of core (fbook) since it is large).
}
      [ 3 "NADD" book \ number of future days to add

        NADD 2 max "NADD" book         \ must be at least 2
        NADD 1+                        \ total sessions = NADD + current
        86400 "rdecimate" "SEC" yank / \ session_rows
        (sessions session_rows) * "ADDrows" book 

        yes "NEW" book, "0123456789_.bin" "CHARS" book
        "" "LATEST_FILE" book \ latest file name for word dayget
        -1 "tUPDATE" book \ the last time this word ran
        0 "BYTE" book

      \ Time step, probably 180 seconds:
        "rdecimate" "SEC" yank "SEC" book 

        UDEF "tnow" book \ latest non-future row 
        purged "tSESS" book \ session start times
      ]

\" Top of daysget:" . timeprobe nl
      "daysget" ERRset

      (hFiles) "FILES" book

    \ Get MKT name from one of the file names:
      FILES 1st quote -path CHARS chblank strchop "MKT" book
{ 
      August 23, 2009.  Word mdata reads daily data and is run by auto
      when the console starts up.  Here, the program is running mdata 
      every time there is a real time update and it appears to be re-
      reading all the daily data which is a waste of time.  Commenting
      out this region is being tried.

      August 24, 2009.  With this region commented out, next session
      projected previous settle is not the same as the current price.

      Words in file mrc.v are the problem.  Word dataget runs latest_rt
      to get one line of data for the latest electronic day, and that
      takes up to 0.8 seconds because there are many electronic files
      from 2008 that somehow get processed.

      Someday all this needs to be fixed, but for now all the 2008 files
      have been moved to archive and things run faster.

      Here is the warning added to word latest_rt: 

      WARNING: this word gets very slow when there are a lot of real
      time files in directories epath0, epath1 and epath2 (those are
      program names; actual names are probably /mdat/edat0, /mdat/edat1,
      /mdat/edat2).  There is an archive subdirectory in each of these
      directories where older files can be moved.
}
{     Read yearly model data and make MKTlib (if auto() has run (file
      mobius.n), it has already run mdata() and this call will simply 
      get later real time H, L, C from word mdata>libload>dataget):
}     "daysget read yearly data" ERRset
      CATMSG push no catmsg
      MKT mdata "LIB" book \ word mdata is defined in file mobius.n
      pull catmsg
\" daysget after mdata:" . timeprobe nl
      ERR \ end "daysget read yearly data" 

      NEW (FILES rows 1 = or)
      IF \ This branch should only be run once when doing market 
         \ initialization.

         "daysget initializing early files" ERRset

         purged "P1" book
         purged "t" book

         0 "dayget" "R1" bank
         0 "dayget" "R2" bank
         0 "dayget" "R3" bank

       \ Do all but the latest file, or the latest file if only one day
       \ is being processed:
         FILES dup rows quote strchop "LATEST_FILE" book

         FILES rows 1 >
         IF FILES rows 1- ELSE 1 THEN 1st
         DO FILES I quote dayget (hP1 ht) any?
            IF (ht) FILES rows ndx I - 86400 * - (ht)
               (hP1 ht) I 1st >
               IF (hP1 ht) t swap pile "t" book P1 swap pile "P1" book
               ELSE (hP1 ht) "t" book "P1" book
               THEN
            ELSE EXIT
            THEN
         LOOP

         no "NEW" book \ this is not correct.  This should be
\ set to zero only after enough points are collected for the newest
\ day to be loaded.
\ BUT NEWEST DAY IS USUALLY READ BELOW, THE LAST DAY.
{
         To save memory, P1 is booked out of core.  Versions of fbook 
         that change a matrix to 2- or 4-byte unsigned integers are 
         used to speed up writing to disk--a file of 2-byte ints is 
         about one quarter the size of its 8-byte floating point 
         version.

         Notice: assuming unsigned ints for terms of P1 means that
         price change in column .dC of any matrix taken from disk
         will be corrupted.  In the ELSE branch further below, where
         PZ is fetched from disk, column .dC is read again as signed, 
         fixing the problem.
} 
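{
Illustration only (Python, not tops).  A small sketch of why a signed
price change is corrupted when stored as an unsigned 2-byte integer,
and how reading the same bytes back as signed recovers it.  It assumes
two-byte wraparound storage and little-endian byte order; the actual
encodings used by fbook2, fbook4 and fix_dC are not shown here, and
the function names below are hypothetical.

```python
import struct

def store_unsigned16(value):
    """Keep only the low 16 bits, as an unsigned 2-byte store would."""
    return struct.unpack("<H", struct.pack("<H", value & 0xFFFF))[0]

def reread_signed16(stored):
    """Export the stored unsigned value and import it as signed."""
    return struct.unpack("<h", struct.pack("<H", stored))[0]

stored = store_unsigned16(-5)     # a negative price change
fixed = reread_signed16(stored)
```

Under these assumptions a change of -5 comes back from disk as 65531
when read as unsigned, and as -5 again when the bytes are read as
signed.  Positive values pass through both steps unchanged.
}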
       \ host "plunger" =
       \ host "diego" = or
         yes
         IF \ on machines with enough memory, skip doing out of core
            "mainbook" ptr "fbookX" "PTR" bank
         ELSE \ do out of core:
            P1 abs maxfetch 2drop (n) "yput" "MAX2" yank <
            IF "fbook2" \ fbook 2-byte unsigned integers
               2 "BYTE" book
            ELSE "fbook4" \ fbook 4-byte unsigned integers
               4 "BYTE" book
            THEN (qS) ptr "fbookX" "PTR" bank
         THEN

       \ Fbook writes P when name is P0, but writes PZ when name is PZ;
       \ is there something about numbers in the name?  Don't use num-
       \ bers in fbook names until this is understood (4-15-2008).

       \ Don't fbook into local lib.  New reference strings appear to
       \ be generated continuously and stored in the library.
       \ Confine fbook to main library, and use a local macro to keep
       \ heritage words working.
         [ purged "'PZ' book" main 
           "'PZ' main" "PZ" macro ] \ local macro
         P1 "PZ" naming 'PZ' fbookX \ word in main
\" daysget: fbook PZ done" . timeprobe nl
         t  "t0"  book

\" daysget: end initialization branch" . timeprobe nl
         ERR \ "daysget initializing early files"
      ELSE
\" daysget: start ELSE branch" . timeprobe nl
       \ Use saved data t0 and PZ for the first N-1 of N files:
         t0 "t" book

       \ Fix the signed values in PZ.dC by exporting them and importing
       \ signed numbers:
         PZ BYTE fix_dC "P1" book

\" daysget: fetch t0 and PZ done" . timeprobe nl
      THEN

    \ Do the latest FILE:
      FILES rows 1 >
      IF "daysget initializing latest file" ERRset
         FILES dup rows ndx quote dayget (hP1 ht) any?
         IF (hP1 ht) t swap pile "t" book P1 swap pile "P1" book

          \ Find the row of latest (non-future) time:
            t "dayget" "timenow" yank bsearch (r f) not 
            IF (r) 1+ \ if not found, nearest below is too early; add 1
            THEN "tnow" book \ NOTE: tnow is a row in t, not a time

         ELSE 
          \ " daysget: not enough collected data; exit and try"
          \ " again later" + . nl
            t rows "tnow" book \ NOTE: tnow is a row in t, not a time
{
Not enough points have been collected.  

Need code here to fill in this session and the extra day, because
the XY table in MKTlib expects them both.

The problem fixes itself as soon as there are enough points, and the 
two missing days (this session and the extra day) appear.

Then, this branch is never taken.

June 2009: with changes made in dayget to fill out all points in a 
24-hour session, this may not be a problem any more.
}
         THEN

         ERR \ "daysget initializing latest file" 
      THEN
    NADD 1 
    DO 
{     Lookup table XY in MKTlib includes an extra day, and it starts at
      zero in the XY table for converting machine time into graph time.

      But right now, times in t start at zero at the end of the day 
      before this extra day.

      To align things, subtract 86400 from t to reflect this extra day
      that is ahead of it, and the lookup table will work correctly (it
      will be used over and over for plotting).
}
      t (ht) 86400 - "t" book
      t (ht) \ times of data to this point (last should equal -180)

    \ The 15 minute period between session end and next session start
    \ is no longer recognized here (the data collectors are certainly
    \ aware of it).  Session end and next session start are taken to 
    \ be one time step (SEC) apart.  See dayget(), June 2009 update.

      (ht) dup 1 endmost @ SEC + (t1) \ first new point (should equal 0)
      86400 SEC / (n) \ number of points
      (t1 n) SEC swap uniform (t1 htnew) + (htnew)
      (htnew) dup rows "+rows" book \ times being added

    \ Append the new times to t:
      (ht htnew) pile "t" book
{
    \ Useful debug output that shows how t works.  Each new session
    \ starts when mod(t, repeat(86400, rows(t))) equals zero:
      I NADD = 
      IF "HALTING" . nl 
         t "%8.4f" format spaced 
         t 86400 over rows repeat mod "%8.4f" format spaced park
         t tm ctime park
         eview HALT 
      THEN
}
   LOOP
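{
Illustration only (Python, not tops).  A minimal sketch of one pass of
the loop above: shift existing times back by one session, then append
one session of uniformly spaced future times starting one step after
the last existing time.  The function name add_future_session is
hypothetical and the 180 second step is an assumption (SEC is described
as probably 180 seconds).

```python
def add_future_session(t, sec=180, session=86400):
    """Shift t back one session and append one session of new times."""
    shifted = [x - session for x in t]
    t1 = shifted[-1] + sec                 # first new point
    n = session // sec                     # number of points added
    new = [t1 + k * sec for k in range(n)]
    return shifted + new, n

t0 = list(range(0, 86400, 180))            # one full session
t, added = add_future_session(t0)
```

For a single full session as input, the last shifted time is -180 and
the first new time is 0, matching the notes in the loop.
}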
[
 {" add_model ( --- hP1) \ add future rows and model columns to P1
{
      Wed Jul 14 21:35:42 PDT 2010.  This region has been made into 
      macro add_model so it can be run from outside by replay words   
      like word replay_step.
 
      Thu Jul 15 09:58:22 PDT 2010.  How to run over and over when 
      debugging this word:
         [tops@plunger] ready > xx 'mfil.v' source 'mobius.n' psource eu

      Each call to macro add_model is like the call to daysget in 
      TGRAPH whenever there is new real time data.

      July 2010.  NADD has been moved to coladd() in mfil.v.
      March 2010.  Added variable NADD in mktinit(), mobius.n.  Now 
      NADD controls looping in the following block.  

      October 2009.  This region has been revised to add three days by
      running the following block three times.

      The following adds one day in the future to be the rightmost
      day in a graph (changed to 3 days, October 2009).
}
    "daysget.add_model above LOOP" ERRset

    P1 G-SCALE "P1" book 

    LIB "OI" yank "dayget" "Date" yank pit_indices (r k) drop (r) 
    (hOI r) reach "OI" book

      ERR \ end "daysget.add_model above LOOP" ERRset

    NADD 1st 
    DO 
    \ Build +rows new rows for new session to match new rows in t.
      P1 (hP1)

    \ Make a typical row for the +rows of the next session.  Most data
    \ is repeated, but pit end-of-day data differs because it is always 
    \ shown lagged.
      dup 1 endmost (hProw) push \ reach once and save on local stk

    \ Mon Nov 7 11:19:33 PST 2011.
    \ Real time data is the same as now; positions P1 thru P4 are not
    \ valid, but will be fixed up below just before daysfill() is run:
      peek (hProw) .H .S : catch (hPA) \ .H thru .S

    \ Sun Oct  9 19:58:26 PDT 2011.  Uncomment the following to rein-
    \ state open interest.
    \ Open interest:
    \ OI (hPA hPB OI[r]) 3 parkn (hP)

    \ Today's dayadd data that is shown tomorrow; treat latest today's
    \ C as latest settle, S, and tomorrow's opening H, L:
      dup (hP) .C pry (nC) \ latest today's C
      (hP nC) dup other (nC hP) .S poke \ settle  

      (hP nC) dup (hP nC nC) 1 G-SCALE + other (nC+1 hP)
      (nC+1 hP) .H poke \ high

      (hP nC) 1 G-SCALE - over
      (nC-1 hP) .L poke (hP) \ low

      (hP) t dup rows NADD +rows * - I +rows * + pry (nt) hand (ht)

      (hP t) dayadd (hP2) \ makes dummy data to be fixed up below

      pull drop

    \ Repeat this row to make the remaining +rows new rows:
      (hP2) +rows repeat (hP2) \ +rows constant rows for extra day
      (hP1 hP2) pile (hP1) "P1" book

    LOOP

{ ---
      July 2010.

      I don't remember why this is here.  Skip it.  It might wreck
      things now that this region is macro add_model used to replay.

    \ Reduce rows to just the latest FILES of interest:
      t tm FILES 1st quote rtdates @ 
      session_start (sec) \ machine time of session start
      (tm sec) bsearch (r f) not 
      IF (r) 1+ THEN
      t rows swap - 1+ "r" book
      r t rows <>
      IF t r endmost "t" book
         P1 r endmost "P1" book
      THEN
--- }

    "daysget.add_model after LOOP" ERRset

\" daysget: done with LOOP" . timeprobe nl

      tSESS rows 0=
      IF 
       \ Wed Jan 26 07:59:38 PST 2011.  Save rake rSESS with true at 
       \ session starts, false everywhere else.

       \ Thu Jul 22 09:46:19 PDT 2010.  From t just created, save a 
       \ vector of session start times for all sessions, including the 
       \ daysget.NADD future ones.  Use the fact that each new session
       \ starts when mod(t, repeat(86400, rows(t))) is equal to zero:
         t dup 86400 over rows repeat mod 0= dup "rSESS" book
         (rSESS) rake lop "tSESS" book
      THEN

    \ Thu Feb  9 20:36:41 PST 2012.  Added back .P3.

    \ Sat Feb  4 11:24:29 PST 2012.  Removed .P3, .P4, .P5.

    \ Sun Jan 29 11:20:28 PST 2012.  Added positions .P5 for 5 prior.

    \ Wed Jan 25 03:40:39 PST 2012.  Added positions for two more prior
    \ sessions, .P3 and .P4.

    \ Mon Nov  7 11:45:14 PST 2011.  Fix up the position columns for
    \ rows of the latest and future sessions.  Other earlier rows are
    \ ok.  Positions for two prior sessions are stored by dayget() in
    \ columns .P1 and .P2.

    \ Temp position arrays not needed, reduce memory:
      0 "dayget" "R1" bank 0 "dayget" "R2" bank 0 "dayget" "R3" bank

    \ Fetch C for the last NADD + 4 sessions to use for latest and
    \ future positions.  Each session uses +rows rows.

      P1 .C catch +rows NADD 
      (NADD) 4 + (nsess) \ plus four for: C, P1, P2, P3
      * endmost (hC) "CPOS" book

    \ Lag CPOS by one session for positions R1 and place in column .P1:
      CPOS +rows lag (hR1) +rows NADD 1+ * endmost (hR1)
      (hR1) P1 dup rows other rows - 1+ .P1 place

    \ Lag CPOS by two sessions for positions R2 and place in column .P2:
      CPOS +rows 2 * lag (hR2) +rows NADD 1+ * endmost (hR2)
      (hR2) P1 dup rows other rows - 1+ .P2 place

    \ Lag CPOS by three sessions for positions R3 and place in col .P3:
      CPOS +rows 3 * lag (hR3) +rows NADD 1+ * endmost (hR3)
      (hR3) P1 dup rows other rows - 1+ .P3 place

      0 "CPOS" book
{ --------
\ Fri Jan 27 08:03:30 PST 2012.  Skip this fixup since there are now 
\ no expressions in daymodel() that use positions.

\ Thu Feb  9 06:00:29 PST 2012.  Do the fixup.

    \ Tue Nov  8 10:09:03 PST 2011.  Call daymodel() again for the last
    \ session and the two future sessions when positions are known, to 
    \ fix up any data that uses the positions just fixed up:
      .H .S : "Cols" book \ these columns are input to daymodel()
      t +rows endmost (ht) push

    \ Last session rows:
      P1 rows NADD 1+ +rows * - 1+ "r1" book
      r1 +rows items "Rows" book \ rows for last session
\"daysget call daymodel, r1 =" . r1 .i nl
      P1 Cols catch Rows reach (hP) peek (hP ht) daymodel (hPm)
\"daysget place" . dup dims swap .i .i " at row, col" . r1 .i .Hn .i nl
      (hPm) P1 r1 .Hn place

    \ Next session future rows:
      r1 +rows + "r1" book
      r1 +rows items "Rows" book \ rows for first future session
\"daysget call daymodel, r1 =" . r1 .i nl
      P1 Cols catch Rows reach (hP) peek (hP ht) daymodel (hPm)
\"daysget place" . dup dims swap .i .i " at row, col" . r1 .i .Hn .i nl
      (hPm) P1 r1 .Hn place
      
    \ Session after next future rows:
      r1 +rows + "r1" book
      r1 +rows items "Rows" book \ rows for second future session
\"daysget call daymodel, r1 =" . r1 .i nl
      P1 Cols catch Rows reach (hP) peek (hP ht) daymodel (hPm)
\"daysget place" . dup dims swap .i .i " at row, col" . r1 .i .Hn .i nl
      (hPm) P1 r1 .Hn place

    \ Session after session after next future rows:
      NADD 2 >
      IF r1 +rows + "r1" book
         r1 +rows items "Rows" book \ rows for third future session
\"daysget call daymodel, r1 =" . r1 .i nl
         P1 Cols catch Rows reach (hP) peek (hP ht) daymodel (hPm)
\"daysget place" . dup dims swap .i .i " at row, col" . r1 .i .Hn .i nl
         (hPm) P1 r1 .Hn place
      THEN

      0 "Rows" book
      pull (ht) drop

--- }

    \ Fill rightmost columns of P1 with curves computed from all days:
      P1 t daysfill (hP1) "P1" book 

    \ Compute voice signals:
      t sigmodel

\" daysget done with model update:" . timeprobe nl

    \ Don't fbook P into local lib.  New reference strings appear to
    \ be generated continuously and stored in the library as
    \ daysget is run over and over.  Note: On August 23, 2008, lib.c
    \ was modified to allow word bank to work for PTRs, and the problem
    \ with fbook may have been eliminated.

    \ P1 "daysget" 'P' localref fbookX \ not fbooking locally

    \ Confine fbook to main library, and use a local macro called P to
    \ keep heritage words working (ones that look here for P).
      [ purged "'P' book" main "'P' main" "P" macro ] \ local macro
      P1 dup "P" fbookX \ fbook word in main

      (hP1)
      purged "P1" book 

      ERR \ end "daysget.add_model after LOOP"

 "} "add_model" macro
]
      add_model (hP1)

    \ For symmetry, also book t (do not put this line inside macro
    \ add_model because it will cause an error in replay):
      t "t" mainbook \ host "plunger" <> IF fbook ELSE mainbook THEN

    \ Return these matrices:
      (hP1) t

      purged "t" book 

\" daysget fbook and exit:" . timeprobe nl

      time "tUPDATE" book

      ERR \ end "daysget" ERRset
   end

   "timeline" missing
   IF
    \ Word elec_open defined below will use these words and market 
    \ open times from file mget.v:
      usrpath "mget.v" + "#def timeline" "#end timeline" msource1
      usrpath "mget.v" + "#def TIMELINES" msource
   THEN

   inline: dV_store (hD1 --- hD) \ returned D contains traffic dV 
{     Tue Jan 31 07:37:06 PST 2012.  Compute traffic dV and store it
      into D1 at the column where OpInt is stored.

      Returned D has new column dV where OpInt had been.

      The lines of this word were originally in word HGET().

      November 30, 2011 and after, files have these nine columns of 
      data (earlier ones had six columns, with no Vol, OpInt or Settle):

         1     2     3    4     5       6    7      8       9
         Open, High, Low, Last, Change, Vol, OpInt, Settle, GMT

      Below, column 7 OpInt will be replaced by volume rate dV.

      Tue Jan 10 11:10:40 PST 2012.  The following phrases are
      an attempt to compute dV with more accuracy than these
      expressions formerly used in daymodel() (file mobius.n):

         X = looking(V, V>0 && delta(V)>0); // total volume
         dV = looking(delta(X), dup()>-1);  // traffic
}
    \ Replace open interest with volume rate (called traffic)
    \ in newer models with nine columns.
      (hD1) dup push 9 catch (ht) peek 6 catch (ht hV)

    \ Fixups: set first V to 1 or greater, then make all values
    \ of V >0 such that all delta(V) are >0 (using looking()):
      (hV) dup 1st pry 1 max over 1st poke  \ V(1)>0

    \ Doing looking(V, V>0 && delta(V)>0):
      (hV) dup 0> over delta 0> and looking \ V>0 & delta(V)>0

    \ Mon Jan 16 19:33:53 PST 2012.  Replace Vol in column 6
    \ of D with the fixed up version now on the stack:
      (hV) dup (hV) 6 peek (hV 6 hD1) cram (hV)

    \ Now make dV.  Doing looking(delta(X), dup()>-1):
      (hV) delta (dV) dup -1 > looking (hdV) \ no negative dV

\ "HGET" "DATE" yank 1120118 = IF "HALTING" . nl HALT THEN

{
      Due to looking(), some rows in dV are zero.

      Rake them out and compute rate dV using rows that remain,
      along with the corresponding rows from t: rate = dV/dt.

      This avoids over-estimating the rate by division with
      delta times, dt, that are smaller than they should be.

      Then merge the rates back to correct size with zero
      rates filled in.
}
      (hdV) dup 0= dup "R" book rake (hdV h0) push (hdV)
      (ht hdV) swap (ht)

    \ Coming from hget1(), there are no duplicate times in
    \ column 9 of D and delta t will never be zero:
      (ht) R rake drop (ht) delta (hdt)

    \ The sample rate of CME data is 60 seconds, so dt should
    \ never be less than 60 seconds; if it is a lot less, an
    \ erroneously large rate will be computed.

    \ Thu Jan 12 14:37:17 PST 2012.  Rather than build in a 60
    \ second lower bound, the following computes mean and vari-
    \ ance of steps less than 180 seconds and sets its own low-
    \ er bound on dt of mean minus 0.5*sigma:
      (hdt) dup

      (hdt) dup 180 < over 0> and rake lop (hdt1) dup rows 0=
      IF (hdt1) drop 60 hand THEN (hdt1) \ failsafe

      (hdt1) bend stats1 dup 2nd pry (mean)
      swap 4th pry (variance) sqrt (sigma)
      (mean sigma) 2 / less (minSEC) \ mean - sigma/2

      (hdt minSEC) max (hdt) \ constrain dt to minSEC

    \ Compute volume rate: delta V divided by delta t:
      (hdV hdt) /by 60 * (hdV) \ traffic, contracts per minute
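
    \ Worked example of the lower bound (illustrative numbers, not
    \ taken from a data file): if the steps shorter than 180 seconds
    \ have mean 60 and sigma 4, then minSEC = 60 - 4/2 = 58.  A step
    \ with dV = 29 contracts and dt = 5 seconds has its dt raised to
    \ 58, giving a rate of 29/58*60 = 30 contracts per minute rather
    \ than 29/5*60 = 348.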

    \ Merge dV with zeroes back to the size of original V:
      (hdV) pull (h0) dims null R tier (hdV) 0 "R" book

    \ Replace collected column 7 (OpInt) in D with dV:
      (hdV) 0.5 + integer 7 peek (hdV 7 hD1) cram pull (hD)

      (hD)
   end

   inline: elec_close (qMkt --- nDT) \ close DT sec after session start
{     Sat Nov 12 08:37:03 PST 2011

      DT is the number of seconds after session start (not electronic
      Mkt open) when electronic trading for Mkt will close.

      Returns DT equal to -1 if electronic Mkt is not found.
}
      [
     \ Session start at Chicago central time yesterday (see mrc.v):
       "UTCstart" "Chicago_start" yank (q17:00:00)
       (q17:00:00) >SEC 86400 - "SESS" book

     \ Times are Central:
      {"    Open     Close
         W  18:00:00 13:15:00
         C  18:00:00 13:15:00
         S  18:00:00 13:15:00
         SM 18:00:00 13:15:00
         BO 18:00:00 13:15:00
 
         LC 17:00:00 16:00:00
         LH 17:00:00 16:00:00

         CC 24:00:00 14:15:00
         SB 24:00:00 14:15:00
         KC 24:00:00 14:15:00
         JO 30:00:00 14:15:00

         HG 17:00:00 16:15:00
         GC 17:00:00 16:15:00
         SI 17:00:00 16:15:00
         PL 17:00:00 16:15:00
 
         CL 17:00:00 16:15:00
         HO 17:00:00 16:15:00
         HU 17:00:00 16:15:00
         NG 17:00:00 16:15:00
 
         SF 17:00:00 16:15:00
         EU 17:00:00 16:15:00
         JY 17:00:00 16:15:00
         MP 17:00:00 16:15:00
         BP 17:00:00 16:15:00

         US 17:00:00 16:00:00
         TN 17:00:00 16:00:00
         FF 17:00:00 16:00:00
         ED 17:00:00 16:00:00
 
         DJ 17:00:00 16:30:00
         SP 17:00:00 16:30:00
         NQ 17:00:00 16:30:00
         NK 25:00:00 15:15:00

      "} asciify noblanklines chop "TABLE" book
         TABLE 1st word drop vol2mat bend 1st over rows items park (hA)
         (hA) yes sort (hA) "Rlist" book

      {" (Mkt --- qR) \ row of data for Mkt
         Rlist swap str2num bsearch (r f)
         IF TABLE Rlist rot (hRtable r) 2nd fetch (hTABLE row) reach
         ELSE (r) drop ""
         THEN
      "} "Rfind" macro
      ]
      any?
      IF uppercase Rfind any?
         IF (qR) 3rd word drop >SEC @ SESS -
         ELSE -1 \ electronic Mkt not found
         THEN
      ELSE " elec_close: need Mkt symbol" . nl -1
      THEN
   end

   inline: elec_open (qMkt --- nDT) \ open DT sec after session start
{     Sat Nov 12 08:52:02 PST 2011

      DT is the number of seconds after session start (not electronic
      Mkt open) when electronic trading for Mkt will open.

      Returns DT equal to -1 if electronic Mkt is not found.

      Examples:
         This shows that electronic trading in CC began last night at 
         22:00 PST (00:00 Central time) for the session that ends today:
            % date . nl Mkt .
            Thu Nov 10 07:09:41 PST 2011
            CC
            % date sysdate drop session_start Mkt elec_open + ctime .
            Wed Nov  9 22:00:00 PST 2011
            %

         Electronic trading in W began yesterday at 16:00 PST (18:00 
         Central time) for the session that ends today:
            % date . nl Mkt .
            Thu Nov 10 07:16:23 PST 2011
            W 
            % date sysdate drop session_start Mkt elec_open + ctime .
            Wed Nov  9 16:00:00 PST 2011
            %

      Uses the table in elec_close.  
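
      Worked arithmetic, assuming Chicago_start is 17:00:00 as noted
      in elec_close and that >SEC turns HH:MM:SS into seconds:
         SESS = 61200 - 86400 = -25200
         W  opens 18:00:00: (64800 - 86400) - SESS =  3600 (1 hour)
         CC opens 24:00:00: (86400 - 86400) - SESS = 25200 (7 hours)
      These agree with the examples above: session start at 17:00
      Central plus 1 hour is 18:00 for W, and plus 7 hours is 00:00
      for CC.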

      The old version of this word is with functions no longer used.  
      Search for obsolete.
}
      any?
      IF uppercase "elec_close" "Rfind" localrun any?
         IF (qR) 2nd word drop >SEC @ 86400 - 
            "elec_close" "SESS" yank -
         ELSE -1 \ electronic Mkt not found
         THEN
      ELSE " elec_open: need Mkt symbol" . nl -1
      THEN
   end

   inline: fix_dC (hP nBYTE --- hP1) \ fix signed numbers in P.dC
\     Negative numbers for price changes dC, imported as unsigned ints,
\     will be wrong.

\     Fix the signed values in P.dC, received from disk as 2- or 4-byte
\     unsigned numbers, by exporting them and importing signed numbers.

      (nBYTE) dup 2 < IF drop (hP) return THEN \ return if BYTE not set

      swap (nBYTE hP) dup push
      (nBYTE hP) .dC catch swap (nBYTE) 2 =
      IF endian export2 endian import2
      ELSE endian export4 endian import4
      THEN (hB) .dC peek (hP) cram
      pull (hP)
   end

   "fbookX" missing \ do not reload; fbookX.PTR is set only once
   IF
   inline: fbookX (hA qWord --- ) \ book A out of core, not into lib
      [ "fbook" ptr "PTR" book ] PTR exe
   end
   THEN

   inline: G-SCALE (hA --- hA1) \ add extra scaling for graph
\     This word is applied in daysget, and may be redefined by programs
\     that graph; for example, see G-INIT in mobius.n.  Right now, this 
\     word does nothing.
   end

   inline: G-UNSCALE (hA1 --- hA) \ remove extra scaling for graph
\     This word may be redefined by programs that graph; for example,
\     see G-INIT in mobius.n.  Right now, this word does nothing.
   end
      
   inline: hdchg ( --- f) \ true if a history directory has changed
{     Running in real time, look at the directories where collectors 
      write files continuously to see if any file times have changed.

      This word depends upon a collector touching its directory every
      time it writes a file into it.  Word hist_add() in mget.v does
      it and so does word extract_files() in tops_rtc.
}
      [ 
{       This machine's epath0 is last in list PATHS, so its data is
        stored last when PATHS is used in hget1.  That means epath0
        data will be taken over epath1 and epath2 data if there are
        duplicates, under the assumption that this machine's data is
        more current or somehow better than epath1 and epath2 data
        from remote machines:
}       epath2 epath1 epath0 3 pilen "PATHS" book 

        list: 0 0 0 ; "TIMES" book
      ]
      list: PATHS rows 1st DO PATHS I pry filetime LOOP end
      (hT) dup TIMES - totals @ 0<>
      IF (hT) "TIMES" book true ELSE drop false THEN
   end

   inline: HGET (nMstart qMKT qFile --- hD) \ MKT data from history File
{     MKT may be open, but if no valid data has been collected, the
      returned matrix D is purged.

      History files are read from collector directories in hget1(), and
      data combined and sifted for only unique records.
}
      rev (qFile nMstart qMKT) "MKT" book "Mstart" book

      (qFile) dup rtdates @ "DATE" book

      (qFile) hget1 (hD) \ read the collected data
      (hD) dup rows 0>
      IF (hD)
       \ Remove rows with times before Mstart, the time the market 
       \ starts trading.  

       \ Times to rake are in the last column:
         (hD) dup dup cols ndx catch (hTime) 

         Mstart < rake drop (hD)
         dup rows 0>
         IF
          \ Remove rows that have nonpositive values for O, H, L or C,
          \ which are in the first four columns of D:
            (hD) dup 1st 4 items catch 0> across -4 <> rake drop (hD)
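          \ For example, assuming a true flag is -1 (as the test
          \ against -4 implies), a row with O H L C = 100 105 0 102
          \ has flags -1 -1 0 -1; across sums them to -3, which is
          \ not -4, so that row is raked out.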

          \ Thu Nov 10 05:54:08 PST 2011.  Remove rows where delta(H)<0
          \ or where delta(L)>0.  This procedure will not fix a case
          \ where two bad rows abut.  It really should be run until 
          \ there are no more bad rows.
            (hD) MKT hget_fix (hD)
            (hD) dup 2 catch (hH) delta 0< rake drop \ no delta(H)<0
            (hD) dup 3 catch (hL) delta 0> rake drop \ no delta(L)>0
            (hD)
         ELSE drop purged (hD)
            " HGET: matrix D has no rows later than " Mstart ctime + 
            . nl 
         THEN (hD)

      ELSE drop purged (hD)

         " HGET: matrix D from hget1 has no rows " date + . nl 

      THEN (hD) 
   end

   inline: hget_fix (hD qMKT --- hD1) \ apply fixes to O H L C Chg
\     Apply hard-coded fixes to the columns of D for MKT.  
\     The six columns of D are: O H L C Chg Time (files after 
\     November 29, 2011 have nine columns; see hget1).

      "MKT" book
      "hget1" "FILE" yank "FILE" book

    \ (hD) FILE "1091231_GC.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1110124_GC.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1110124_HG.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1110217_CL.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1110627_DJ.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1110628_DJ.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1110930_GC.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111021_SF.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111102_GC.bin" = IF "HALTING" . nl HALT THEN

    \ Bad rows for Nov 7, 8, 9 2011 for HG, GC and CL were due to 
    \ a bug in cme.v that incorrectly set time after the switch to 
    \ standard time (EST) on Sunday Nov 6, 2011.  This was the first
    \ time cme.v had run under EST; it was written during EDT, when
    \ it had worked fine.

    \ (hD) FILE "1111107_HG.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111108_HG.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111109_HG.bin" = IF "HALTING" . nl HALT THEN

    \ (hD) FILE "1111107_GC.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111108_GC.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111109_GC.bin" = IF "HALTING" . nl HALT THEN

    \ (hD) FILE "1111107_CL.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111108_CL.bin" = IF "HALTING" . nl HALT THEN
    \ (hD) FILE "1111109_CL.bin" = IF "HALTING" . nl HALT THEN

\     These are tied to a specific file (and therefore a particular
\     date and market).  They must appear before the specific market
\     branches below:

\     Real time low is incorrectly the previous session's or just 
\     plain too low.
      0
{
      FILE "1090217_US.bin" = or
      FILE "1090217_TN.bin" = or
      FILE "1090217_GC.bin" = or
      FILE "1090406_TN.bin" = or
      FILE "1090609_HG.bin" = or
      FILE "1090615_DJ.bin" = or
      FILE "1090615_CL.bin" = or
      FILE "1090908_GC.bin" = or
      FILE "1091008_HG.bin" = or
      FILE "1091020_HG.bin" = or
      FILE "1091026_SF.bin" = or
      FILE "1091026_JY.bin" = or
      FILE "1091126_HG.bin" = or
      FILE "1091126_GC.bin" = or
      FILE "1091221_HG.bin" = or
      FILE "1100608_DJ.bin" = or
}
      FILE "1110124_GC.bin" = or
      FILE "1110125_HG.bin" = or
      FILE "1110915_HG.bin" = or

      IF "D" book
\        Note: this fixup assumes that the data in C is good.
\        The six columns of D are: O H L C Chg Time.
\        Use mmin on all rows of C to get the low, assuming C is good.
         D 4th catch (hC) 
         D rows (nt) mmin (hL) \ moving min of C over all rows
         (hL) 3rd D cram D (hD) 
         purged "D" book \ don't return because high may need fixing too
      THEN

\     Real time high is incorrectly the previous session's or just 
\     plain too high.
      0
{
      FILE "1090217_HG.bin" = or
      FILE "1090217_CL.bin" = or
      FILE "1090217_JY.bin" = or
      FILE "1090217_EU.bin" = or
      FILE "1090217_DJ.bin" = or
      FILE "1090217_NQ.bin" = or
      FILE "1090305_NQ.bin" = or
      FILE "1090330_HG.bin" = or
      FILE "1090408_US.bin" = or
      FILE "1090408_TN.bin" = or
      FILE "1090612_GC.bin" = or
      FILE "1090615_DJ.bin" = or
      FILE "1090615_CL.bin" = or
}
      FILE "1101001_DJ.bin" = or
      IF "D" book
\        Note: this fixup assumes that the data in C is good.
\        The six columns of D are: O H L C Chg Time.
\        Use mmax on all rows of C to get the high, assuming C is good.
         D 4th catch (hC) 
         D rows (nt) mmax (hH) \ moving max of C over all rows
         (hH) 2nd D cram D (hD) 
         purged "D" book return
      THEN

{     These files contain previous session data at times that follow
      the session start but precede the market's open.  The test in
      hist_add(), mget.v, is insufficient.  It looks at session start,
      not market open.  The cutoff 1233271800 used below is
      Thu Jan 29 17:30:00 CST 2009.
}     FILE "1090130_US.bin" = FILE "1090130_TN.bin" = or
      IF dup 6 ndx catch 1233271800 < rake drop (hD) return THEN

\     These are files that have different forms of bad data:

      FILE "1120209_GC.bin" = 
      IF "D" book
       \ Remove post session data that came in when switched from cable
       \ modem to DLINK between sessions.  All markets were affected
       \ but GC is one of the worst.  Next time, wait until nearly 3 PM
       \ when CME E-quotes has completely initialized for the next
       \ session.
         D dup 9 catch 1328826091 < \ before Thu Feb 9 16:21:31 CST 2012
         (hD hR) rake lop (hD) 
         purged "D" book return
      THEN

      FILE "1120120_CL.bin" = \ bad H 19840 toward end of session
      IF "D" book
         D 2nd catch dup (hH) 19840 < looking (hH)
         (hH) 2nd D cram 
         D purged "D" book return
      THEN

      FILE "1110928_CL.bin" = 
      FILE "1110928_HG.bin" = or
      FILE "1110928_GC.bin" = or
      FILE "1110928_DJ.bin" = or
      FILE "1110928_TN.bin" = or
      FILE "1110930_GC.bin" = or \ bad data due to site
      FILE "1111102_GC.bin" = or \ Row 803: 10 10 10 10 10 1320206400
      IF "D" book
       \ Reject any row with low < 100
       \ The six columns of D are: O H L C Chg Time.
         D dup 3rd catch 100 < rake drop (hD1)
         purged "D" book return
      THEN

      FILE "1111107_HG.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 2 4 5 7 9 11 12 16 19 21 23 ;
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111108_HG.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 2 3 4 6 10 11 14 ; 
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111109_HG.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 1 2 3 4 7 8 9 11 14 15 ; 
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111107_GC.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 3 7 thru 9 12 14 15 17 21 22 23 25 26 27 ;
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111108_GC.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 1 2 3 4 6 7 9 10 11 13 14 16 18 19 ;
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111109_GC.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 3 4 5 6 8 12 13 14 15 17 18 20 23 24 ;
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111107_CL.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 2 3 5 6 7 9 11 12 14 15 18 21 23 25 26 28 ; 
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111108_CL.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 1 2 3 6 7 8 10 13 14 15 17 18 19 20 ;
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1111109_CL.bin" = 
      IF "D" book
       \ Data in early rows appears to be for later ones, meaning the
       \ time is wrong.  Rake out the bad early rows:
         D list: 2 3 4 6 8 10 12 13 14 15 18 19 20 23 24 25 ;
         D rows teeth rake lop (hD)
         purged "D" book return
      THEN

      FILE "1110921_GC.bin" = 
      IF "D" book
       \ Bad high; set bad 18800 (from cme.v) to previous good one 
       \ (from mfg.v):
         D 2nd catch dup 18800 = not looking (hH)
         (hH) 2nd D cram D (hD)
         purged "D" book return
      THEN

      FILE "1110627_DJ.bin" = 
      IF "D" book \ remove bad rows 934 and 935
         D list: 934 935 ; over rows teeth rake lop (hD) "D" book

       \ Fix bad change, where Cprev = 11881 (similar to 1110628_DJ.bin
       \ below):
         D 4 catch 11881 - (hdC)
         D 5 catch abs 10000 > (hF) 
         (hdC hF) looking (hdC) 5 D cram
         D (hD) purged "D" book (hD) return
      THEN

      FILE "1110628_DJ.bin" = 
      IF "D" book \ fix bad change, where Cprev = 12008 - 18 = 11990
{
  Row 3:    11995    12030    11994    12008       18 1309212180
  Row 4:    11995    12030    11994    12009 -1186990 1309212240
  Row 5:    11995    12030    11994    12008 -1186991 1309212420
  Row 6:    11995    12030    11994    12010       20 1309212480
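
  With Cprev = 11990, C - Cprev for rows 3 to 6 is 18, 19, 18, 20,
  which matches the good changes in rows 3 and 6.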
}
         D 4 catch 11990 - (hdC)
         D 5 catch abs 10000 > (hF) 
         (hdC hF) looking (hdC) 5 D cram
         D (hD) purged "D" book (hD) return
      THEN
 
      FILE "1110217_CL.bin" = 
      IF "D" book \ remove bad rows 4 and 12
         D list: 4 12 ; over rows teeth rake lop (hD)
         purged "D" book (hD) return
      THEN

      FILE "1100107_GC.bin" = 
      IF "D" book
       { Bad values from tradingcharts.com (tch) equal to 11181:
         GCG10 11395 11155 11181 -184 19:03 CST Wed Jan 6, 2010 (17:07:
         GCG10 11395 11155 11181 -184 19:03 CST Wed Jan 6, 2010 (17:10:

         Bad low (and open):
         Row 6: 11294    11370    11294    11370      183 1262819460

         Rake out C equal to 11181 or L equal to 11294:
       } D 4th catch 11181 = \ C
         D 3rd catch 11294 = \ L
         or
         D swap rake drop (hD1)
         purged "D" book return
      THEN

      FILE "1091231_GC.bin" = 
      IF "D" book
       \ Out-of-bounds price values are 10.
       \ Rake out anything less than 1000:
         D 1st catch 1000 < \ O
         D 2nd catch 1000 < \ H
         D 3rd catch 1000 < \ L
         D 4th catch 1000 < \ C
         or or or
         D swap rake drop (hD1)
         purged "D" book return
      THEN

      FILE "1090227_GC.bin" = 
      IF "D" book
\        Bad high; set > 9640 to 9640
         D 2nd catch 9640 min (hH)
         (hH) 2nd D cram D (hD)
         purged "D" book return
      THEN

      FILE "1090120_EU.bin" = 
      IF "D" book
\        01-20-09: bad high; set >13150 to 13068 (initial session high)
         D 2nd catch 13068 min 
         D 4th catch max \ max of 13068 or C
         (hH) 2nd D cram D (hD) 
         purged "D" book return
      THEN

      FILE "1091026_EU.bin" = 
      IF "D" book
       \ Out-of-bounds data is 1486, when it should be 14860.
       \ Reject out-of-bounds:
         D 1st catch 2000 < \ O
         D 2nd catch 2000 < \ H
         D 3rd catch 2000 < \ L
         D 4th catch 2000 < \ C
         or or or
         D swap rake drop (hD1)
         purged "D" book return
      THEN

      FILE "1081106_SF.bin" = \ a file for SF
      IF "D" book
\        11-06-2008
\        Out of bounds low: bad C = 8304
       \ Reject out-of-bounds:
         D 4th catch 8350 < \ C
         D swap rake drop (hD1)
         purged "D" book return
      THEN

      FILE "1081017_C.bin" = \ a file for C
      IF "D" book
\        10-17-2008
\        Out-of-bounds data is 7882, when nominal data is 4030:

       \ Reject out-of-bounds:
         D 1st catch 7000 > \ O
         D 2nd catch 7000 > \ H
         D 3rd catch 7000 > \ L
         D 4th catch 7000 > \ C
         or or or
         D swap rake drop (hD1)

         purged "D" book return
      THEN

      FILE "1081127_HG.bin" = \ files for HG
      FILE "1081128_HG.bin" = or 
      IF "D" book
\        Thu 11-27-2008, Fri 11-28-2008: one site sent data only for
\        Wed preceding Thanksgiving.  Remove all rows with C = 16915.
       \ Reject out-of-bounds:
         D 4th catch 16915 = \ C
         D swap rake drop (hD1)
         purged "D" book return
      THEN

      MKT "US" =
      IF "D" book
{        When US data went from 32nds to 64ths (really 32.5nds), on
         about March 3, 2008, an extra digit was added to the values.
         But this system cannot yet handle the extra digit, and if it
         slips through the numbers are too big by a factor of 10.
         This fix tries to salvage such data by dividing it by 10:
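         For example (illustrative value), a collected 112160 is
         above 20000 and becomes 11216, while 11216 is left alone.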
}        D 1st catch dup 20000 > dup push rake 10 / peek tier \ O
         D 2nd catch dup 20000 > dup push rake 10 / peek tier \ H
         D 3rd catch dup 20000 > dup push rake 10 / peek tier \ L
         D 4th catch dup 20000 > dup push rake 10 / peek tier \ C

\        Divide Chg by 10 if any of O, H, L, C in its row was too big:
         pull pull pull pull + + + push
         D 5 ndx catch peek rake 10 / pull tier

       \ Tue Nov 29 10:34:13 PST 2011.  D has 9 cols after 11-29-2011:
         "hget1" "Dcols" yank 6 =
         IF D 6 ndx catch 6 parkn
         ELSE
            D 6 ndx catch \ Vol
            D 7 ndx catch \ OpInt
            D 8 ndx catch dup 20000 > dup push rake 10 / pull tier \ Set
            D 9 ndx catch 9 parkn
         THEN

         purged "D" book return
      THEN

{ ---------------

      MKT "JY" =
      IF "D" book
\        5-5-2008:
\        This problem with JY should be fixed; tch website changed the
\        factor for JY, and tch.v has been revised accordingly.
{
       \ Reject out-of-bounds:
         D 1st catch 50000 > \ O
         D 2nd catch 50000 > \ H
         D 3rd catch 50000 > \ L
         D 4th catch 50000 > \ C
         or or or 
         D swap rake drop 
}
      \ Try to salvage out-of-bounds:
         D 1st catch dup 50000 > dup push rake 100 / peek tier \ O
         D 2nd catch dup 50000 > dup push rake 100 / peek tier \ H
         D 3rd catch dup 50000 > dup push rake 100 / peek tier \ L
         D 4th catch dup 50000 > dup push rake 100 / peek tier \ C

\        Divide Chg by 1000 if any of O, H, L, C in its row was too big:
         pull pull pull pull + + + push
         D 5 ndx catch peek rake 1000 / pull tier

         D 6 ndx catch 6 parkn
         purged "D" book return
      THEN

      MKT "NG" =
      IF "D" book
\        Out-of-bounds data: NGF09 2225 13:51 CDT Fri Oct 17, 2008
\        Data has less than 3000 while trading is at 7000.
 
       \ Reject out-of-bounds:
         D 1st catch 3000 < \ O
         D 2nd catch 3000 < \ H
         D 3rd catch 3000 < \ L
         D 4th catch 3000 < \ C
         or or or 
         D swap rake drop (hD1)

         purged "D" book return
      THEN

      MKT "EU" =
      MKT "BP" = or
      IF "D" book
\        10-10-2008
\        Out-of-bounds data, 8.845e+07, when nominal data is 1.342e+04:
 
       \ Reject out-of-bounds:
         D 1st catch 1E5 > \ O
         D 2nd catch 1E5 > \ H
         D 3rd catch 1E5 > \ L
         D 4th catch 1E5 > \ C
         or or or 
         D swap rake drop (hD1)

         purged "D" book return
      THEN

--------------- }

      MKT "DJ" = 
      MKT "SP" = or
      MKT "NQ" = or
      IF (hD)
{
         From time to time at market close for stock indices, one of 
         the sites (or maybe the exchange) returns the closing price, 
         which is ok, but it is accompanied by 00:00 time which is the 
         previous midnight in Chicago.  The collectors continually sort
         the data on time, so the result is a new price spike appearing
         at the previous midnight that was not there before.

         This code removes any collected data rows that have a time 
         exactly equal to midnight in Chicago.
}
         (hD) dup dup cols catch integer     \ last column, machine times
         dup 1st pry "t0" book               \ session starting time
         t0 gmtime sysdate drop (nYYYMMDD)   \ starting day in Chicago
         (nYYYMMDD) 240000                   \ next midnight in Chicago
         ltime (sec) t0 CHdiff1 - (sec) =    \ converted to machine time
         (hD hRake) rake drop (hD)           \ remove midnight Chicago

         return
      THEN

\     MKT "SB" =
\     IF "D" book
{
         5-15-2008:

NEED A PROGRAM THAT CLEANS UP A BINARY FILE ONCE AND FOR ALL, RATHER
THAN FIXING THE DATA EVERY TIME THE FILE IS USED.

Bad Low data (showing matrix P from daysget):
THIS PROBLEM WENT AWAY BECAUSE Low IS NOT PLOTTED; THE BAD DATA IS 
NEVER SEEN, BUT IT IS STILL PRESENT.
              High     Low      Last      Chg
 Row 281:     1123     1096     1111        6   
 Row 282:     1123     1096     1108        3   
 Row 283:     1123       11     1110        5   
 Row 284:     1123       11     1107        2   
 Row 285:     1123       11     1107        2   
 Row 286:     1123       11     1105        0   
 Row 287:     1123       11     1104       -1   
 Row 288:     1123       11     1104       -1   
 Row 289:     1123       11     1106        1   
 Row 290:     1123       11     1107        2   
 Row 291:     1123       11     1107        2   
 Row 292:     1123       11     1107        2   
 Row 293:     1123       11     1107        2   
 Row 294:     1123       11     1105        0   
 Row 295:     1123       11     1105        0   
 Row 296:     1123       11     1105        0   
 Row 297:     1123       11     1104       -1   
 Row 298:     1123       11     1104       -1   
 Row 299:     1123     1096     1104       -1   
 Row 300:     1123     1096     1103       -2   
 Row 301:     1123     1096     1103       -2   
}
\     THEN
   end

   inline: hget1 (qFILE --- hD) \ matrix D from history FILE
{     This word reads history files of a day's session that were saved
      by word hist_add() defined in mget.v, which is the file run by
      collectors.  Saved history files from several collectors are read
      and combined into one matrix D.

      FILE names saved by hist_add() are of the form
         1080120_US.bin, 1080121_HG.bin
      so they contain the date and the market symbol.

      Incoming FILE name may include the path, but it is discarded.
      Then history files at epath0, epath1 and epath2 are read and
      data combined and sifted for only unique records.

      Tue Nov 29 09:34:13 PST 2011.  After November 29, 2011, returned
      matrix D will have nine columns:
         O, H, L, C, Chg, Vol, OpInt, Settle, GMT

      For November 29, 2011 and earlier, the six columns of returned 
      matrix D contain the following:
         O, H, L, C, Chg, GMT

      Matrix D holds the values just as they were collected from
      different sources, and may contain errors and discrepancies.
}
      [ \"hdchg" "PATHS" yank "PATHS" book 
{
        Wed Nov 30 06:26:48 PST 2011.  Data from web sites pollutes 
        past real time volume data with zeroes since sites do not
        give volume data.  Just use real time data deposited into 
        edat0 by cme1.v on plunger.  Data from riggo goes into edat0 
        with cme1.v data, so it pollutes volume too.  Cut riggo off.  

        Riggo has been set up as a collector and retains collected 
        data locally.  Later, files from riggo can be moved to plunger
        into a new edat3.
}
      \ For now, restrict PATHS to just real time data in edat0 that
      \ contains cme1.v volume data:
        "/mdat/edat0/" "PATHS" book

        0 "BIN" book
      ]
      "hget1" ERRset

      (qFILE) -path "FILE" book

    \ After November 29, 2011, files have 9 columns instead of 6:
      FILE -path "_" chblank 1st string drop number drop "M" book
      M 1111130 < IF 6 ELSE 9 THEN "Dcols" book
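    \ For example, a file named 1111129_GC.bin (hypothetical) gives
    \ M = 1111129, which is less than 1111130, so Dcols is 6; a file
    \ named 1111130_GC.bin gives M = 1111130 and Dcols is 9.  The
    \ leading digits are the year minus 1900, so 1111130 is
    \ November 30, 2011.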

      PATHS rows 1st
      DO BIN filetrue IF BIN fclose THEN
         PATHS I quote FILE catpath dup file?
         IF (qFile) old binary "BIN" file

          \ After December 7, 2011, files have a 4-byte trailer:
            M 1111207 > \ newer type with trailer?
            IF \ fetch size from trailer:
               BIN dup dup fsize 4 - fseek
               (nBIN) 4 fget PDP_ENDIAN import4 @ (nSize)
               (nSize) BIN rewind BIN swap (nSize) fget (hT)

            ELSE \ trust fsize to give the proper size:
               BIN dup fsize fget (hT)
            THEN
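          \ Size arithmetic, assuming the trailer holds the byte
          \ count of the data that precedes it: a nine-column file
          \ of 100 rows has 900 4-byte values, or 3600 bytes, and
          \ fsize of 3604.  The trailer is read at offset 3604 - 4 =
          \ 3600, then 3600 bytes are read from the start of file.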

            (hT) PDP_ENDIAN import4 (hA)
            (hA) dup rows 0>
            IF (hA) dup rows Dcols / matrix (hA) \ MAT with Dcols cols

             \ Matrix rows are in the order collected.  Keep only the
             \ first occurrence of rows with duplicate times in the
             \ last column:
               (hA) dup dup cols ndx catch swap park nodupes
               2nd Dcols items catch (hA)

{           July 2008

            As the length of the time vector grows, sorting it will 
            produce a different sorted result for rows with times that 
            match, since the sort algorithm may or may not swap rows 
            with equal time values.  This leads to inconsistent prices 
            when the prices in other columns of these swapped (and then
            not swapped) rows do not also match each other.  

            It took months to figure out that sorting was the cause of 
            the vexing problem of once-in-a-while flickering of previ-
            ous prices, sometimes hours earlier, that would change and 
            then change back again on successive graph updates.  

            Duplicate times are not allowed because they affect binary 
            searches and interpolation functions, and word time_vec is
            run at a later step to make all times ever-so-slightly dif-
            ferent.  But that is too late to make a difference in the 
            subtle sorting problem that has already happened here.

            Adding a random number in this loop to all the times elimi-
            nates the matching times and the sorting problem when they
            are sorted below, and makes later use of word time_vec un-
            necessary.

            Times in column 6 are in seconds, to the nearest minute [in
            2011, data from CME E-quote is to nearest second], so there
            will be duplicates.  To eliminate duplicates in times from
            all collections read by this loop, add to each time an
            offset equal to 1/10 of (loop index minus 1) plus a small
            random number (see example of collected machine times
            below):
}
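{           A worked illustration of the offset added by the lines
            that follow, assuming word random returns values between
            0 and 1 (an assumption, not confirmed here):

               offset = (I - 1)*0.1 + random*1e-4

               I = 1:  0.0 + less than 0.0001
               I = 2:  0.1 + less than 0.0001
               I = 3:  0.2 + less than 0.0001

            So a time of 1216159380 read when I = 3 becomes about
            1216159380.20003, as in the listing of machine times below.
}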
               (hA) dup dup cols catch (ht)
               seed0 seedset \ always the same random sequence
               dup rows 1 random 1e-4 * I 1- 0.1 * + + 
               over cols ndx (hA ht nCol) 2 pick (hA) cram
            ELSE (hA) drop 0 Dcols null (hA)
               " hget1: from " FILE + " rows equal zero" + . nl
            THEN
         ELSE (qFile) " hget1: " swap + -trailing " not found" + . nl
            0 Dcols null (hA)
         THEN
      LOOP
      PATHS rows pilen (hA) \ pile the several collections

      (hA) dup rows 0>
      IF (hA) yes over cols sorton (hA) \ sort on time, last column
{
         This is how collected machine times in column 6 look at this 
         stage.  Fractions beginning with .0 are from the collector 
         that contributes files to PATH(1), fractions .1 are from the 
         PATH(2) collector, and so on.  

         The rightmost digits are added random fractions to guarantee 
         that no times are exactly the same (although running nodupes 
         in the loop above and the 0.1 times loop index fractions 
         should take care of that problem and added random fractions
         are probably not necessary):

            1216159320.2000510693

            1216159380.0000510216
            1216159380.1000509262
            1216159380.2000331879

            1216159500.1000332832
            1216159500.2000072002

            1216159800.0000333786

            1216159860.1000070572
            1216159860.2000267506

            1216160160.1000268459
            1216160160.2000808716
}
         (hA) "_hget1" naming (hD)

      ELSE (hA) drop purged
         " hget1: " FILE + " gives purged matrix " + date + . nl
      THEN

      BIN filetrue IF BIN fclose THEN

      ERR
   end

   inline: hist_fname (hMKT --- qFname) \ name of history files for MKTs
{     For incoming list of one or more markets, this word returns the 
      file names for the session now running, or about to start.

      History file names are of the form 1080117_SB.bin.

      The date YYYMMDD in a history file name, like 1080117 in the exam-
      ple above, is the date when the collection session ended, not when
      it started on the previous day.

      Example: It is about 2:50 PM (PDT) on March 17, 2008 and the next
      session starts at 3:00 PM and runs until tomorrow at 2:45 PM.

         The history file name for W is given below as 1080318_W.bin,
         and contains tomorrow's date (0318), since names of history
         files (of sessions) contain the date when the session ends:

            [tops@plunger] ready > date . nl "W" hist_fname .
            Mon Mar 17 14:49:52 PDT 2008
            1080318_W.bin
            [tops@plunger] ready >

      Example: File names for all MKTs in tracklist:

         [tops@plunger] ready > tracklist hist_fname eview

}
      hand chop uppercase (hMKT)
      right justify ".bin" tail (hMKT)

      left justify (hMKT)

      time dup soonest_end + ctime sysdate drop intstr (qYYYMMDD)
      "_" + (qS)

      (hMKT qS) nose
      dup rows 1 = IF 1st quote THEN
   end

   inline: hupdated ( --- hMKT) \ MKTs that have updated history files
{     Looks in the three directories epath0, epath1 and epath2, and 
      returns a list of MKTs with updated real time files in any one 
      of them.

      Returned MKT is a purged VOL if nothing has been updated.
}
      [ tracklist rows 3 * 1 null "FTIMES" book ]

    \ File names contain session date and MKT name.  See hist_fname. 
      tracklist hist_fname (hFname) push \ names for this session

    \ Add paths to file names:
      peek epath0 nose
      peek epath1 nose
      pull epath2 nose
      3 pilen (hFnames) \ path+filename

    \ The file for a MKT will not exist until it opens for this session;
    \ when a file is not found, word filetime returns 0 which is useful
    \ here.

      dup filetime dup FTIMES <> \ filetime = 0 if file not found
      (hFnames hFtimes f) swap "FTIMES" book
      (hFnames f) rake lop

      (hFnames) any? 
      IF -path "_" tug "_.bin" chblank chop noq_alike \ extract MKTs
      ELSE VOL tpurged
      THEN
   end

   inline: pit_close (qMkt --- nDT) \ pit closes DT sec after sess start
{     Thu Oct  6 11:40:27 PDT 2011

      DT is the number of seconds after session start (not electronic
      Mkt open) when the pits for Mkt will close.

      Returns DT equal to -1 if pit Mkt is not found.
}
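{     A worked illustration, assuming word >SEC returns seconds since
      midnight and that UTCstart.Chicago_start is 17:00:00, as the
      comment below indicates:

         SESS = 61200 - 86400 = -25200   (17:00:00 yesterday)

         W closes at 13:15:00 = 47700 sec, so
         DT = 47700 - (-25200) = 72900 sec = 20 hr 15 min
}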
      [
     \ Session start at Chicago central time yesterday (see mrc.v):
       "UTCstart" "Chicago_start" yank (q17:00:00)
       (q17:00:00) >SEC 86400 - "SESS" book

     \ Times are Central:
      {"    Open     Close
         W  09:30:00 13:15:00
         C  09:30:00 13:15:00
         S  09:30:00 13:15:00
         SM 09:30:00 13:15:00
         BO 09:30:00 13:15:00
 
         LC 09:05:00 13:00:00
         LH 09:10:00 13:00:00

         CC 07:00:00 10:50:00
         SB 07:10:00 11:30:00
         KC 07:30:00 11:30:00
         JO 09:30:00 12:30:00

         HG 07:10:00 12:00:00
         GC 07:20:00 12:30:00
         SI 07:25:00 12:25:00
         PL 07:20:00 12:05:00
 
         CL 08:00:00 13:30:00
         HO 08:00:00 13:30:00
         HU 08:00:00 13:30:00
         NG 07:20:00 14:00:00
 
         SF 07:20:00 14:00:00
         EU 07:20:00 14:00:00
         JY 07:20:00 14:00:00
         MP 07:20:00 14:00:00
         BP 07:20:00 14:00:00
 
         US 07:20:00 14:00:00
         TN 07:20:00 14:00:00
         FF 07:20:00 14:00:00
         ED 07:20:00 14:00:00
 
         DJ 08:30:00 15:15:00
         SP 08:30:00 15:15:00
         NQ 08:30:00 15:15:00
         NK 08:00:00 15:15:00

      "} asciify noblanklines chop "TABLE" book
         TABLE 1st word drop vol2mat bend 1st over rows items park (hA)
         (hA) yes sort (hA) "Rlist" book

      {" (Mkt --- qR) \ row of data for Mkt
         Rlist swap str2num bsearch (r f)
         IF TABLE Rlist rot (hRtable r) 2nd fetch (hTABLE row) reach
         ELSE (r) drop ""
         THEN
      "} "Rfind" macro
      ]
      any?
      IF uppercase Rfind any?
         IF (qR) 3rd word drop >SEC @ SESS -
         ELSE -1 \ pit Mkt not found
         THEN
      ELSE " pit_close: need Mkt symbol" . nl -1
      THEN
   end

   inline: pit_indices (nYYYmmdd --- r k) \ indices in once-a-day data
    \ Fri Apr 22 14:31:12 PDT 2011

    \ Returned r is the index for today, and k is the index for yes-
    \ terday.

    \ These lines were moved from rtget() to make this separate word.

    \ Get the once-a-day data indices for today and yesterday:
    \ Thu Apr  8 08:11:40 PDT 2010: If bsearch for Date fails and it
    \ is not the latest Date, pit must have been closed (rather than
    \ not being open yet) and r and k are equal, pointing to the pre-
    \ vious (nearest-below) date.

      "Date" book

      LIB "DATE" yank (hD) dup 1 endmost @ (date) Date =
      IF (hD) rows (r) "r" book \ index of today's data
         r 1- "k" book \ index for last session closed pit data
      ELSE (hD) Date bsearch (r f)
         IF \ found Date:
            (r) "r" book  \ index of today's data
            r 1- "k" book \ index for last session closed pit data
         ELSE \ Date not found:
            (r) "r" book \ index of nearest-before session
            r "k" book   \ index for last session closed pit data
         THEN
      THEN r cop k cop
   end

   inline: pit_open (qMkt --- nDT) \ pit opens DT sec after session start
{     Thu Oct  6 14:21:53 PDT 2011

      DT is the number of seconds after session start (not electronic
      Mkt open) when the pits for Mkt will open.

      Returns DT equal to -1 if pit Mkt is not found.

      Example: This shows that SF pits opened today at 5:20 PDT
         % date . nl Mkt .  
         Mon Mar 23 09:37:34 PDT 2009
         SF
         % date sysdate drop session_start Mkt pit_open + ctime .
         Mon Mar 23 05:20:00 PDT 2009
         % 

      Uses the table in pit_close.  

      The old version of this word is with functions no longer used.  
      Search for obsolete.
}
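{     A worked illustration that agrees with the SF example above,
      assuming >SEC returns seconds since midnight and session start
      is 17:00:00 Central (so SESS = -25200 in pit_close):

         SF opens at 07:20:00 = 26400 sec, so
         DT = 26400 - (-25200) = 51600 sec = 14 hr 20 min

      A session starting at 15:00 PDT (17:00 Central) plus 14 hr 20 min
      gives 05:20 PDT the next day.
}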
      any?
      IF uppercase "pit_close" "Rfind" localrun any?
         IF (qR) 2nd word drop >SEC @ "pit_close" "SESS" yank -
         ELSE -1 \ pit Mkt not found
         THEN
      ELSE " pit_open: need Mkt symbol" . nl -1
      THEN
   end

   inline: qfile (yyy q --- qS) \ file name for quarter q in year yyy
\     q is 1-based, yyy is 1900-based (so yyy=101 is 2001).
\     Returned S includes the path.
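\     Worked values of macro qmonth (month = q*3 - 2), the first month
\     of each quarter:
\        q = 1, 2, 3, 4 gives month = 1, 4, 7, 10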
      [ "(q) three * two -" "qmonth" macro ] \ month = q*3 - 2 
      (q) qdx \ quarters are 1-based for macro qmonth
      qmonth 100 * swap yearfix 10000 * + qfilename mpath swap + ;

{  The next word was used for debugging, and is now commented-out.

   inline: rediff (hx hY hxprev hYprev qS --- ) \ debug price flickering
{     Called by rdecimate when solving the flickering previous price 
      problem.  It was due to variations in sort algorithm behavior 
      when there are duplicates in the vector being sorted and when 
      it becomes longer from one time to the next as new points are
      added.  

      Word hget1 contains the changes to fix the problem.
}
      no "xok" book
      no "yok" book

      "S" book
      "Yprev" book "xprev" book 
      1st Yprev rows items reach "Y" book
      1st xprev rows items reach "x" book
      x xprev - null? not
      IF 1st x rows items x xprev 3 parkn x xprev - rake lop
         S " rediff: " + "x has " + x rows intstr + " rows" + . nl
         S " rediff: " + "x changed at these rows: " + . nl itext . nl
      ELSE yes "xok" book
      THEN
      Y Yprev - null? not
      IF 1st Y rows items Y Yprev 3 parkn Y Yprev - rake lop
         S " rediff: " + "Y has " + Y rows intstr + " rows" + . nl
         S " rediff: " + "Y changed at these rows: " + . nl itext . nl
      ELSE yes "yok" book
      THEN
      xok yok and IF S " rediff: x and y ok" + . nl THEN
   end
}
   inline: rowsto (hBig hLittle --- hRake) \ make vector for work dates
{     Making a vector to expand data to all work dates in the year
      (work dates mean Monday through Friday dates).

      hBig contains all the year's work dates; hLittle contains the
      dates of actual data, when activity was recorded during the year.

      Returned Rake has the same number of rows as hBig, with ones at 
      inactive days and zeroes at all others (the active days).

      Using Rake created here to tier individual data sets will bring 
      all data to the same set of dates.  Rows for inactive dates will
      contain 0, usually meaning a holiday.
}
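{     A worked illustration with hypothetical dates (the week of
      January 14, 2008, with no data on Tuesday and Friday):

         hBig:    1080114 1080115 1080116 1080117 1080118
         hLittle: 1080114         1080116 1080117
         Rake:          0       1       0       0       1
}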
      "L" book, "B" book, one B rows one fill, L rows 1st
      DO (hRows) B L I pry bsearch
         IF no other rot poke ELSE drop THEN
      LOOP freed is L, freed is B
   end

   inline: rtdates (hFiles --- hYYYMMDD) \ real time dates of files
{     This is the form of incoming File names: YYYMMDD_MKT.bin.
      Date, YYYMMDD, and market name, MKT, are part of File name.

      Date YYYMMDD is the day that the electronic MKT closed, or will
      close, some time in the afternoon in Chicago.  MKT had opened
      some time the previous day after pit trading had closed.

      Example: getting the latest date for MKT files:
         MKT 0 1 rtfiles rtdates @ (nYYYMMDD)
}
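{     A worked illustration with the file name used as an example in
      hist_fname: for 1080117_SB.bin, the first 7 characters are
      1080117, so the returned date is 1080117 (January 17, 2008).
}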
      -path 1st 7 items catch numerate
   end

   inline: rtfiles (qMKT nYYYMMDD n --- hFiles) \ real time file names
{     Return the names of real time files at epath for MKT for n days
      ending at YYYMMDD.  Latest file name, the one for YYYMMDD, is
      first.

      If YYYMMDD = 0, these are the n latest files; the latest one may
      be for MKT still open.

      If n is greater than the number of files, all names are returned.

      This is the form of File name: YYYMMDD_MKT.bin.  Date, YYYMMDD,
      and market name, MKT, are part of File name.

      Date YYYMMDD is the day that the electronic MKT closed, or will
      close, some time in the afternoon in Chicago.  MKT had opened
      some time the previous day after pit trading had closed.

      Pit trading for MKT that ran during electronic trading is also
      dated YYYMMDD.  To have the same date for electronic and pit
      trading is the reason electronic trading for MKT is dated on its
      closing day and not its opening day.  This also means that the
      last date when the physical file is written upon is YYYMMDD, so
      the file date should agree.
}
      [ {" (qMKT --- )
           "_MKT." "MKT" rot strp
           epath dirnames (hDir) 
           epath1 dirnames (hDir) 
           epath2 dirnames (hDir) 3 pilen  
           dup rot grepr any?
           IF (hDir hRows) reach no sort \ latest first
              noq_alike
           ELSE (hDir) drop VOL tpurged
           THEN
        "} "rtf" macro
      ]
      "n" book "YYY" book uppercase dup "MKT" book
      (qMKT) rtf dup rows 0>
      IF YYY 0=
         IF 1st n other rows min items reach
         ELSE (hFiles) dup push
            (hFiles) 
            "YYY_MKT.bin" "YYY" YYY intstr strp "MKT" MKT strp (qS)
            (hFILES qS) grepe any?
            IF @ "r" book
               peek r n peek rows r - 1+ min items reach
            ELSE " rtfiles: file for " YYY intstr + " not found" + . nl
               VOL tpurged
            THEN
            pull (hFiles) drop
         THEN
      ELSE (hPurged) \ " rtfiles: no files found" . nl
      THEN
   end

   inline: rtget (qFile --- hPurged or hA ht) \ A(t) from File
{     Tue Nov 29 10:49:13 PST 2011.  Revised for nine columns in D,     
      which occur after 11-29-2011.

      The eight columns of returned A are: 
         1     2     3    4     5    6    7       8
         Open, High, Low, Last, Chg, Vol, OpInt, Settle

      Prices in returned A are perpetual.  Times in t span a single 
      session, which is always less than 24 hours.

      Returned t is relative to session start (so if the market opened
      after session start, the first value in t will not be zero) and 
      runs until the market closes for the session.  (See notes "What 
      is a session?" in the Appendix of file mobius.n.)

      This is the form of File name: YYYMMDD_MKT.bin.  Date, YYYMMDD,
      and market name, MKT, are part of File name.  No path accompanies
      incoming File name, and history files at epath0, epath1 and epath2
      are read and data combined and sifted for only unique records.
}
      (qFile) -path "FILE" book
      FILE "_." chblank dup 1st word drop number
      IF "Date" book

         2nd word drop "MKT" book
         MKT "lib" + "LIB" book \ this is the convention for LIB

      ELSE " rtget: " FILE + " is an invalid File name " + . nl
         "Invalid" "Date" book
         purged return
      THEN

      "rtget" ERRset

{     Time of session start.

      Date just booked above is from the file name, and files are given
      dates for the day a session ends (the last day the file is written
      upon).

      The day a session starts is the day before Date at the machine 
      time given by session_start which is booked here into SStart:
}     Date session_start (sec) "SStart" book
 
    \ We now have the factors to make the real time quotes H, L, C
    \ into perpetual, which will be done below.

      "rtget calling HGET" ERRset

    \ Get history data for SStart+elec_open and times later from FILE:
      SStart MKT elec_open + \ market open time
      MKT FILE HGET (hD)     \ collected history data

      ERR \ end "rtget calling HGET"

      (hD) dup rows 0> (f)
      IF "rtget process HGET(D)" ERRset
\        Rows of D are already in ascending time order, and column 6 
\        of D contains machine time.

         (hD) "D" book
         Date pit_indices (r k) "k" book "r" book
{
         Convert history prices in D into perpetual, the only valid
         form for doing math on the numbers since D values can be
         strings like 11428 for US (114 and 28/32) or 8646 for W (864 
         and 6/8).  In addition, perpetual prices account for the dif-
         ference between contract months using offset rolldelta.

         Conversion to perpetual is done by computing scaled data to 
         resolve the problem of 32nds and 8ths and to make integer 
         values where a change of 1 equals the minimum price change
         (tic), and then a correspondingly scaled rolldelta is added 
         to align different contract months:
}        D 1st 4 items catch (hA)     \ O, H, L, C
         (hA) Cqs                     \ A into scaled data for Name
         LIB "ROLLDELTA" yank r pry + \ plus rolldelta for date r
         (hA)

       \ Park to A the price change from D; it is only scaled, because
       \ rolldelta does not apply to changes:
         (hA) D 5 ndx catch Cqs (hA hdC) park (hA) \ scaled for Name

         "hget1" "Dcols" yank 6 = 
         IF (hA) D 6 ndx catch (hA ht) park (hD) \ time in col 6 again

         ELSE (hA) \ Tue Nov 29 10:34:13 PST 2011.  After 11-29-2011,
          \ D has nine columns instead of six:
            D 6 catch \ Vol
            D 7 catch \ OpInt
          \ Apply same scaling as on D(1:4) above: 
            D 8 catch (hP) Cqs LIB "ROLLDELTA" yank r pry + \ Settle
            D 9 catch (hGMT)
            (hA hVol hOpInt hSettle hGMT) 5 parkn (hD)
         THEN
         (hD)

       \ Scrub D; rows of D are already in ascending time order, and 
       \ the last column of D contains machine time:
         (hD) rtscrub (hD htUTC) dup rows any
         IF (hD htUTC) 
          \ Subtract session start from t.  Some markets open after 
          \ session start, and so the first time in new t will not be 
          \ zero:
            (htUTC) SStart - (ht) \ quote times (sec) relative to SStart

            (hA ht)
         ELSE " rtget: no data from rtscrub, halting" . nl HALT
         THEN

         ERR \ end "rtget process HGET(D)"
         (hA ht)

      ELSE 
       \ If past session start, return the close of the previous pit
       \ session, with zero change.  Return time equal to zero, which
       \ means session start:
         drop time SStart > 
         IF LIB "C" yank (hC) r pry 4 clone 0 park (hA)
            0 hand (ht)
         ELSE purged
         THEN
      THEN
      (hA ht)

      purged "D" book
      purged "t" book

      ERR \ end "rtget"
   end

   inline: rtscrub (hD --- hA ht) \ clean up real time data in D
{     Tue Nov 29 10:49:13 PST 2011.  Revised for nine columns in D,     
      which occur after 11-29-2011.

      The incoming nine columns of D are: 
         Open, High, Low, Last, Chg, Vol, OpInt, Settle, time

      The eight columns of returned A are: 
         1     2     3    4     5    6    7       8
         Open, High, Low, Last, Chg, Vol, OpInt, Settle

      Returned time, t, is UTC stored by the collector.

      Rows with nonpositive entries for O, H, L, C have already been 
      removed in HGET().
}
      (hD) dup rows 0>
      IF "D" book
         D 1st catch (hO) \ leave O as-is
         D rows "NT" book
         D 2nd catch (hH) NT mmax dup push \ moving max over all rows
         D 3rd catch (hL) NT mmin dup push \ moving min over all rows
         D 4th catch (hC)

       \ Clip C.
       \ December 2008: for crisp definitions of support and resistance
       \ levels using L and H, constrain C to never equal L or H by 
       \ adding or subtracting one tic.  This assumes L never equals H,
       \ which is very likely due to mmin and mmax above:
         (hO hH hL hC) pull (hL) 1+ max (hC) \ C never equals L
         (hO hH hL hC) pull (hH) 1- min (hC) \ C never equals H

       \ Fetch Chg:
         (hO hH hL hC) D 5 ndx catch (hChg)

         (hO hH hL hC hChg) 5 parkn (hA)

       \ November 2011: After 11-29-2011, D has hget1.Dcols = 9 columns:
         "hget1" "Dcols" yank 6 = 
         IF (hA) D 6 ndx catch (ht)
         ELSE (hA) D 6 ndx 3 items catch (hA hD) park (hA)
            (hA) D 9 ndx catch (ht)
         THEN

         purged "D" book

         (hA ht)
      ELSE (hPurged) purged (hA ht)
      THEN
   end

   inline: rtsetup (qMKT nYYYMMDD n --- hFiles) \ get MKT files to use
\     Incoming YYYMMDD is the latest of n days to graph for MKT; if
\     YYYMMDD = 0, the very latest n days are processed.
      [ 20 "N" book ] \ minimum data points when n = 1
      "rtsetup" ERRset

      (qMKT YYYMMDD n) "n" book "YYYMMDD" book
      (qMKT) uppercase "MKT" book

      tracklist MKT tracklist chars blpad grepr rows any not
      IF " rtsetup: market " MKT + " not found" + . nl ERR return THEN

      MKT YYYMMDD n rtfiles (hFiles) any? not
      IF " rtsetup: no files for market " MKT + . nl ERR return THEN

      ERR
      "rtsetup for " MKT + ERRset

      (hFiles)
      n 1 = \ if less than N data points for n = 1, add another day
      IF (hFiles) epath (hFiles) over + hget1 rows N <
         IF (hFiles) drop 1 n bump MKT YYYMMDD n rtfiles (hFiles) THEN
      THEN
      (hFiles) reversed (hFiles) \ reorder Files to latest last
      ERR
   end

   inline: time_vec (ht --- ht1) \ bump equal values in sorted vector
{     This word can be used for any sorted vector t to ensure no equal 
      values.

      Incoming times in t are to the nearest second, and are in ascend-
      ing order.

      But some times may be equal, making processing difficult.  For
      instance, sorting prices by time gets squirrelly (see the prob-
      lem described in word hget1), and duplicates in an interpolation 
      vector produce garbage.

      Make times unequal by adding DIFF seconds to one of two adjacent
      values that are equal.

      Just bumping a value may get it above its equal partner but might
      make it equal to the neighbor above.  And more than two might be 
      equal in the first place, so bumping just one is not enough.

      The BEGIN ... WHILE ... REPEAT loop below cycles around until
      no values are equal.  It uses the fact that a true flag equals
      -1, so "(-1) abs X *" gives the add-on 1*X needed when the match
      flag is true, and zero when the flag is false (equal to 0).  Lags
      of +1 and -1 are used to align the rows going up and down.

      Example:

         List A has four equal values:

            [tops@plunger] ready > list: 0 0 0 0 1 ; "A" book A bend .m
             Row 1:        0        0        0        0        1

         This word keeps adding DIFF until no values are equal.  When 
         DIFF = 0.01, this new A is obtained:

            [tops@plunger] ready > A time_vec (hAnew) bend .m
             Row 1:        0     0.01     0.02     0.03        1 
}
      [ 0.01 "DIFF" book ]
      (ht)
\     Temporarily affix a large negative number so the loop will
\     not fail (and run forever) on bad data, such as all values
\     being equal:
      (ht) -INF swap pile (ht) \ append t to a negative number

      "time_vec" ERRset

      (ht)
      BEGIN (ht) dup dup -1 lag - 0= (hf)
         (hf) abs DIFF * (hDT) 1 lag
         (ht hDT) dup totals @ 0<> (f)
      WHILE (ht hDT) + (ht)
      REPEAT (ht hDT) drop

      2nd over rows 1- items reach (ht) \ remove the negative number
      ERR
   end
 
   inline: tSESS (tg --- tsess) \ session start time that precedes tg
{     Thu Jul 22 10:43:26 PDT 2010

      Example (run in real time at the electronic market console):
         This shows that the session at 10:49 PDT on Thursday started
         on Wednesday at 15:00 PDT:
            % time dup ctime . nl (time) tg tSESS tm ctime . nl
            Thu Jul 22 10:49:36 PDT 2010
            Wed Jul 21 15:00:00 PDT 2010
}
      @ (nt) "daysget" "tSESS" yank dup rot bsearch drop pry
   end

   inline: ycreate (YYY --- ) \ for YYY, create file for all mKey ids
\     Uses table mKey in main library to obtain ids.  Word yput, called
\     below, also uses table mKey and makes it part of the file created.

      [ "mKey" missing IF " ycreate: require mKey table" . HALT THEN
         PDP_ENDIAN is etype \ endian type of words on binary mfile
         no "silent" book
      ]
      (YYY) yearfix, scalar "mfile" book

      dup (YYY) yname (qFile) silent not 
      IF (qFile) dup " ycreate: making " . . nl THEN
      (qFile) dup deleteif new (qFile) "mfile" binary file

      (YYY) dup yload (hY)

    \ Bank data into word ydata:
      (YYY hY) "ydata" "Ydata" bank 
      (YYY) "ydata" "Year" bank

      mKey rows 1st
      DO mKey I reach key.id pry (id) any?
         IF (id) mfile swap yput THEN
      LOOP

    \ Free the large array banked into ydata:
      freed "ydata" "Ydata" bank

{     Write the key that yget will use (this took a while to under-
      stand again in 2008: word list: simply puts the output of 
      fwrite into a 2x1 vector and leaves its handle on the stack; 
      what's being called ptr is the offset to the trailer just
      written, and len is its length):
}     list: mKey etype export8 mfile fwrite ; (h[ptr, len])

    \ Write the address (ptr and len) used to fetch the key that 
    \ yget will use:
      (h[ptr, len]) etype export4 mfile fwrite 2drop

      mfile fclose
   end

   inline: ydata (id --- hY) \ data for id 
{     Year and quarterly data for Year, Ydata, must be banked into 
      the library of this word before it is run.

      Returned Y has a row for every workday (Mon - Fri) of Year.
      Days in Y where there was no data contain nulls.

      Rows in Y for workdays correspond to workdays for Year con-
      tained in file cal.bin and fetched below by word ydays.
}
      [ " ydata: require 'Year', and associated data in Ydata" 
        says msg 
      ]
      "Year" local? not 
      IF msg . nl drop no no null return THEN

      Ydata these mdata.kom catch
      10000 /f integer (Ydata Y(id) ) \ ids of all
      rot (id) those rows once fill less rake drop \ data for id
      these rows any
      IF Year ydays those mdata.date catch
         100 /f integer rowsto \ vector to tier with
         these totals ontop, them cols null \ null inactive days
         swap tier \ data with zeroes at inactive days
      THEN
   end

   inline: yget (id YYY --- hY qS qU) \ fetch data written by ycreate
{     This word gets scaled data for id from binary file for year YYY.

      Revised May 2011 to include pit volume and open interest that 
      follow the last electronic price, eC (see yput()).

      The 12 columns returned in Y contain: 
         delmo Chg O H L C eO eH eL eC V I

      The rows of Y match the dates returned by the phrase YYY ydays.

      Binary data was stored as two or four byte integers by word yput 
      as follows:
         delmo Chg O H L C eO eH eL eC V I

      Returned strings qS and qU are the names of the scale and unscale
      functions, respectively, to use with this id.
}
      [ "ycreate" "etype" yank "etype" book \ endian type of mfile
        ten "Ycols" book
      ]
      yearfix
      scalar "mfile" book \ file handle looks closed on re-entry

      this yname "mfile" old binary file
      mfile filetrue not IF return THEN 

      "yget" ERRset

      (id YYY) weekdays "Rows" book

    \ Fetching the ptr to key and key len in last 8 bytes:
      mfile again this file.size pry, eight less fseek, eight fget
      etype import4 (hKey)

      mfile that 1st pry (ptr) fseek \ seek to key ptr
      mfile swap 2nd pry (len) fget etype import8 \ get key bytes
      mKey rows fold "Key" book \ mKey when created (by ycreate)

      (id) Key key.id catch, that (id) bsearch not
      IF 2drop mfile fclose, zero Ycols null "" "" ERR return \ no data
      THEN "Krow" book (id) drop

      mfile Key Krow key.ptr fetch (ptr) fseek
      mfile Key Krow key.len fetch (len) its 0= 
      IF drop fclose, zero Ycols null "" "" ERR return \ no data
      THEN fget into Data mfile fclose

      Data 1st Rows two (bytes) star items those cows teeth
      claw (hInt2 hData) into Data, (hInt2) etype import2 (hDelmo)

      Data 1st Rows two (bytes) star items those cows teeth
      claw (hInt2 hData) into Data, (hInt2) etype import2 (hChg) 

    \ Eight columns of price data, that may be incoming as 2- or 4-byte
    \ integers:
      Data 1st Rows eight (columns) *
      Key Krow key.size fetch two = (f) dup push
      IF two ELSE four THEN (bytes/col) * items (hChars)
      (hData hChars) those cows teeth claw (hInt hData) into Data
      (hInt) etype pull (f) IF uimport2 ELSE import4 THEN
      Rows foldr ([O,H,L,C,eO,eH,eL,eC])

    \ Four-byte volume:
      Data 1st Rows four (bytes) star items those cows teeth
      claw (hInt4 hData) into Data, (hInt4) etype import4 (hVol)

    \ Four-byte open interest:
      Data 1st Rows four (bytes) star items those cows teeth
      claw (hInt4 hData) into Data, (hInt4) etype import4 (hOI)

    \ At this point, Data has no columns.
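    \ A worked count of the bytes consumed from Data above, for a
    \ hypothetical year with Rows = 260:
    \    delmo  260 x 2      =   520
    \    Chg    260 x 2      =   520
    \    prices 260 x 8 x 2  =  4160  (or 260 x 8 x 4 = 8320)
    \    vol    260 x 4      =  1040
    \    int    260 x 4      =  1040
    \    total               =  7280  (or 11440 with 4-byte prices)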

      (hDelmo hChg hOHLC hVol hOI) five parkn (hY)
      "_yget." Key Krow key.id fetch suffix naming (hY)

      Key Krow key.scale fetch num2str notrailing (qS)
      Key Krow key.unscale fetch num2str notrailing (qU)

      ERR
   end

   inline: yload (YYY --- ) \ load quarterly files for year YYY
{     This loads data from original text files--made this way since 
      1978 when Marylynn and I went to CDC (Control Data Corporation,
      near LAX) and keypunched a batch of prices from the Wall Street
      Journal onto cards, then made a tape and dragged it to USC and
      put it on their PDP-10.  Access to USC by phone over the ArpaNet
      was possible to outside customers at $8 (prime), $4 (night) and
      $2 (graveyard) per CPU minute.  I worked a lot of night and 
      graveyard hours writing Fortran programs to analyze that data.  
      My teletype terminal over the phone (stick the receiver into 
      those rubber cups) ran at about 30 characters per second.

      This word reads the quarterly data files for YYY and banks the 
      result into the library of word ydata.

      The data read has the following column structure given by struct 
      mdata:
         kom date open high low close chg eopen ehigh elow eclose

      If the year is incomplete, the matrix contains data through
      the latest partial quarterly file.

      Update Sat May 21 15:57:34 PDT 2011.  Pit volume and open interest
         are added (they were removed in January 2008).  Quarterly files
         for 2011 have been revised to include this data in new columns
         12 and 13 (see Appendix of file eod.v):

            kom date open high low close chg eopen ehigh elow eclose 
               vol int

         For years before 2011, zeroes for vol and int are stored by 
         this word.  Volume and open interest data is in all the old
         daily .dat files, and the procedure in the Appendix of eod.v 
         could be run on them to recover vol and int for years before
         2011.
}
      [ \ This is the struct of old files (before 2008):
        \    kom date open high low close vol int chg
        \ Do not return the old file columns vol and int:
             list: 1 6 thru, 9 ; "use_old" book

        \ This fetches [open, high, low, close] from struct mdata:
             list: mdata.open, mdata.high, mdata.low, mdata.close ;
             (hList) "use_pit" book

          0 0 park "VOL+OI" book

      ]
      no "2008fix" book
      no "old_files" book

      (YYY) yearfix "YYY" book

      YYY 108 < \ read old column size if before 2008
      IF "old_mdata.sizeof" main "read_cols" book 
         yes "old_files" book
      ELSE 
         YYY 110 > \ read two fewer columns if 2010 or earlier
         IF "mdata.sizeof" main "read_cols" book
         ELSE "mdata.sizeof" main 2 - "read_cols" book
            yes "2008fix" book 
         THEN
      THEN 

      depth push

      four 1st
      DO (hY) YYY I qfile this file?
         IF (qFile) read_cols asciiread (hY) 2008fix 
            IF VOL+OI over rows repeat park THEN \ null VOL and OI
         ELSE (qFile) drop 
         THEN
      LOOP 

      depth pull less pilen

      old_files
      IF (hY) use_old ndx catch (hP)  \ use_old columns
         dup use_pit catch (hE)       \ repeat pit for electronic cols
         VOL+OI over rows repeat (hV) \ null VOL and OI
         (hP hE hV) 3 parkn (hY)
      THEN (hY)
   end

   inline: yname (n --- qS) \ binary file name for year n, 1900-based 
\     Names like: mdata99.bin, mdata00.bin, mdata02.bin, ... .
\     Returned S includes the path.
      yearfix "mdata" swap this 99 > IF 100 less THEN this ten <
      IF int "0%d" ELSE int "%d" THEN format 
      ".bin" cat cat, "mpath" main swap cat 1st quote
   end

   inline: yput (hFile id --- ) \ putting data for id onto file
{     These 10 columns of data from word ydata for id are written to 
      File by this word: 
         delmo Chg O H L C eO eH eL eC
      where symbols for electronic data begin with letter e (May 2011:
      revised for 12 columns of data; see below).

      Before writing to File, price data is scaled into integers.  

      Columns of delmo and Chg are written as 2-byte integers.  If all 
      values are less than MAX2, columns O, H, L, C, eO, eH, eL and eC
      are written as 2-byte integers; otherwise, they are written as
      4-byte integers.
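
      As an illustration only, the width choice described above can be
      sketched in Python.  This is not the code of this word: the
      function name pack_prices and the multiplicative scale factor are
      assumptions made for the sketch, and the real scaling comes from
      key.scale in mKey.  Prices are assumed to be non-negative and the
      list non-empty.

```python
import struct

MAX2 = 2 ** 16  # same limit as MAX2 in this word: 2-byte values must be below 65536


def pack_prices(prices, scale, little_endian=True):
    """Scale prices to integers and pack them as unsigned 2-byte or 4-byte values.

    Returns (bytes_per_value, packed_bytes).  The multiplicative scale is a
    hypothetical stand-in for the scaling taken from key.scale.
    """
    ints = [int(round(p * scale)) for p in prices]
    # Use 2 bytes per value only if every scaled value fits below MAX2.
    size = 2 if max(ints) < MAX2 else 4
    code = "H" if size == 2 else "I"
    order = "<" if little_endian else ">"
    return size, struct.pack(order + str(len(ints)) + code, *ints)
```

      The returned size plays the role of key.size: a reader needs it
      to know whether to unpack 2-byte or 4-byte values.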

      The date is not stored on File.  Since there is a row for every
      workday of the year (null rows on holidays), the dates are known 
      from calendar function ydays (file cal.v).

      Pointer and length for this id in main array mKey are updated
      when the data is written, and when data for all markets has been 
      written, mKey is appended to the file for use later to retrieve 
      the data for any market (see word ycreate where mKey is written 
      and word yget where it is read).

      Update Sat May 21 15:57:34 PDT 2011.  Pit volume and open interest
         are added (they were removed in January 2008).  Quarterly files
         for 2011 have been revised to include this data in new columns
         12 and 13 (see Appendix of file eod.v).  Output from this word
         now contains these 12 columns of data:

            delmo Chg O H L C eO eH eL eC V I

         For files before 2011, V and I are zero (but see note in word
         yload regarding years before 2011).
}
      [ "ycreate" "etype" yank "etype" book \ endian type for hFile

        list: mdata.open  mdata.high  mdata.low  mdata.close
              mdata.eopen mdata.ehigh mdata.elow mdata.eclose
        end "mdata.prices" book

        list: mdata.vol mdata.int ; "mdata.vol+oi" book

        2 16 pow "MAX2" book \ less than 65536 for 2 byte integer
      ]
      mKey key.id catch, over bsearch not
      IF drop " yput: id not found:" . lop .i return THEN makes Krow

    \ Zero ptr, len and size in table mKey:
      zero mKey Krow key.ptr store
      zero mKey Krow key.len store
      zero mKey Krow key.size store

      (id) ydata (hY) dup rows 0= IF (hFile hY) 2drop return THEN
      (hFile hY) "Y" book

    \ Delivery month:
      Y mdata.kom catch 10000 those rows one fill /mod drop (delmo) 
      etype export2 (hD)

    \ Pit Chg:
      Y mdata.chg catch (chg)
      (chg) mKey Krow key.scale fetch num2str main (hA) \ into integer
      (hA) etype export2 (hChg)

    \ Pit and electronic prices (eight prices, unsigned):
      Y mdata.prices catch ([O,H,L,C,eO,eH,eL,eC]) 
      (hA) mKey Krow key.scale fetch num2str main (hA) \ into integer

      (hA) chain dup maxfetch 2drop (n) MAX2 <
      IF \ store 2-byte integers
         two mKey Krow key.size store \ set 2 bytes in key table 
         (hA) etype export2 (hOHLC)
      ELSE \ store 4-byte integers
         four mKey Krow key.size store \ set 4 bytes in key table 
         (hA) etype export4 (hOHLC)
      THEN (hOHLC)

    \ Pit volume and open interest:
      Y mdata.vol+oi catch ([VOL OI]) chain (hV)
      (hV) etype export4 (hV)

    \ Combine and write to file:
      (hD hChg hOHLC hV) four parkn (hA)
      (hFile hA) swap fwrite (toptr tolen)

    \ Store the file ptr and data len into mKey, to be used later to
    \ retrieve this market (Krow):
      (tolen) mKey Krow key.len store
      (toptr) mKey Krow key.ptr store

      purged is Y
   end

\-----------------------------------------------------------------------

{  Words for animation.

   The following words work with the words for real time data in this 
   file, which in turn require electronic console words from mobius.n 
   to be present.

   Fri Apr 22 17:19:36 PDT 2011.  The index for once-a-day data like 
      pit close has been wrong during replay, but it was not noticed 
      until the previous pit close was included in the model.  New 
      word pit_indices() gets it right for use by rtget() and dayget(),
      and new word replay_set_date() is called by replay_start_time()
      and replay_step() to set dayget.Date to the date during replay.
      
   Wed Jan 26 19:44:40 PST 2011.  Macro daysget.extend used in word
      replay_step() did not go far enough in preparing the model in
      the current session being replayed.  This produced haywire 
      graphs for reversed time positions being made in daysmodel().

      Since daysget.extend was built from inline phrases in daysget(),
      it was a simple matter to extend the macro by using more of the
      phrases that follow it to include a necessary call to daymodel().
}
   inline: r ( --- ) REPLAY ; \ shortcut to use at the console % prompt

   inline: REPLAY ( --- ) \ running replay
{     Sat Jul 17 05:11:25 PDT 2010

      This word can be stopped and started to resume where it left off.
      But running word ...() to start real time processing again will 
      cause replay_end() to be run and settings will be lost.

      Getting ASCII character values for the letter-key table (called
      key numbers here, but key numbers (the numbers of keys) are 
      actually different; see word KR in file plot.v and the utility
      in the appendix of sys/ukey.v for getting actual key numbers):

         [tops@plunger] ready > "getch dup push pile peek emit \
            pull NL =" "dokey" inlinex, 0 1 null BEGIN dokey UNTIL \

         abcdefghijklmnopqrstuvwxyzHJKL

          stack elements:
                0 matrix: _pile  27 by 1
          [1] ok!
         [tops@plunger] ready > .i
          Column 1:
              97     98     99    100    101    102    103    104
             105    106    107    108    109    110    111    112
             113    114    115    116    117    118    119    120
             121    122     72     74     75     76    10
}
      [ {" Table of letter, key number pairs:
           a 97  b 98  c 99  d 100 e 101 f 102 g 103 h 104 i 105
           j 106 k 107 l 108 m 109 n 110 o 111 p 112 q 113 r 114
           s 115 t 116 u 117 v 118 w 119 x 120 y 121 z 122
           H 72  J  74 K  75 L 76
        "} words dice (hA hN) numerate swap vol2mat bend park
           yes sort "LTABLE" book

        {" (nKey --- qLetter) \ fetch Letter for Key from LTABLE
           (nKey) LTABLE swap bsearch (r f)
           IF (r) LTABLE swap 2nd fetch num2str strchop (qLetter)
           ELSE (r) drop "" \ no LTABLE entry, return empty string
           THEN (qLetter)
        "} "LKEY" macro

        no "KB_LOCKED" book \ yes while this word controls keyboard
        no "BUSY" book      \ yes while a macro is running

      \ Step sizes (times shown are for sample rate = 3 minutes; sample
      \ rate is set in word rdech()):
         1 "ST1" book \ 3 minutes
         5 "ST2" book \ 15 minutes
        20 "ST3" book \ 1 hour
        80 "ST4" book \ 4 hours

        0 "NSTEP" book \ current number of steps from start time

        "'replay_init' 't1' yank UDEF <>" "OK" macro

{       The following macro is run from outside by word process_key() 
        in file mobius.n whenever a key pressed in the graphics window
        needs to be processed.  

        Here we don't want to process the key, but just want to show 
        the prompt again so it is the latest line in the text display.

        See word auto() in mobius.n for an example where the key is 
        processed, and note the macro's different placement relative
        to getcht2().

        Update.  Initially this macro just dropped nt and nK and ran 
        show_prompt():
           "(nt nK) 2drop show_prompt" "process_key" macro

        But running auto.process_key works fine, and is handy for 
        changing the displayed curves while running replay, so instead
        of 2drop(), run macro auto.process_key() and then show_prompt().

        Now during replay, just move the mouse cursor into the graphics
        region and press a key to get the program to come here and then
        zip back and run auto.process_key():
}       "(nt nK) 'auto' 'process_key' localrun show_prompt" 
        "process_key" macro

        {"
           BUSY IF return THEN

           out 0=
           IF OK \ show keys that can be used:
              IF "replay keys: Esc r h j k l "
              ELSE "replay keys: Esc r "
              THEN (qS) .
           THEN

        "} "show_prompt" macro

      \ Key letter macros:

        "OK IF ST4 NSTEP incr NSTEP replay_step THEN" "L" macro \ >>>>

        "OK IF ST3 NSTEP incr NSTEP replay_step THEN" "l" macro \ >>>

        "OK IF ST2 NSTEP incr NSTEP replay_step THEN" "K" macro \ >>

        "OK IF ST1 NSTEP incr NSTEP replay_step THEN" "k" macro \ >

        "OK IF NSTEP ST1 - 0 max 'NSTEP' book NSTEP 0= "
        "IF beep THEN NSTEP replay_step THEN" + "j" macro \ <

        "OK IF NSTEP ST2 - 0 max 'NSTEP' book NSTEP 0= "
        "IF beep THEN NSTEP replay_step THEN" + "J" macro \ <<

        "OK IF NSTEP ST3 - 0 max 'NSTEP' book NSTEP 0= "
        "IF beep THEN NSTEP replay_step THEN" + "h" macro \ <<<

        "OK IF NSTEP ST4 - 0 max 'NSTEP' book NSTEP 0= "
        "IF beep THEN NSTEP replay_step THEN" + "H" macro \ <<<<

        "replay_start_time (nt) replay_init 0 'NSTEP' book "

\ Wed Jan  4 11:30:52 PST 2012.  A reference to sigmodel() should
\ not be in such a basic function.  Remove it:
\       '"sigmodel" exists? IF no "sigmodel" "replay" bank THEN ' +

        "0 replay_step" + "r" macro
      ]
      Mkt chars 0> not IF return THEN

      "REPLAY" ERRset

      yes "KB_LOCKED" book
      yes "replay_step" "SHOW" bank
      no "BUSY" book

      OK not IF yes "BUSY" book r no "BUSY" book THEN

      BEGIN
         "REPLAY loop" ERRset
         show_prompt
         getcht2 (ntime nC) \ sit here waiting for a key
         "nc" book (ntime) drop
         nc BL >
         IF nc LKEY (qKey) dup local? (qS f)
            IF (qS) yes "BUSY" book (qS) local no "BUSY" book
            ELSE (qS) drop
            THEN
         THEN
         nc ESC? (f) IF yes ELSE out cr0 no THEN
         ERR
      UNTIL

      no "KB_LOCKED" book

\ Wed Jan  4 11:30:52 PST 2012.  A reference to sigmodel() should
\ not be in such a basic function.  Remove it:
\     "sigmodel" exists? IF no "sigmodel" "replay" bank THEN

      out cr0
      ERR
   end

   inline: replay_end ( --- ) \ restore the system to real time ops
    \ Fri Jul 16 11:43:13 PDT 2010

      "replay_end" ERRset

    \ Clean up from replay:
      purged "daysget" "t" bank \ discard t created by add_model
      purged "replay_init" "Psess" bank
      UDEF "replay_init" "t1" bank

    \ Put real time items back the way they need to be:
      replay_reset

      ERR
   end

   inline: replay_hours (n --- ) \ run replay for the last n hours
   \ Sun Nov  7 13:42:19 PST 2010

      [ 1 "ST" book yes "SHOW" book ]

      SHOW "replay_step" "SHOW" bank

   \  Mimic replay_start_time:
      "daysget" "tnow" yank "TNOW" book
      TNOW "replay_start_time" "tnow" bank
      replay_tvec (ht) TNOW (ht tnow) pry
      "replay_start_time" "tLatest" bank

   \  Run replay_init:
      replay_tvec (n ht) TNOW rot (n)
      (TNOW n) rdech 1+ - (r1) pry tm (nt1) replay_init

   \  Run a replay_step loop:
      0 "REPLAY" "NSTEP" bank
      0 "replay_step" "n" bank
      BEGIN
         "REPLAY" "NSTEP" yank replay_step
         ST "REPLAY" "NSTEP" localrun incr
         "replay_step" "Rstep" yank TNOW >=
      UNTIL
   end

   inline: replay_init (nt1 --- ) \ initialize for replay at start t1
{     Wed Jul 14 18:37:19 PDT 2010

      The session for replay contains incoming replay start time t1.

      Curves that apply to the replay period replace real time arrays
      daysget.P and daysget.PZ.  The former are saved, and are put back
      when the real time console resumes. 
 
      Curves for the entire session being replayed are booked into the
      local matrix Psess.  Then just the portion up to the current step,
      Rstep (see below), is used by word replay_step() to generate the
      curves for the next replay step.

      Fetching data from the library of this word:
         Graph time of replay start: replay_init.t1
         Matrix P: replay_init.Pmat \ from Earliest to Latest
         Matrix PZ: replay_init.PZmat \ from Earliest to Rsess

      How times align (names for row numbers are shown; adjacent rows
      are separated by SEC seconds):

         Earliest   Rsess     Rstart    Rstep       Rend2     Latest
         |          |         |         |           |              |
                   |
               Rend1

         Earliest is the very first time point in all the data.
         Rend1 is the session end preceding the one being replayed.
         Rsess is start of the session being replayed.
         Rstart is start of replay.
         Rstep is the current step being played; Rstep = Rstart+n. 
         Rend2 is the end of the session being replayed.
         Latest is the end of NADD future sessions (probably two)
            that follow Rend2.
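
      The row arithmetic behind this timeline can be sketched in Python
      as follows.  It mirrors the relations used in the code of
      replay_init() and replay_step() (SESSrows = 86400/SEC,
      Rsess = Rend1+1, Rend2 = Rend1+SESSrows, Rstep = Rstart+n); the
      function name replay_rows and the default of 180 seconds are
      assumptions for the sketch.

```python
def replay_rows(rend1, rstart, n, sec=180):
    """Return row numbers (Rsess, Rstep, Rend2) for replay step n."""
    sess_rows = 86400 // sec      # rows per session, as SESSrows is computed
    rsess = rend1 + 1             # session being replayed starts after Rend1
    rend2 = rend1 + sess_rows     # last row of the session being replayed
    rstep = rstart + n            # row of the current step
    return rsess, rstep, rend2
```

      For example, with Rend1 = 1000, Rstart = 1200 and n = 5 at a
      sample rate of 180 seconds, a session has 480 rows and the sketch
      gives Rsess = 1001, Rstep = 1205 and Rend2 = 1480.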
}
      [ UDEF "t1" book
        "rdecimate" "SEC" yank "SEC" book \ time step (seconds)
        86400 SEC / "SESSrows" book \ rows per session
        "daysget" "ADDrows" yank "ADDrows" book

        "'daysget' 'P' localrun" "Pmat" macro   \ P, all cols
        "'daysget' 'PZ' localrun" "PZmat" macro \ PZ, initial rows, cols
      ]
      "replay_init" ERRset

    \ Save items needed upon return to real time console:
      replay_reset
      Pmat (hP, real time) "Psave" fbook
      PZmat (hPZ, real time) "PZsave" fbook

      (nt1) tg "t1" book \ turn incoming machine time into graph time

    \ Find the row corresponding to t1:
      replay_tvec t1 bsearch (r f) drop "Rstart" book
      Rstart "replay_step" "Rstart" bank 

    \ "Rstart: n, date" . Rstart .i sp 
    \  replay_tvec Rstart pry tm ctime . nl

    \ Find the ending time for the session that precedes the one that 
    \ contains t1, using vector of start times daysget.tSESS:
      replay_tvec (ht) "daysget" "tSESS" yank (hSess)
      (ht hSess) dup t1 bsearch (ht r f) drop (ht r) pry SEC - (n)

      (ht n) (ht n) bsearch (r f) drop (r)
      (r) "Rend1" book              \ row of previous session end
      Rend1 "replay_step" "Rend1" bank 

      Rend1 SESSrows + "Rend2" book \ row of current session end 
      Rend2 "replay_step" "Rend2" bank 

    \ "Rend1: n, date" . Rend1 .i sp replay_tvec Rend1 pry tm ctime . nl
    \ "Rend2: n, date" . Rend2 .i sp replay_tvec Rend2 pry tm ctime . nl

    \ Override P and PZ with rows resized for replay and saved like 
    \ daysget() does; these are used by daysget.add_model:
      Pmat (hP) 1st Rend1 ADDrows + (Latest) items reach (hP)
      (hP) "P" fbookX \ as done in daysget

      PZmat (hPZ) 1st Rend1 items reach (hPZ)
      (hPZ) "PZ" fbookX \ as done in daysget

    \ From P, fetch the submatrix from Rsess to Rend2 with columns
    \ equal to the columns of PZ.  This matrix will be used over and
    \ over by replay_step:
      Pmat Rend1 1+ SESSrows items reach (hPsess)
      (hPsess) 1st PZmat cols items catch (hPsess) "Psess" book

    \ The following makes the console driver run replay_end() when real
    \ time resumes, and overwritten P and PZ used during replay will be
    \ replaced by Psave and PZsave containing former real time data:
      yes "..." "REPLAY" bank \ set flag in console driver

    \ Wed Dec 28 21:55:36 PST 2011.
      -1 "daysfill" "rprev" bank   \ makes daysfill recalculate latest
      true "daysfill" "never" bank \ needed for full recalculation

      ERR
   end

   inline: replay_reset ( --- ) \ reset real time matrices 
    \ Sun Jul 18 19:59:47 PDT 2010

    \ This word is run by replay_end() when returning to real time
    \ processing, and by replay_init() when initializing a session.

      "replay_reset" ERRset

    \ Put real time back in order:
      "auto" "tnow" yank "daysget" "tnow" bank

      "Psave" missing not
      IF Psave any?
         IF (hP) "P" fbookX \ as done in daysget
            purged "Psave" fbook
         THEN
      THEN

      "PZsave" missing not
      IF PZsave any?
         IF (hPZ) "PZ" fbookX \ as done in daysget
            purged "PZsave" fbook
         THEN
      THEN

      ERR
   end

   inline: replay_set_date (tm --- ) \ set dayget.Date for replay
{     Fri Apr 22 13:47:23 PDT 2011

      Set dayget.Date to the date corresponding to machine time tm, for 
      use by macro dayget.extend during replay. 

      When the system is put back into real time collection (by running
      ...), an unidentified function along the way returns dayget.Date
      to the current date.
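
      The date rule applied by this word can be sketched in Python as
      shown here.  This is a hedged illustration, not a translation of
      the code: the name session_date is hypothetical, and taking the
      first table date at or after the candidate is an assumption about
      what the table search returns when there is no exact match.

```python
from bisect import bisect_left


def session_date(yyymmdd, hhmmss, session_dates):
    """Pick the session date for a Chicago date and time.

    session_dates is an ascending list of dates in YYYmmdd form.
    """
    # At or after the 17:00 session start, pit data is dated tomorrow.
    if hhmmss >= 170000:
        yyymmdd += 1
    # The candidate may be a weekend, a holiday, or not a real date at all,
    # so take the nearest table entry (assumed here: first one not earlier).
    i = bisect_left(session_dates, yyymmdd)
    return session_dates[min(i, len(session_dates) - 1)]
```

      With a table holding 1110420, 1110421 and 1110425, a time of
      18:00:00 on 1110421 gives candidate 1110422, which is not in the
      table, so the sketch returns 1110425.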
}
      LIB "DATE" yank dup (hD hD) \ table of session dates

    \ Fake out word gmtime to give date and time in Chicago:
      rot (tm) @ (nt) dup CHdiff1 + gmtime (nsec)   
      (nsec) sysdate \ date and time in Chicago

    \ " replay_set_date: YYYmmdd" . over .i " HHMMSS" . dup .i nl

    \ It just ain't as easy as setting a darned date, as the rest of
    \ these lines show.

    \ New sessions start at 17:00 Central time, and pit sessions are
    \ dated tomorrow if it is before midnight (24:00).  So if it is
    \ after session start, but before midnight (when the date will
    \ change), add 1 to the date:
      (hD hD nYYYmmdd nHHMMSS) 170000 >=
      IF (hD hD nYYYmmdd) 1+ THEN \ pit data is dated tomorrow

    \ Must always search table D to get nearest date because tm or 
    \ YYYmmdd+1 may be on a weekend or a holiday:
      (hD hD nYYYmmdd) bsearch (r f) drop (hD r) pry 
      (nYYYmmdd) "dayget" "Date" bank

    \ " replay_set_date: " . "dayget" "Date" yank .i nl
   end

   inline: replay_start_time ( --- t) \ use mouse arrow to get start t
    \ Wed Jul 14 17:55:59 PDT 2010

    \ Returned time is machine time, not graph time.

      "replay_start_time" ERRset

      "auto" "tnow" yank "tnow" book \ row of the present time
      replay_tvec (ht) tnow pry "tLatest" book

      UDEF "csr_str" "X" bank

      COLS cr0
      "replay_start_time: left click on a start point, then " 
      "return the mouse cursor to this window and hit Esc" + 
      (qS) COLS .9 * .out nl

      "Left click on a start point > " . 

      BEGIN 
         getcht2 (ntime nC) "nc" book (ntime) drop
         out cr0
 
         "csr_str" "X" yank (t) dup UDEF <>
         IF (t) tLatest >=
            IF "Start time cannot be in the future" . nl
               UDEF "csr_str" "X" bank
            THEN
         ELSE (t) drop
         THEN

         "csr_str" "X" yank UDEF = (f1)
         "replay_init" "t1" yank UDEF = (f2)
         (f1 f2) and
         IF "Left click on a start point > " . 0 
         ELSE "Left click on a start point or Esc to replay > " . -1 
         THEN (f)

         (f) nc ESC? and (f)
      UNTIL
      "csr_str" "X" yank dup UDEF = 
      IF drop "replay_init" "t1" yank THEN (tg) tm @ (t)

    \ Set index for t in once-a-day data:
      (t) dup replay_set_date

      COLS cr0 "replay_start_time: start time is " . (t) dup ctime . nl

      ERR
   end

   inline: replay_step (n --- ) \ graph P(t) on step n
{     Thu Jul 15 14:29:03 PDT 2010

      The phrases of this word contain all the steps from taking col-
      lected data to graphing the model curves.  It works by setting 
      up arrays for replay just like real time and then running the 
      actual real time words and macros for an accurate simulation.

      Some inline phrases of real time word daysget() were made into
      macros for the purpose of replay, and are called daysget.extend
      and daysget.add_model.  Now they are used both for real time and 
      for replay.

      The current step is at row Rstep = Rstart+n, where Rstep and
      Rstart are shown below.

      Step count n starts at zero, which corresponds to the replay
      start time, called Rstart below.  Each step of one row is 
      SEC seconds (probably 180 seconds; see word rdech()).

      The graph of P(t) made by this word is for the entire timeline 
      shown below, from Earliest to Latest.  It falls short of the 
      real time width of the window since the replay is for a period
      in the past.

      How times align (names for row numbers are shown; adjacent rows
      are separated by SEC seconds):

         Earliest   Rsess     Rstart    Rstep       Rend2     Latest
         |          |         |         |           |              |
                   |          |         |
               Rend1          n=0       n

         Earliest is the very first time point in all the data.
         Rend1 is the session end preceding the one being replayed.
         Rsess is start of the session being replayed.
         Rstart is start of replay.
         Rstep is the current step being played; Rstep = Rstart+n. 
         Rend2 is the end of the session being replayed.
         Latest is the end of NADD future sessions (probably two) 
            that follow Rend2.

      Note that rows Rstep+1 through Latest are in the future, just 
      as in rightmost regions of real time graphs.

      Timing Tue Jul 20 08:20:02 PDT 2010.
      The following shows typical elapsed times in microseconds when 
      this word runs.  Drawing the plot (tplot) takes the most time.

         A period near the present, at the right side of the graph:
            EUU10 12967 08:11 CDT Mon Jul 19, 2010
            replay keys: Esc r h j k l  Top of replay_step 2809641 
             replay_step: dayget session 3577 
             replay_step: append to Earliest 15493 
             replay_step: daysget curves done 208370 
             replay_step: tgraph 67850 
             replay_step: tplot 1363737 

         A period one month earlier, near the middle of the graph:
            EUU10 11921 11:00 CDT Wed Jun 9, 2010
            replay keys: Esc r h j k l  Top of replay_step 51147349 
             replay_step: dayget session 5089 
             replay_step: append to Earliest 19330 
             replay_step: daysget curves done 180447 
             replay_step: tgraph 40775 
             replay_step: tplot 776688 

         A period two months earlier, near the left side of the graph:
            EUU10 12667 19:33 CDT Wed May 12, 2010
            replay keys: Esc r h j k l  Top of replay_step 490495 
             replay_step: dayget session 3586 
             replay_step: append to Earliest 3533 
             replay_step: daysget curves done 119487 
             replay_step: tgraph 21238 
             replay_step: tplot 267129 

      In running tplot, the earliest period is about five times faster 
      than the recent one, which makes a huge difference in the look 
      and feel.  

      These cases show differences in tplot times that are proportional
      to the number of time points to reach the period shown (from right
      to middle to left), but all cases use zoomed windows with about 
      the same number of time points showing.  This suggests that there
      is something within plotting that needs to be sped up.

      Timing update Tue Jul 20 11:50:27 PDT 2010.  
      Investigating possible speed issues with zoomed windows, it was 
      found that when graphing lines in a clip region, all the line
      segments must be scanned because you never know ahead of time
      when one goes outside the region and then comes back in (linec() 
      in src/xterm.c, using function bounds() that looks at X and Y 
      pairs of all line segments).

      But a little reflection shows that this is needed only for general
      X, Y graphs.  When graphing traces (time histories), only Y values
      can go outside the clip region and then come back in.  The region
      for ordered X values is simple, and no X values go outside the 
      clip region and then come back in.  Ahead of time the rows of X 
      and Y for graphing can be restricted to just those that are en-
      closed by the X-clip region (the earliest and latest times) and
      then the clipping function needs only to look at clipping in Y.

      New C function linet() was written in file xterm.c for the special
      case of traces (where X contains times), and its use is called 
      out when mobius.n sets up the window control block (wcb) for the
      electronic console graphs (using added wcb item "typ" made in 
      xterm.v and term.h and applied in word pExpose() in plot.v to use
      either general linec() (when wcb.typ is undefined) or new linet()
      for traces (when wcb.typ=1)).
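
      The idea of restricting rows to the X-clip region ahead of time
      can be sketched in Python as follows.  This is not the C code of
      linet(); the function name trace_rows_in_window is hypothetical,
      and keeping one extra row on each side (so that segments crossing
      the window edges are still drawn) is an assumption of the sketch.

```python
from bisect import bisect_left, bisect_right


def trace_rows_in_window(times, t_left, t_right):
    """Return (first, last) indices of the rows of an ascending time vector
    needed to draw a trace clipped to the window from t_left to t_right.

    Returns None if times is empty.
    """
    if not times:
        return None
    # Binary search works because times are ordered; no scan of all rows.
    first = max(bisect_left(times, t_left) - 1, 0)
    last = min(bisect_right(times, t_right), len(times) - 1)
    return first, last
```

      Only rows first through last then need clipping in Y, so the work
      depends on the number of points showing in the window rather than
      on the number of points in the whole time history.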

      Here is rerunning the worst performing period using new linet():

         A period near the present, at the right side of the graph:
            EUU10 12959 11:39 CDT Mon Jul 19, 2010
            replay keys: Esc r h j k l  Top of replay_step 854678 
             replay_step: dayget session 3577 
             replay_step: append to Earliest 16588 
             replay_step: daysget curves done 203757 
             replay_step: tgraph 69715 
             replay_step: tplot 347097 

      Elapsed time for tplot of 347 msec is about 4 times faster than
      previous 1364 msec, and is on the order of 267 msec for the pre-
      vious best performing case.  Speedy look and feel will now be
      there for all zoomed graphs.  My strained eyes from watching slow
      and jerky graph animations feel better already.
}
      [ "rdecimate" "SEC" yank "SEC" book \ sample rate

        UDEF "Rstart" book \ to be banked here by replay_init()
        UDEF "Rend1" book  \ to be banked here by replay_init()
        UDEF "Rend2" book  \ to be banked here by replay_init()
        yes "SHOW" book    \ display graph

        "'replay_start_time' 'tnow' yank" "tnow" macro
      ]
\ " Top of replay_step" . timeprobe nl

      "replay_step part A" ERRset

      no "DObeep" book \ yes to beep after tplot

      0 max (n) "n" book

      Rend1 1+ "Rsess" book
      Rstart n + "Rstep" book 

    \ Cap Rstep if stepping has reached the present time:
      Rstep tnow >=
      IF \ stepping has reached the present time:
       \ Trim REPLAY.NSTEP to the max it can be:
         "REPLAY" "NSTEP" yank (NSTEP1)
         Rstep tnow - (delta)
         (NSTEP1 delta) - (NSTEP)
         (NSTEP) "REPLAY" "NSTEP" bank
         tnow "Rstep" book
         yes "DObeep" book
      THEN

      ERR \ "replay_step part A" ERRset

    \ Reinitialize if stepping moves into the next session:
      Rstep Rend2 >
      IF replay_tvec Rend2 1+ pry (tg) \ next session start time
         (tg) tm dup replay_set_date   \ index for once-a-day data 
         (tm) replay_init              \ initialize next session
         0 "REPLAY" "NSTEP" bank       \ reset step counter
         0 replay_step                 \ reenter this word
         return
      THEN

      "replay_step part B" ERRset

    \ Fetch data from Rsess to Rstep:
      "replay_init" "Psess" yank (hP) \ data from Rsess to Rend2

    \ Start with just the columns from real time data collection:
    \ (hP) 1st .dC items catch (hP) \ real time prices: H, L, C, dC

    \ Thu Dec  1 14:56:52 PST 2011.  Real time data collection columns
    \ have been extended from four to seven: H, L, C, dC, VO, IT, SE
      (hP) 1st .SE items catch (hP) \ H, L, C, dC, VO, IT, SE

    \ Chop the rows of P to the right time span, Pstep, and make a 
    \ vector of relative times t1 for steps of SEC:
      1st Rstep Rsess - 1+ items (hRows)
      (hP hRows) reach (hPstep)   \ data from Rsess to Rstep
      SEC over rows uniform (ht1) \ relative times, Rsess to Rstep

    \ Using data from Rsess to Rstep, macro dayget.extend builds entire
    \ session matrices P and t going from Rsess to Rend2, including fu-
    \ ture points from Rstep+1 to Rend2; discard the time vector t:
      (hPstep ht1) "dayget" "extend" localrun (hP ht) drop (hP)

\ " replay_step: dayget session" . timeprobe nl

    \ Append P to matrix PZ before the session:
      "replay_init" "PZmat" localrun (hPZ) swap (hPZ hP) pile (hP)

    \ Wed Jan 26 15:48:29 PST 2011.  From actual times, make a time 
    \ vector from Earliest to Latest.  It includes times for rows of 
    \ P from Earliest to Rend2 plus NADD future days of +rows each 
    \ from Rend2 to Latest that will be added to P by the loop in 
    \ macro daysget.add_model to be run:
      (hP) replay_tvec 1st other (hP) rows 
      "daysget" "+rows" yank "daysget" "NADD" yank * +
      items reach (hP ht)

    \ Before macro daysget.add_model can be run, arrays P and t are 
    \ banked into daysget(), where they are called P1 and t:
      (hP ht) "daysget" "t" bank "daysget" "P1" bank

    \ Set daysget.tnow to be for the row of Rstep rather than the row 
    \ for the present real time (words rnow() and sigmodel() use row 
    \ daysget.tnow):
      Rstep "daysget" "tnow" bank 

    \ Now macro daysget.add_model can be run; this is the big moment: 
    \ ta-da:
      "daysget" "add_model" localrun (hP)

\ " replay_step: daysget curves done" . timeprobe nl

      SHOW 
      IF
       \ Run tgraph to extract the curves to plot, then make the plot.
       \ Use the full time vector with correct times; even though it 
       \ has extra rows beyond replay, the plotter won't care:
         (hP) replay_tvec (hP ht) tgraph 

\ " replay_step: tgraph" . timeprobe nl

         tplot

\ " replay_step: tplot" . timeprobe nl

       \ Waiting until now to beep when new plot is visible:
         DObeep IF beep THEN

      ELSE (hP) drop
      THEN

      ERR \ "replay_step part B" ERRset
   end

   inline: replay_tvec ( --- ht) \ vector of all graph times
    \ Sun Nov  7 11:26:54 PST 2010

    \ Warning: always use main t vector during replay, since it has 
    \ been aligned for proper operation of date functions tg and tm 
    \ when it was originally set up in daysget().
      "t" main (ht)
   end

   private halt

\-----------------------------------------------------------------------

;  Appendix.

   Run time study.  June 2008.

   [Note June 2009: Memory in machine plunger for this study was only 
   128 Mb, when booking out of core kept memcat memory at 0.931 Mb.
   The machine has now been upgraded to 773 Mb and memcat memory 
   has increased from 0.931 Mb to 8.40 Mb as matrices are no longer 
   booked out of core.  Performance is much better.]

   Below are some timings to shed light on why there is a long delay as
   the graph you are viewing gets updated.  There is no single smoking 
   gun, but time for actual drawing or redrawing of a graph (words in 
   plot.v) was found to be small compared to recalculation times and 
   need not be considered. 

   Instead, the machine really is being worked with the large matrices
   involved.  One matrix item (column) of data every three minutes for 
   30 days requires 13,800 terms for markets open 23 hours.  This swamps
   a 20 year database of 220 terms (days) per year, which only requires
   about 4400 terms per matrix item.
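
   The row counts quoted above can be checked with a few lines of
   arithmetic.  This is only a sketch in Python (not tops); the 
   constant names are made up for illustration and are not words in 
   this file:

```python
# Sketch of the row-count arithmetic; all names here are illustrative.
MINUTES_PER_SAMPLE = 3   # one matrix row every three minutes
HOURS_OPEN = 23          # market open 23 hours per day
DAYS_REAL_TIME = 30      # length of the real time window

samples_per_day = HOURS_OPEN * 60 // MINUTES_PER_SAMPLE
real_time_terms = samples_per_day * DAYS_REAL_TIME

DAYS_PER_YEAR = 220      # trading days per year in the daily database
YEARS = 20
daily_terms = DAYS_PER_YEAR * YEARS

print(samples_per_day, real_time_terms, daily_terms)
print("ratio:", round(real_time_terms / daily_terms, 2))
```

   So a 30 day real time matrix has roughly three times as many rows 
   as a 20 year daily one.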

   It is remarkable that all this works in real time for 15 to 20 win-
   dows like the one analyzed here, all running at the same time, on a 
   machine with 128 Mb of ram and 256 Mb of scratch (the secret is queu-
   ing graph updates with qcon.v, and booking large matrices out-of-
   core so only an active window has its matrices in memory).

   There are terms in the struct of daysget.P that are being computed
   but are no longer being used.  Before trying to speed up the code,
   these should be removed to reduce run time and make P smaller.

   (Note: in January 2009, file mrtim.n noted below was replaced with 
   file mov.n; in April 2009, mov.n was replaced by movs.n.)

   [On May 11, 2010, all files in /opt/mytops/usr/archive were inadver-
   tently deleted, and mov.n, movs.n were lost.]

      [dale@plunger] /home/dale > tops
               Tops 3.0.1
      Mon Jun  9 12:21:54 PDT 2008
      [tops@plunger] ready > 'mrtim.n' psource

      [tops@plunger] ready > s 20l
      S real time
      Analyzing the last 30 days...
      top of daysget 1213039334917893 
      daysget end initialization branch 4964519  <<< 5 seconds
      daysget end fetch latest 75393 
      daysget end alerts + coladd 2130672  <<< 2 seconds (coladd)
      daysget done booking P1 and t 394114 
      daysget done fbooking P1 and t 298405 
      Automatic update is on; Esc unlocks the keyboard ...
      TGRAPH begin 2101152 <<< 2 seconds ???
      top of daysget 1246 
      daysget end saved data branch 2277698 <<< 2 seconds
      daysget end fetch latest 82062 
      daysget end alerts + coladd 3367734 <<< 3.3 seconds
      daysget done booking P1 and t 180097 
      daysget done fbooking P1 and t 490431 
      TGRAPH after daysget 1600 
      SX08 14750 14390 14470 74 13:45 CDT Mon Jun 9, 2008 (12:2 ...

   Times after this point involve plotting, and they are less than
   the big ones above.  Here is another one to give an idea of the
   variations that occur:

      [dale@plunger] /home/dale > tops
               Tops 3.0.1
      Mon Jun  9 12:24:10 PDT 2008
      [tops@plunger] ready > 'mrtim.n' psource

      [tops@plunger] ready > s 20l
      S real time
      Analyzing the last 30 days...
      top of daysget 1213039469993182 
      daysget end initialization branch 8912005  <<< 9 seconds
      daysget end fetch latest 41623 
      daysget end alerts 154940 
      daysget end coladd 1820764  <<< 2 seconds
      daysget done booking P1 and t 231143 
      daysget done fbooking P1 and t 178084 
      Automatic update is on; Esc unlocks the keyboard ...
      TGRAPH begin 2850338 <<< 3 seconds ???
      top of daysget 1242 
      daysget end saved data branch 789234 
      daysget end fetch latest 89746 
      daysget end alerts 168185 
      daysget end coladd 1712222 <<< 2 seconds
      daysget done booking P1 and t 186501 
      daysget done fbooking P1 and t 141551 
      TGRAPH after daysget 1850 
      SX08 14750 14390 14470 74 13:45 CDT Mon Jun 9, 2008 (12:2 ...

   Here is a wake-up (exposing a graph), showing that the saved data 
   branch can sometimes take a while, as can fbooking P1 and t:

      auto loop key hit 1873599367 
      TGRAPH begin 209975 
      top of daysget 490428 
      daysget end saved data branch 5509854 <<< 6 seconds
      daysget end fetch latest 272554 
      daysget end alerts 118418 
      daysget end coladd 1251502 <<< 1.3 seconds
      daysget done booking P1 and t 190754 
      daysget done fbooking P1 and t 1001253 <<< 1 second
      TGRAPH after daysget 6334 
      SX08 14750 14390 14470 74 13:45 CDT Mon Jun 9, 2008 (12:5 ...
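
   To see where the time goes in the wake-up listing above, the probe 
   values can be added up.  This Python sketch assumes each number is 
   the elapsed time in microseconds since the previous probe (which 
   is consistent with the "<<< seconds" notes, but is an assumption), 
   and it leaves out the "auto loop key hit" line, taken here to be 
   idle time waiting for a key:

```python
# Probe values copied from the wake-up listing; the interpretation as
# microseconds elapsed since the previous probe is an assumption.
wake_up_us = {
    "TGRAPH begin": 209975,
    "top of daysget": 490428,
    "daysget end saved data branch": 5509854,
    "daysget end fetch latest": 272554,
    "daysget end alerts": 118418,
    "daysget end coladd": 1251502,
    "daysget done booking P1 and t": 190754,
    "daysget done fbooking P1 and t": 1001253,
    "TGRAPH after daysget": 6334,
}

total_us = sum(wake_up_us.values())
slowest = max(wake_up_us, key=wake_up_us.get)

print("total seconds:", total_us / 1e6)
print("slowest step:", slowest,
      "share:", round(wake_up_us[slowest] / total_us, 2))
```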

   The conclusion is that calculations in daysget and coladd, and fetch-
   ing and storing large fbooked files take time.  But that is the na-
   ture of things when big matrices with thousands of rows are involved
   and out-of-core booking (fbook) is required due to memory restric-
   tions.

   Here are the sizes of big, fbooked matrices for these timings, which
   are for market S.  They total to about 4.6 Mb:  
 
      [tops@plunger] ready > PZ P PG

       stack elements:
             0 matrix: PG  7978 by 17    (1.09 Mb)
             1 matrix: P  7978 by 34     (2.17 Mb)
             2 matrix: PZ  11524 by 15   (1.38 Mb)
       [3] ok!
      [tops@plunger] ready > 

   Market S is one of the smaller ones in terms of hours open.  Other 
   markets have approximately 20% more matrix rows.
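
   The sizes in the stack display above can be reproduced from the 
   matrix dimensions.  This Python sketch assumes 8 bytes per matrix 
   element and that "Mb" means one million bytes; both are inferred 
   from the numbers, not taken from the tops source:

```python
# Dimensions copied from the stack display; BYTES_PER_ELEMENT and the
# meaning of "Mb" (10**6 bytes) are assumptions inferred from the sizes.
BYTES_PER_ELEMENT = 8

matrices = {"PG": (7978, 17), "P": (7978, 34), "PZ": (11524, 15)}

sizes_mb = {
    name: rows * cols * BYTES_PER_ELEMENT / 1e6
    for name, (rows, cols) in matrices.items()
}
total_mb = sum(sizes_mb.values())

for name, mb in sizes_mb.items():
    print(name, round(mb, 2), "Mb")
print("total", round(total_mb, 2), "Mb")
```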

   Removing columns that are no longer used from these matrices could
   help by reducing calculation time and reducing out-of-core file size.

\-----------------------------------------------------------------------

   Words not used; obsolete:

  _inline: elec_open (qMkt -- nDT) \ open DT sec after session start
{     Electronic trading opens DT seconds after session start.

      Example: This shows that US trading opens 30 minutes after 
      session start:

         % "US" elec_open .i
          1800
         (Note: this was run before US electronic open was changed
          to open one-half hour earlier.)
}
      lowercase "_ele" + timeline 1st pry
      "timeline_add" "OFFSET" yank - \ remove imposed collector offset
   end

  _inline: hget (qFILE --- hD) \ matrix D from history FILE
{     This word reads the history file of a day's session that was
      saved by hist_add() in mget.v.

      FILE names saved by hist_add() are of the form
         1080120_US.bin, 1080121_HG.bin
      so they contain the date and the market symbol.

      Incoming FILE name must include the path.  Usually, history
      files are at epath.

      The six columns of returned matrix D contain the following:
         O, H, L, C, Chg, GMT

      Matrix D holds the values just as they were collected from
      different sources, and may contain errors and discrepancies. 
}
      "hget" ERRset

      dup file? 
      IF (qFile) old binary "BIN" file
         BIN dup fsize fget (hT)
         (hT) PDP_ENDIAN import4 (hA)
         BIN fclose

         (hA) dup rows 6 / matrix
         (hA) yes 6 sorton

{ THIS IS JUST TOO TIME CONSUMING DUE TO WORDS noq_alike and format.  
  LET DECIMATION SORT IT ALL OUT IN rtget (WORD rdecimate).
         (hA) "%0.0f " 6 cats format (hT)
         (hT) noq_alike \ no rows with all 6 columns identical
         (hT) 6 matread (hA)
-------- }

         (hA) "_hget" naming (hD)
      ELSE drop purged
      THEN

      ERR
   end

--------------- }
{
   The infix version of hget() below causes problems when run remotely 
   in the multitasker on a networked window, another lesson that infix 
   functions really are not suited to work that must run over networks.

   The problem is believed to be due to words like return() and re-
   turn2() in the PRE portion of an infix function, but further work
   is needed to pin down the exact cause, and there is currently no 
   pressing need to run infix functions over a network. 

   Infix hget() has been replaced by the postfix version above.  Now 
   that the two can be compared, it is not clear that the infix version
   is any easier to follow than the postfix version, due to nesting. 

   [Note, March 28, 2008: 

      Words like return and return2 are restricted when the program
      thinks it is running over a network.  A restricted word is one 
      that has an alternative word that net.v has made to replace it
      because it is not supposed to be run.  

      In exe.c, EXE_REMOTE() runs restricted word alternatives if 
      SOCKFD!=-1, assuming it is on a network.  For instance, "HALT"
      is a restricted word and its alternative is a word that displays
      the message "invalid word on remote."

      This has now been relaxed to not run the restricted word's 
      alternative if the program is running an interactive keyboard, 
      and so alleviates the problem of running parsed infix functions 
      over a network, but only on the end that is interactive.

      The problem remains for two daemons, for instance, where there is
      no keyboard on either end.

      [April 11, 2008: relaxing the restriction was a bad idea.  You 
      never want a remote running restricted functions even if it has 
      a keyboard.  The change of March 28 to not run alternatives of 
      restricted words has been removed, and the system is back to the 
      way it was.  Expect to see problems again, and think about a bet-
      ter solution.]

      [April 22, 2008: return2 installed by the parser has been replaced
      by return2*, a word just for the parser which in word.p points to
      return2 so it does exactly what return2 does.  

      It was parsed words (infix functions) that used return2 and trig-
      gered the March 28 attempt to fix things which was undone on April
      11.  

      This should get around the restriction on return2 and work ok on 
      an instance that is running remotely.  

      This took nearly a month to sort out.  It is amazing how long some
      problems take to work out.  This is only one of countless problems
      worked out since this program began in 1999, and is why it gets 
      better and better all the time.  

      This is stuff no one thinks about when contemplating a computer 
      application--everything works fine in our minds and things like 
      this never arise.  A program that has been used and upgraded for
      years, like this one, can be very robust.]

      Infix functions are great for nitty gritty matrix expressions, and
      what makes them fail on networks are features that are desirable
      in their domain.  

      Functions that run across networks do not need nitty gritty matrix
      expressions, and they should be postfix.
   ]
{"
  _function (D) = hget(FILE) { // matrix D from history FILE
   /* This word reads the history file of a day's session that was
      saved by hist_add() in mget.v.

      FILE names saved by hist_add() are of the form
         1080120_US.bin, 1080121_HG.bin
      so they contain the date and the market symbol.

      Incoming FILE name must include the path.  Usually, history
      files are at mpath.

      The six columns of returned matrix D contain the following:
         O, H, L, C, Chg, GMT 
 
      Matrix D holds the values just as they were collected from
      different sources, and may contain errors and discrepancies. */

      if(!file?(FILE)) return(purged);
      
      file(binary, old(FILE), "BIN");
      D = import4(fget(BIN, fsize(BIN)), PDP_ENDIAN);
      fclose(BIN);

      D = naming(
             matread(                       // read six column MAT
                noq_alike(                  // remove identical lines
                format(                     // format MAT to VOL (text)
                   sorton(                  // sort on GMT
                      matrix(D, rows(D)/6), // vector into matrix
                   yes, 6),
                cats("%12.0f", 6))),
             6),
          "_hget");
   }
"} eval

  _inline: hmchg (qMKT --- f) \ true if history for MKT has changed
{     Running in real time as data is being collected, look in the
      directories where collectors write files to see if the latest
      file for MKT has changed in any one of them.  If so, return a
      true flag.
}
      [ "hdchg" "PATHS" yank "PATHS" book 
        tracklist vol2mat bend yes sort "MKTS" book
        tracklist rows 1 null "TIMES" book
      ]
      (qMKT) uppercase PATHS over hist_fname tail filetime totals @ (nT)

      (qMKT nT) MKTS rot (qMKT) tracklist chars blpad str2num (nMKT)
      (hMKTS nMKT) bsearch (r f)
      IF (nT r) "r" book 
         TIMES r pry (nT nT0)            

         (nT nT0) over <> 
         IF (nT) TIMES r poke true
         ELSE (nT) drop false
         THEN
         
      ELSE " hmchg: market not found" ersys false
      THEN
   end
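
   The change test in hmchg can be sketched outside of tops as 
   follows.  This Python version is an illustration only: the class 
   and method names are hypothetical, the file times are passed in 
   rather than read from the collector directories, and the stored 
   totals are assumed to start at zero:

```python
class ChangeTracker:
    """Remember the last total of file times seen for each market."""

    def __init__(self, markets):
        # Assumption: stored totals start at zero, so the first
        # nonzero total for a market is reported as a change.
        self.last_total = {m.upper(): 0.0 for m in markets}

    def changed(self, market, file_times):
        """Return True if the summed file times differ from last time."""
        key = market.upper()
        if key not in self.last_total:
            print(" hmchg: market not found")
            return False
        total = sum(file_times)
        if total != self.last_total[key]:
            self.last_total[key] = total  # remember the new total
            return True
        return False


tracker = ChangeTracker(["US", "HG"])
print(tracker.changed("us", [100.0, 200.0]))  # first look at US
print(tracker.changed("US", [100.0, 200.0]))  # same times again
print(tracker.changed("US", [100.0, 250.0]))  # one file was rewritten
```

   Summing the times over all paths means a change in any one 
   directory changes the total, so one stored number per market is 
   enough.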
 
  _define: mID (qMkt --- k) \ int id for Mkt, i.e., "W" mID (11)
      main push 4 dump pull (k) ;

  _inline: pit_open (qMkt -- nDT) \ pit opens DT sec after session start
{     DT is the number of seconds after session start (not electronic
      Mkt open) when the pits for Mkt will open.

      Example: This shows that SF pits opened today at 5:20 PDT
         % date . nl Mkt .
         Mon Mar 23 09:37:34 PDT 2009
         SF
         % date sysdate drop session_start Mkt pit_open + ctime .
         Mon Mar 23 05:20:00 PDT 2009
         %
}
      lowercase "_pit" + timeline 1st pry
      "timeline_add" "OFFSET" yank - \ remove collector imposed offset
   end

  _define: signs (hA --- hS) \ mat of +1, -1 signs of mat A elements
      0< dup 0= abs plus ;
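
   The result of signs can be sketched in Python as shown here.  This 
   assumes a true flag in tops is -1, so that elements less than zero 
   map to -1 and all others, including zero, map to +1; the nested 
   list stands in for a matrix:

```python
def signs(a):
    """Return +1/-1 signs of the elements of a; zero counts as +1.

    Mirrors "0< dup 0= abs plus" under the assumption that a true
    flag is -1: negative elements give -1 + 0, others give 0 + 1.
    """
    return [[-1 if x < 0 else 1 for x in row] for row in a]


print(signs([[-2.5, 0.0, 3.0], [4.0, -1.0, -0.5]]))
```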

  _inline: TR (hU --- hA) \ compute traces A of vector U
\     Trace periods in A range from 30 minutes to 8 days.
      [ "rdecimate" "SEC" yank 60 / "tau" book \ step size (minutes)
 
      \ Steps for traces of 30 minutes to 8 hours (480 minutes), and
      \ one-half day (720 minutes) to 8 days (11520 minutes):
        list: \ number of minutes
           30  60   120  240  480   \ one-half hour to 8 hours
           720 1440 2880 5760 11520 \ one-half day to 8 days
        end
        tau / "N" book \ steps
{
        N defines the number of samples to use for a trace so that the 
        trace corresponds roughly to the number of minutes given above 
        when we gloss over the time periods that the market is closed.

        Compared to moving average (word ma), exponential trace used 
        here (word tr) is much smoother for the same number of steps N,
        and makes more uniform patterns that seem better for trading.

        To compare the difference, replace the line below with this one:
           DO peek N I pry ma LOOP N rows parkn pull drop

        Traces merge easily into a dynamic model.  A trace can be a 
        state, and the discrete dynamic equation for the state, which
        follows simply from the definition of tr (see man tr or the 
        code for tr() in wapp.c), is

           Tr(k) = A*Tr(k-1) + B*U(k)

        where Tr(k) is the state at step k which depends only upon the 
        previous state and the value of U at step k.  Constants A and B
        are related to the time constant:

           lambda = 1/(1+n) (a number less than 1)

           A = 1 - lambda
           B = lambda
   
        where n is one of the values from N computed above.
}       
      ] 
      (hU) push N rows 1st
      DO peek N I pry tr LOOP N rows parkn pull drop
   end
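
   The trace recursion described in the comment of TR, 
   Tr(k) = A*Tr(k-1) + B*U(k), can be sketched in Python as shown 
   here.  This is not the code of tr() in wapp.c; the starting state 
   (the first value of U) and the 3 minute step size tau used in the 
   example are assumptions:

```python
def trace(u, n, tr0=None):
    """Exponential trace of sequence u using n steps.

    Tr(k) = A*Tr(k-1) + B*U(k), with lambda = 1/(1+n),
    A = 1 - lambda and B = lambda.  The initial state defaults to
    the first value of u, which is an assumption of this sketch.
    """
    if not u:
        return []
    lam = 1.0 / (1.0 + n)
    a, b = 1.0 - lam, lam
    state = u[0] if tr0 is None else tr0
    out = []
    for x in u:
        state = a * state + b * x
        out.append(state)
    return out


# Steps N for each trace period, assuming tau = 3 minutes per row.
minutes = [30, 60, 120, 240, 480, 720, 1440, 2880, 5760, 11520]
tau = 3
steps = [m / tau for m in minutes]

print(steps)
print(trace([0.0, 1.0, 1.0], 1, tr0=0.0))
```

   Each new state depends only on the previous state and the current 
   input, so the full history of U never has to be revisited, unlike 
   a moving average over N rows.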

\-----------------------------------------------------------------------

   Work weeks per year.
{
   This has absolutely nothing to do with actual yearly files saved.
   Files saved match weekday dates returned by word ydays.

   What's the maximum number of weekdays (M-F) in any nearby year?

   "weekdays" missing IF cal.v source THEN

   0 is wdmax
   public
   "130 77 DO I weekdays wdmax max into wdmax LOOP" "wd" inlinex
   private
   " Max weekdays per year from 1977 to 2030 is:" . wd wdmax .i

   Here's the result:
      Max weekdays per year from 1977 to 2030 is: 262

   An upper bound without all this work would be to reason that 
   there are fewer than 53 full weeks per year, so there can't be 
   more than 53*5 = 265 weekdays.

   For the database, use 262 days.  There are really fewer because 
   of holidays.  

   Work weeks per year = 262/5 = 52.4; 52.4 blocks of five days, 
   some with 4 work days.
}  
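
   The weekday count above can be cross-checked without tops.  This 
   Python sketch uses the standard datetime module; the function name 
   is made up for illustration and is not word weekdays of cal.v:

```python
from datetime import date, timedelta


def weekdays_in_year(year):
    """Count Monday through Friday dates in the given year."""
    d = date(year, 1, 1)
    count = 0
    while d.year == year:
        if d.weekday() < 5:  # Monday is 0, Friday is 4
            count += 1
        d += timedelta(days=1)
    return count


wdmax = max(weekdays_in_year(y) for y in range(1977, 2031))
print("Max weekdays per year from 1977 to 2030 is:", wdmax)
print("Work weeks per year:", wdmax / 5)
```

   The maximum is reached in leap years that begin on Monday through 
   Thursday, such as 1980, where both days beyond 52 full weeks are 
   weekdays.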


