Table of Contents
-----------------
1. Quick and minimal description of the Argile programming language
2. Compilation Process
3. Types
4. Other features
5. Examples


1. Quick and minimal description of the Argile programming language
-------------------------------------------------------------------

An Argile program is structured as a list of calls to definitions;
since there are no keywords, everything must be defined. The calls in
this list can be separated by semicolons (;) or new lines, for example:

call_1 ; call_2
call_3
call_4;
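
The separation rule can be modelled with a short Python sketch. It is
only an illustration, not the compiler's lexer: the name split_calls is
hypothetical, and the sketch ignores comments, subcalls, blocks, text
literals and the end-of-line backslash.

```python
def split_calls(source):
    """Split source text into calls on semicolons and new lines.

    Simplified model: comments, parentheses, blocks and text literals
    are NOT handled here.
    """
    calls = []
    for line in source.splitlines():
        for piece in line.split(";"):
            piece = piece.strip()
            if piece:  # an empty piece (e.g. after a trailing ';') is not a call
                calls.append(piece)
    return calls


print(split_calls("call_1 ; call_2\ncall_3\ncall_4;"))
# ['call_1', 'call_2', 'call_3', 'call_4']
```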

Comments are delimited by parentheses and colons, and can be nested:
   (: inside a comment :)
   (: a (:nested:) comment :)
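
Nesting means that a comment ends only when every opening `(:' has met
its closing `:)'. A minimal Python sketch of that depth counting follows;
strip_comments is a hypothetical helper, and it assumes the delimiters
never appear inside text literals.

```python
def strip_comments(source):
    """Remove (: ... :) comments, honouring nesting with a depth counter."""
    out = []
    depth = 0
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair == "(:":
            depth += 1          # entering a (possibly nested) comment
            i += 2
        elif pair == ":)" and depth > 0:
            depth -= 1          # leaving one level of comment
            i += 2
        else:
            if depth == 0:      # keep only text outside all comments
                out.append(source[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated comment")
    return "".join(out)


print(repr(strip_comments("a (: x (:nested:) y :) b")))
# 'a  b'
```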

A definition can be a variable, a type, a function, a macro
or a binding in a shared object module.

A call is a list of call elements; a call element can be a
constant literal, a word, an operator, or a subcall.

A word is a sequence of "letter" characters (a-z, A-Z, _, 0-9 and non-ASCII
characters).

An operator is one of the special operator characters
( !#$%&'*+,-./<=>?@[\]^`|~ ). The backslash \ can also be used at the end of
a line to prevent the automatic termination of a call. More operators can be
added with the ARGILE_OPERATORS environment variable (see `arc --env-help').
Words and operators are distinguished so that no space is needed to separate
a word from an operator (a space is still needed between two words).
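
The following Python sketch illustrates why no space is needed between a
word and an operator. It is a simplified model, not the real lexer:
tokenize and is_word_char are hypothetical names, each operator is
assumed to be a single character, and literals, parentheses, braces and
operators added through ARGILE_OPERATORS are not handled.

```python
OPERATOR_CHARS = set("!#$%&'*+,-./<=>?@[\\]^`|~")


def is_word_char(ch):
    """Letters, digits, underscore and any non-ASCII character."""
    return ch == "_" or (ch.isascii() and ch.isalnum()) or not ch.isascii()


def tokenize(text):
    """Split text into ('word', ...) and ('operator', ...) tokens."""
    tokens = []
    word = ""
    for ch in text:
        if is_word_char(ch):
            word += ch
            continue
        if word:                       # a non-word character ends the word
            tokens.append(("word", word))
            word = ""
        if ch in OPERATOR_CHARS:
            tokens.append(("operator", ch))
        elif not ch.isspace():
            raise ValueError("not handled by this sketch: %r" % ch)
    if word:
        tokens.append(("word", word))
    return tokens


print(tokenize("x+y*foo bar"))
# [('word', 'x'), ('operator', '+'), ('word', 'y'), ('operator', '*'),
#  ('word', 'foo'), ('word', 'bar')]
```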

An explicit subcall is a call inside parentheses. Example:

  print (x) (y)

Note that the compiler can also detect implicit subcalls with a recursive
algorithm (detailed in section 2, Compilation Process), but this takes more
time to compile and is subject to a (configurable) recursion depth limit.

A constant literal can be an integer, a real number, a text character string,
a block of code, or a syntax literal.

An integer can be decimal (12), hexadecimal (0x3c), octal (064) or
binary (0b11010).
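
As a worked example, the four notations above decode as follows. This
Python sketch assumes lowercase prefixes, as in the examples, and that a
leading 0 followed by more digits means octal; parse_integer is a
hypothetical name.

```python
def parse_integer(literal):
    """Decode a decimal, hexadecimal (0x), octal (0...) or binary (0b) literal."""
    if literal.startswith("0x"):
        return int(literal[2:], 16)
    if literal.startswith("0b"):
        return int(literal[2:], 2)
    if len(literal) > 1 and literal.startswith("0"):
        return int(literal[1:], 8)
    return int(literal, 10)


for text in ("12", "0x3c", "064", "0b11010"):
    print(text, "=", parse_integer(text))
# 12 = 12
# 0x3c = 60
# 064 = 52
# 0b11010 = 26
```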

A real number is like a double in the C language (e.g. 31.4159e-1):
it is a signed integer, then a dot, then an unsigned integer, optionally
followed by an exponent, which is the 'e' or 'E' character followed by a
signed integer.
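
The shape of a real number can be written as a regular expression. The
Python sketch below is an approximation under stated assumptions: it
accepts an optional `+' or `-' sign where the text mentions a signed
integer, and is_real is a hypothetical helper.

```python
import re

# integer part, dot, fractional digits, optional exponent
REAL = re.compile(r"[+-]?[0-9]+\.[0-9]+(?:[eE][+-]?[0-9]+)?")


def is_real(text):
    """True if the whole text has the shape of a real number literal."""
    return REAL.fullmatch(text) is not None


print(is_real("31.4159e-1"))  # True
print(is_real("12"))          # False: no dot, so this is an integer
print(is_real("1."))          # False: digits are required after the dot
```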

A text character string is delimited with double quotes ("some text...").

A block of code contains a list of calls and can be delimited
by braces ( {} ), or be guessed from indentation. Examples:

{} {;;;} (: two valid empty blocks :)
a {b; c; d} {
  e
    f   (: forth indentation opens a new block of code in current call :)
   g	(: half-back indentation closes block but continues current call :)
    h   (: forth indentation opens a new block of code in current call :)
  i     (: back indentation closes block and terminates previous call :)
}

which is equivalent to:

{}{} ; a {b;c;d} { e{f}g{h} ; i }


A syntax literal is delimited with colons `:' and is a list of
syntax elements; a syntax element can be a word, an operator, a parameter,
an option (containing a sub list of syntax elements), an enumeration
(a list of lists of syntax elements) or a repeated list (list of elements
that can be repeated a number of times, with optional boundaries).
Examples:

:foo bar +-! / <int x> (optional param <text t = "default">):
(: this syntax has two words, followed by four operators, then a parameter
   followed by an option containing two words and a text parameter with
   a default value :)

:enumeration { A | B | C+D <real r> }:
:a plus or a pipe { + | \| }:     (: pipe must be escaped inside enumeration :)
:zero or more [xyz...]:
:at least three bees [B ... 3,]:
: five, six or seven Ys [Y (,) ... 5,7] :

Note that parameters (delimited with <>) have a type
(a call to a definition), an optional name (a word) and an optional default
value ( <int x = 3> ), which can be a constant literal or an explicit
subcall ( <xtype x = (foo)> ).
When no name is provided, the type of the parameter is used instead.
It is also possible to specify several
parameters inside a single pair of <> as in:

	:f <int x=1, int y, int z>:

which is equivalent to:

	:f <int x=1> <int y> <int z>:
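
The grouped form can be seen as shorthand that expands to one pair of <>
per parameter. A small Python sketch of that expansion follows; the name
expand_parameters is hypothetical, and the sketch assumes that default
values contain no comma.

```python
def expand_parameters(group):
    """Expand '<a, b, c>' into ['<a>', '<b>', '<c>'].

    Assumption: no comma appears inside a default value.
    """
    group = group.strip()
    if not (group.startswith("<") and group.endswith(">")):
        raise ValueError("expected a <...> parameter group")
    parts = [part.strip() for part in group[1:-1].split(",")]
    return ["<%s>" % part for part in parts if part]


print(" ".join(expand_parameters("<int x=1, int y, int z>")))
# <int x=1> <int y> <int z>
```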


2. Compilation Process
----------------------

Several things happen at compilation; in order:

2.1. Memory is initialized from command line options and environment variables.

2.2. Standard pseudo module `std' is initialized.

2.3. Implicit bindings to std module are defined
   :use [{<word>|<text>}, ...] {<word>|<text>}: is bound to std/use
   :bind <syntax> to <word module>/<word bind>: is bound to std/bind
(the types of parameters are only temporarily bound for these 2 definitions)
The first one (use) is used to import an argile source file;
the second one (bind) is then used to make the bindings in the std module.

2.4. Input Argile source code is read (from a file, the command line or stdin)

2.5. Lexical analysis (lexer) to convert a stream of characters into
  a stream of tokens (which is passed to the parser).

2.6. Syntactic analysis (parser) to convert a stream of tokens into
  an internal object tree structure.

2.7. Compilation:

  The main code is compiled as a block of code with an expected return value.

  A code block is compiled in two passes: one considering only definitions
  that may produce more definitions (defmakers) and a second one considering
  all definitions.

  Each pass is a loop on all calls as long as some call is successfully
  compiled.

  When all calls of the code block are compiled, all sub code block literals
  are recursively compiled.
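
  The two passes and their "loop while something compiles" behaviour can
  be sketched in Python. This is a toy model under stated assumptions, not
  the compiler's code: compile_block and try_compile are hypothetical
  names, and the demo treats `let' as the only defmaker.

```python
def compile_block(calls, try_compile):
    """Run the defmaker-only pass, then the all-definitions pass.

    try_compile(call, defmakers_only) is a caller-supplied (hypothetical)
    function returning True when the call was compiled.
    Returns the calls that could not be compiled.
    """
    pending = list(calls)
    for defmakers_only in (True, False):
        progress = True
        while progress and pending:    # loop as long as some call compiles
            progress = False
            for call in list(pending):
                if try_compile(call, defmakers_only):
                    pending.remove(call)
                    progress = True
    return pending


# Toy demo: 'let NAME ...' defines NAME, 'print NAME' needs NAME defined.
defined = set()
order = []


def try_compile(call, defmakers_only):
    words = call.split()
    if words[0] == "let":              # toy defmaker
        defined.add(words[1])
        order.append(call)
        return True
    if defmakers_only:
        return False
    if words[0] == "print" and words[1] in defined:
        order.append(call)
        return True
    return False


left = compile_block(["print x", "let x = 0"], try_compile)
print(order)  # ['let x = 0', 'print x']
print(left)   # []
```

  In this model the call to `print' compiles even though it appears before
  the definition of x, because the first pass has already handled `let'.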

  A single call is compiled with an expected return type (which is most of the
  time the special `anything' type) :
    for each possible decreasing length of implicit subcalls
      for each definition in the `definitions order' (see below)
        for each possible position of the implicit subcall (left to right)
          try to match the syntax of the definition with the subcall
          if it matched
            if it is not really a subcall but the whole call
              compile explicit subcalls (in parentheses)
              check return type matches expected context type
              check if it is a binding that rejects itself
            else
              make it a real subcall (marked as implicit)
              compile explicit subcalls (in parentheses in source)
              check if it is a binding that rejects itself
              recursively retry to compile the whole call
              if it didn't work, undo the implicit subcall
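
  The nesting of the three loops fixes the order in which candidates are
  tried. The Python sketch below only enumerates that order and performs
  no matching; candidate_order is a hypothetical name, and a call is
  reduced to its number of elements.

```python
def candidate_order(call_length, definitions):
    """Yield (length, definition, start) in the order the loops visit them."""
    for length in range(call_length, 0, -1):          # longest first
        for definition in definitions:                # definitions order
            for start in range(call_length - length + 1):  # left to right
                yield length, definition, start


for candidate in candidate_order(3, ["d1", "d2"]):
    print(candidate)
# (3, 'd1', 0)
# (3, 'd2', 0)
# (2, 'd1', 0)
# (2, 'd1', 1)
# (2, 'd2', 0)
# (2, 'd2', 1)
# (1, 'd1', 0)
# (1, 'd1', 1)
# (1, 'd1', 2)
# (1, 'd2', 0)
# (1, 'd2', 1)
# (1, 'd2', 2)
```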

    The `definitions order' refers to the order in which definitions
    are searched; it starts from a code position (a call within a code block):
      for each block from the current one to the upper ones
        for each definition made before the current call in this block
        for each definition made after the current call in this block
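
    A Python sketch of this search order follows. It assumes that, within
    a block, the definitions made before the call are visited from the
    nearest to the farthest, and those made after it from the nearest
    onwards; this is inferred from the output of example 5.3 and is not
    stated explicitly. The name definitions_order and the data layout are
    hypothetical.

```python
def definitions_order(scopes):
    """Yield definitions in search order.

    scopes: list of (definitions, position) pairs, from the current block
    to the outermost one. definitions[i] is the definition made by call i
    of that block, or None; position is the index of the current call.
    """
    for definitions, position in scopes:
        for definition in reversed(definitions[:position]):   # made before
            if definition is not None:
                yield definition
        for definition in definitions[position + 1:]:         # made after
            if definition is not None:
                yield definition


# Calls of example 5.3, without `use std': print, let, print, let, let, print, let
block = [None, "x=0", None, "x=1", "x=2", None, "x=3"]
for position in (0, 2, 5):
    print(next(definitions_order([(block, position)])))
# x=0
# x=0
# x=2
```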

    A call matching a syntax is a bit like a text string matching a regular
    expression, except the terminating element is not a character but
    either a word, an operator or a typed parameter.

  To finish compilation of a single call, bindings involved are then compiled,
  using their custom native code.

  Note that the recursive algorithms above have recursion limits
  configurable with options and environment variables
  (see the `arc --help' and `arc --env-help' options).

  The consequence of this compilation process is that the longest match
  has priority, then the closest definition in scope, then the leftmost
  position.

2.8. C code is generated
  This generates C code for prototypes of functions, global variables,
  functions bodies, and the main function (may be controlled by command
  line options).

3. Types
--------

A type can be a basic type (nothing, anything, type, word, integer, natural,
real, text, syntax, or code), a class, a union, an enumeration, or a
C-type (which has prefix and suffix text literals associated).
There also exist C-type generators, which are macros returning a type
(which may depend on the value of their parameters).
It is also possible to define automatic type casters that implicitly convert
the return type of a definition to another so that it can match the expected
type.

3.1. References
  A reference is a typed value that can be written to, or read from.
  Variables are actually references. Modifying the value of a parameter
  with a reference type will change the value for the caller as well,
  but by default, parameters are passed by value. The binding std/typeref
  can be used to get the reference version of a type. For example:

    use std
    .: func <int & i> :. {i *= 3}
    let int x = 2
    func x
    print x (: prints 6 :)

3.2. Raw types
  Raw types apply to class types: raw means `by value' instead of `by address'.
  A variable with a raw class type will be allocated on the stack.
  The binding std/typeraw can be used to get the raw version of a type.

    use std
    class C {int i}
    let C  c_adr = new C (: by address   :)
    let C@ c_raw         (: by raw value :)
    prints c_adr.i c_raw.i
    del c_adr

4. Other features
-----------------

4.1. Sub-functions
  It is possible to define a local function inside another function,
  for example:

    use std
    .: some function :.
      sub-function (: calling the sub-function :)
      .: sub-function :. {}
    some function
    (: here we cannot call :sub-function: since it is
       not visible outside :some function: :)

4.2. Auto parameters
  When a sub-function uses variables from its upper function, they
  are implicitly passed as auto parameters; example:

    use std
    .:func <int x>:. {
      let var = 2
      .:subfunc <int y>:. {
        var++
        x++
      }
      (: here, var == 2 :)
      subfunc 42
      (: here, var == 3 :)
    }

4.3. C-like Variadism
  If a function syntax has the empty list :[...]: , then it means
  :[<any>...]: but with C-like variadism instead of Argile variadism;
  also, it can only be put at the end of a syntax.

5. Examples
-----------

5.1. Hello world using std.arg

  use std
  print "hello, world!"
  printf "Hello world\n"
  echo Hello World

5.2. Hello world without using std.arg

  print "Hello, world!"
  bind :print [<anything> ... 1,]: to std/print
  bind :anything: to std/anything
  (: std is the module, and print and anything are bindings names :)

5.3. Overloading and definitions order

  use std;
  print x
  let x = 0
  print x
  let x = 1
  let x = 2
  print x
  let x = 3

  (: this will print:
     0
     0
     2
  :)

5.4. Arithmetic

  use std
  (: arith. operator macros use a union, so use int/nat explicitly :)
  let int x = 1 + 3 * 4 / 2     (: 1 + ((3 * 4) / 2) :)
  (: operators priority results from their order in std.arg :)

  print x     (: prints 7 :)

  let int y = (x * 2 + 3)
  (:
     parsed as (x) * (2 + 3) !!!! (x) is detected after (2+3) since
     it is shorter, and constant literals are immediately detected as such;
     so when doing arithmetic on non-constants, adding some parentheses is
     good practice.
  :)
  let int y = (x * 2) + 3
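
  The two groupings give different results. The Python lines below only
  check the arithmetic, following the groupings described in the comments
  above; they do not model the Argile compiler, and // is used because the
  values are integers.

```python
x = 1 + ((3 * 4) // 2)       # grouping stated for the first `let'
as_parsed = x * (2 + 3)      # how (x * 2 + 3) is said to be parsed
as_intended = (x * 2) + 3    # grouping forced by explicit parentheses

print(x, as_parsed, as_intended)
# 7 35 17
```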

5.5. Functions and Macros

  use std
  .:first function:. -> nat {
    let nat N = 13
    sub function
    sub function
    if N == 13
      return 1
    .:sub function:.
      a macro N
    =:a macro <nat &x>:= {x *= 2}
    N
  }
  print first function
  (: will print 52 :)

5.6. Class Types

  use std
  class Foobar
    int    i
    nat    n
    text   t
    Foobar f
  let Foobar@ F
  F.t = "some foobar text"
  prints (F.i) (F.t) (F.f)

  class DummyInheritsFoobar <- Foobar
    real r
  let D = new DummyInheritsFoobar
  prints D.r D.i
  del D

----------------------------------------------------------------------

Copyright (C) 2009,2010,2011,2014 The Argile authors (See file AUTHORS).

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
copy of the license is included in the file doc/COPYING distributed
along with the Argile compiler.
