


rrdtool                                           CDEFTUTORIAL(1)



NNNNAAAAMMMMEEEE
     cdeftutorial - Alex van den Bogaerdt's CDEF tutorial

DDDDEEEESSSSCCCCRRRRIIIIPPPPTTTTIIIIOOOONNNN
     YYYYoooouuuu pppprrrroooovvvviiiiddddeeee aaaa qqqquuuueeeessssttttiiiioooonnnn aaaannnndddd IIII wwwwiiiillllllll ttttrrrryyyy ttttoooo pppprrrroooovvvviiiiddddeeee aaaannnn aaaannnnsssswwwweeeerrrr
     iiiinnnn tttthhhheeee nnnneeeexxxxtttt rrrreeeelllleeeeaaaasssseeee. NNNNoooo ffffeeeeeeeeddddbbbbaaaacccckkkk eeeeqqqquuuuaaaallllssss nnnnoooo cccchhhhaaaannnnggggeeeessss!!!!

     _A_d_d_i_t_i_o_n_s _t_o _t_h_i_s _d_o_c_u_m_e_n_t _a_r_e _a_l_s_o _w_e_l_c_o_m_e_.

     Alex van den Bogaerdt <alex@ergens.op.het.net>

     WWWWhhhhyyyy tttthhhhiiiissss ttttuuuuttttoooorrrriiiiaaaallll ????

     One of the powerful parts of RRDtool is its ability to do
     all sorts of calculations on the data retrieved from its
     databases. However, RRDtool's many options and its syntax
     make it difficult for the average user to understand. The
     manuals are good at explaining what these options do;
     however, they do not (and should not) explain in detail why
     they are useful. As with my RRDtool tutorial: if you want a
     simple document in simple language, you should read this
     tutorial. If you are happy with the official documentation,
     you may find this document too simple or even boring. If you
     do choose to read this tutorial, I also expect you to have
     read and fully understood my other tutorial.

     MMMMoooorrrreeee rrrreeeeaaaaddddiiiinnnngggg

     If you have difficulties with the way I try to explain
     things, please read Steve Rader's rpntutorial manpage. It
     may help you understand how this all works.

WWWWhhhhaaaatttt aaaarrrreeee CCCCDDDDEEEEFFFFssss ????
     When retrieving data from an RRD, you use a "DEF" to work
     with that data. Think of it as a variable that changes over
     time (where time is the x-axis). The value of this variable
     is what is found in the database at that particular time and
     you cannot modify it. This is what CDEFs are for: they take
     values from DEFs and perform calculations on them.

SSSSyyyynnnnttttaaaaxxxx
        DEF:var_name_1=some.rrd:ds_name:CF
        CDEF:var_name_2=RPN_expression

     You first define "var_name_1" to be data collected from data
     source "ds_name" found in RRD "some.rrd" with consolidation
     function "CF".

     Assume the ifInOctets SNMP counter is saved in mrtg.rrd as
     the DS "in".  Then the following DEF defines a variable for
     the average of that data source:

2001-02-20             Last change: 1.0.33                      1

        DEF:inbytes=mrtg.rrd:in:AVERAGE

     Say you want to display bits per second (instead of bytes
     per second as stored in the database).  You have to define a
     calculation (hence "CDEF") on variable "inbytes" and use
     the resulting variable (inbits) instead of the original:

        CDEF:inbits=inbytes,8,*

     This tells RRDtool to multiply inbytes by eight to get
     inbits. I'll explain later how this works. In the graphing
     or printing functions, you can now use inbits where you
     would otherwise use inbytes.

     Note that the variable name in the CDEF (inbits) must not be
     the same as the variable name (inbytes) in the DEF!

RRRRPPPPNNNN----eeeexxxxpppprrrreeeessssssssiiiioooonnnnssss
     RPN is short-hand for Reverse Polish Notation. It works as
     follows.  You put the variables or numbers on a stack. You
     also put operations (things-to-do) on the stack and this
     stack is then processed. The result will be placed on the
     stack. At the end, there should be exactly one number left:
     the outcome of the series of operations. If there is not
     exactly one number left, rrdtool will complain loudly.

     The multiplication by eight above will look like:

     1.  Start with an empty stack

     2.  Put the content of variable inbytes on the stack

     3.  Put the number eight on the stack

     4.  Put the operation multiply on the stack

     5.  Process the stack

     6.  Retrieve the value from the stack and put it in variable
         inbits

     We will now do an example with real numbers. Suppose the
     variable inbytes has the value 10; the stack would then be:

     1.  ||

     2.  |10|

     3.  |10|8|

     4.  |10|8|*|

     5.  |80|

     6.  ||

     Processing the stack (step 5) will retrieve one value from
     the stack (from the right at step 4). This is the operation
     multiply and this takes two values off the stack as input.
     The result is put back on the stack (the value 80 in this
     case). For multiplication the order doesn't matter but for
     other operations like subtraction and division it does.
     Generally speaking you have the following order:

        y = A - B  -->  y=minus(A,B)  -->  CDEF:y=A,B,-

     This is not very intuitive (at least most people don't think
     so). For the function f(A,B) you reverse the position of "f"
     but you do not reverse the order of the variables.
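
     The stack handling described above can be tried out away
     from rrdtool. What follows is a minimal Python sketch, not
     rrdtool's own code: the function name rpn and the dictionary
     that holds the variables are made up for this illustration,
     and only the four basic arithmetic operators are supported.

```python
import operator

# Only the four basic arithmetic operators; rrdtool knows many more.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def rpn(expression, variables):
    """Evaluate a comma-separated RPN expression for one sample."""
    stack = []
    for token in expression.split(","):
        if token in OPS:
            b = stack.pop()          # the value pushed last
            a = stack.pop()          # the value pushed before it
            stack.append(OPS[token](a, b))
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("expected exactly one value, got %d" % len(stack))
    return stack[0]

print(rpn("inbytes,8,*", {"inbytes": 10}))   # inbytes * 8
print(rpn("A,B,-", {"A": 7, "B": 2}))        # A - B, not B - A
```

     The operator takes the value pushed last as its second
     argument, which is why "A,B,-" means A minus B.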

CCCCoooonnnnvvvveeeerrrrttttiiiinnnngggg yyyyoooouuuurrrr wwwwiiiisssshhhheeeessss ttttoooo RRRRPPPPNNNN
     First, get a clear picture of what you want to do. Break
     down the problem into smaller portions until they cannot be
     split any further. Then it is rather simple to convert your
     ideas into RPN.

     Suppose you have several RRDs and would like to add up some
     counters in them. These could be, for instance, the counters
     for every WAN link you are monitoring.

     You have:

        router1.rrd with link1in link2in
        router2.rrd with link1in link2in
        router3.rrd with link1in link2in

     Suppose you would like to add up all these counters, except
     for link2in inside router2.rrd. You need to do:

     (in this example, "router1.rrd:link1in" means the DS link1in
     inside the RRD router1.rrd)

        router1.rrd:link1in
        router1.rrd:link2in
        router2.rrd:link1in
        router3.rrd:link1in
        router3.rrd:link2in
        --------------------   +
        (outcome of the sum)

     As a mathematical function, this could be written:

     `add(router1.rrd:link1in , router1.rrd:link2in ,
     router2.rrd:link1in , router3.rrd:link1in ,
     router3.rrd:link2in)'

     With RRDtool and RPN, first, define the inputs:

        DEF:a=router1.rrd:link1in:AVERAGE
        DEF:b=router1.rrd:link2in:AVERAGE
        DEF:c=router2.rrd:link1in:AVERAGE
        DEF:d=router3.rrd:link1in:AVERAGE
        DEF:e=router3.rrd:link2in:AVERAGE

     Now, the mathematical function becomes: `add(a,b,c,d,e)'

     In RPN, there's no operator that sums more than two values
     so you need to do several additions. You add a and b, add c
     to the result, add d to the result and add e to the result.

        push a:         a     stack contains the value of a
        push b and add: b,+   stack contains the result of a+b
        push c and add: c,+   stack contains the result of a+b+c
        push d and add: d,+   stack contains the result of a+b+c+d
        push e and add: e,+   stack contains the result of a+b+c+d+e

     What was calculated here would be written down as:

        ( ( ( (a+b) + c) + d) + e)

     This is in RPN:  `CDEF:result=a,b,+,c,+,d,+,e,+'

     This is correct but it can be made clearer to humans. It
     does not matter if you add a to b and then add c to the
     result or first add b to c and then add a to the result.
     This makes it possible to rewrite the RPN into
     `CDEF:result=a,b,c,d,e,+,+,+,+' which is evaluated
     differently:

        push value of variable a on the stack: a
        push value of variable b on the stack: a b
        push value of variable c on the stack: a b c
        push value of variable d on the stack: a b c d
        push value of variable e on the stack: a b c d e
        push operator + on the stack:          a b c d e +
        and process it:                        a b c P   (where P == d+e)
        push operator + on the stack:          a b c P +
        and process it:                        a b Q     (where Q == c+P)
        push operator + on the stack:          a b Q +
        and process it:                        a R       (where R == b+Q)
        push operator + on the stack:          a R +
        and process it:                        S         (where S == a+R)

     As you can see the RPN expression `a,b,c,d,e,+,+,+,+' will
     evaluate to `((((d+e)+c)+b)+a)' and it has the same outcome
     as `a,b,+,c,+,d,+,e,+'. According to Steve Rader this is
     called the commutative law of addition but you may forget
     this right away, as long as you remember what it represents.

     Now look at an expression that contains a multiplication:

     First in normal math: `let result = a+b*c'. In this case you
     can't choose the order yourself; you have to start with the
     multiplication and then add a to it. You may swap the
     positions of b and c, but you may not swap the positions of
     a and b.

     You have to take this into consideration when converting
     this expression into RPN. Read it as: "Add the outcome of
     b*c to a" and then it is easy to write the RPN expression:
     `result=a,b,c,*,+'. Another expression that would return the
     same: `result=b,c,*,a,+'.

     In normal math, you may encounter something like "a*(b+c)"
     and this can also be converted into RPN. The parentheses
     just tell you to first add b and c, and then multiply a by
     the result. Again, now it is easy to write it in RPN:
     `result=a,b,c,+,*'. Note that this is very similar to one of
     the expressions in the previous paragraph; only the
     multiplication and the addition changed places.

     When you have problems with RPN or when rrdtool is
     complaining, it's usually a Good Thing to write down the
     stack on a piece of paper and see what happens. Have the
     manual ready and pretend to be rrdtool.  Just do all the
     math by hand to see what happens; I'm sure this will solve
     most, if not all, problems you encounter.
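
     Writing down the stack can also be left to a small script.
     The Python sketch below is only an illustration (the
     function name trace and the sample values for a through e
     are assumptions, and it is not how rrdtool is implemented).
     It records the stack after every token for the expressions
     discussed in this chapter.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def trace(expression, variables):
    """Return the final value and the stack contents after every token."""
    stack, history = [], []
    for token in expression.split(","):
        if token in OPS:
            b, a = stack.pop(), stack.pop()   # b was pushed last
            stack.append(OPS[token](a, b))
        else:
            stack.append(variables[token])
        history.append(list(stack))
    return stack[0], history

# Made-up sample values, one per variable.
values = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}

for expr in ("a,b,+,c,+,d,+,e,+", "a,b,c,d,e,+,+,+,+",
             "a,b,c,*,+", "b,c,*,a,+", "a,b,c,+,*"):
    result, history = trace(expr, values)
    print(expr, "=", result)
    for stack in history:
        print("   ", stack)
```

     The first two expressions end with the same value, and so do
     the third and fourth, while the stacks in between differ.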

SSSSoooommmmeeee ssssppppeeeecccciiiiaaaallll nnnnuuuummmmbbbbeeeerrrrssss
     TTTThhhheeee uuuunnnnkkkknnnnoooowwwwnnnn vvvvaaaalllluuuueeee

     Sometimes collecting your data will fail. This can be very
     common, especially when querying over busy links. RRDtool
     can be configured to allow for one (or even more) unknown
     values and calculate the missing update. You can, for
     instance, query your device every minute. This creates one
     so-called PDP (primary data point) per minute. If you
     defined your RRD to contain an RRA that stores 5-minute
     values, you need five of those PDPs to create one CDP
     (consolidated data point).  These PDPs can become unknown in
     two cases:

     1.  The updates are too far apart. This is tuned using the
         "heartbeat" setting

     2.  The update was set to unknown on purpose by inserting no
         value (using the template option) or by using "U" as the
         value to insert.

     When a CDP is calculated, another mechanism determines if
     this CDP is valid or not. If there are too many PDPs
     unknown, the CDP is unknown as well.  This is determined by
     the xff factor. Please note that one unknown counter update
     can result in two unknown PDPs! If you only allow for one
     unknown PDP per CDP, this makes the CDP go unknown!

     Suppose the counter increments by one per second and you
     retrieve it every minute:

        counter value    resulting rate
        10000
        10060            1; (10060-10000)/60 == 1
        10120            1; (10120-10060)/60 == 1
        unknown          unknown; you don't know the last value
        10240            unknown; you don't know the previous value
        10300            1; (10300-10240)/60 == 1

     If the CDP were to be calculated from the last five updates,
     it would get two unknown PDPs and three known PDPs. If xff
     had been set to 0.5, which by the way is a commonly used
     factor, the CDP would have a known value of 1. If xff had
     been set to 0.2, the resulting CDP would be unknown.

     You have to decide the proper values for heartbeat, number
     of PDPs per CDP and the xff factor. As you can see from the
     previous text they define the behavior of your RRA.
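
     The arithmetic of the counter table and the xff decision
     can be replayed in a few lines of Python. This is a rough
     sketch under assumptions, not rrdtool's algorithm: NaN
     stands in for the unknown value, the helper names rates and
     consolidate are invented, the CDP is a plain average of the
     known PDPs, and the exact comparison against xff is a guess
     that merely reproduces the two outcomes described above.

```python
import math

UNKNOWN = float("nan")   # stand-in for rrdtool's unknown value

def rates(counters, step):
    """Turn successive counter readings into per-second rates (PDPs)."""
    result = []
    for previous, current in zip(counters, counters[1:]):
        if math.isnan(previous) or math.isnan(current):
            result.append(UNKNOWN)       # one bad reading spoils two rates
        else:
            result.append((current - previous) / step)
    return result

def consolidate(pdps, xff):
    """Average the known PDPs unless too large a share is unknown."""
    known = [p for p in pdps if not math.isnan(p)]
    unknown_share = 1 - len(known) / len(pdps)
    if unknown_share > xff or not known:
        return UNKNOWN
    return sum(known) / len(known)

counters = [10000, 10060, 10120, UNKNOWN, 10240, 10300]
pdps = rates(counters, 60)
print(pdps)                      # two of the five rates are unknown
print(consolidate(pdps, 0.5))    # enough known PDPs
print(consolidate(pdps, 0.2))    # too many unknown PDPs
```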

     WWWWoooorrrrkkkkiiiinnnngggg wwwwiiiitttthhhh uuuunnnnkkkknnnnoooowwwwnnnn ddddaaaattttaaaa iiiinnnn yyyyoooouuuurrrr ddddaaaattttaaaabbbbaaaasssseeee

     As you have read in the previous chapter, entries in an RRA
     can be set to the unknown value. If you do calculations with
     this type of value, the result has to be unknown too. This
     means that an expression such as `result=a,b,+' will be
     unknown if either a or b is unknown.  It would be wrong to
     just ignore the unknown value and return the value of the
     other parameter. By doing so, you would assume "unknown"
     means "zero" and this is not true.

     There has been a case where somebody had been collecting
     data for over a year.  A new piece of equipment was
     installed, a new RRD was created and the scripts were
     changed to add a counter from the old database and a counter
     from the new database. The result was disappointing: a large
     part of the statistics seemed to have vanished mysteriously.
     Of course they had not; values from the old database (known
     values) were added to values from the new database (unknown
     values) and the result was unknown.

     In this case, it is fairly reasonable to use a CDEF that
     alters unknown data into zero. The counters of the device
     were unknown (after all, it wasn't installed yet!) but you
     know that the data rate through the device had to be zero
     (because of the same reason: it was not installed).

     There are some examples further on that make this change.

     IIIInnnnffffiiiinnnniiiittttyyyy

     Infinite data is another form of a special number. It cannot
     be graphed because by definition you would never reach the
     infinite value. You could think of positive and negative
     infinity (I'm not sure if mathematicians will agree)
     depending on the position relative to zero.

     RRDtool is capable of representing (-not- graphing!)
     infinity by stopping at its current maximum (for positive
     infinity) or minimum (for negative infinity) without knowing
     this maximum (minimum).

     Infinity in rrdtool is mostly used to draw an AREA without
     knowing its vertical dimensions. You can think of it as
     drawing an AREA with an infinite height and displaying only
     the part that is visible in the current graph. This is
     probably a good way to approximate infinity and it sure
     allows for some neat tricks. See below for examples.

     WWWWoooorrrrkkkkiiiinnnngggg wwwwiiiitttthhhh uuuunnnnkkkknnnnoooowwwwnnnn ddddaaaattttaaaa aaaannnndddd iiiinnnnffffiiiinnnniiiittttyyyy

     Sometimes you would like to discard unknown data and pretend
     it is zero (or any other value for that matter) and
     sometimes you would like to pretend that known data is
     unknown (to discard known-to-be-wrong data).  This is why
     CDEFs have support for unknown data. There are also examples
     available that show unknown data by using infinity.

SSSSoooommmmeeee eeeexxxxaaaammmmpppplllleeeessss
     EEEExxxxaaaammmmpppplllleeee:::: uuuussssiiiinnnngggg aaaa rrrreeeecccceeeennnnttttllllyyyy ccccrrrreeeeaaaatttteeeedddd RRRRRRRRDDDD

     You have been keeping statistics on your router for over a
     year now. Recently you installed an extra router and you
     would like to show the combined throughput for these two
     devices.

     If you just add up the counters from router.rrd and
     router2.rrd, you will add known data (from router.rrd) to
     unknown data (from router2.rrd) for the bigger part of your
     stats. You could solve this in a few ways:

     o   While creating the new database, fill it with zeros from
         the start to now.  You have to make the database start
         at or before the least recent time in the other
         database.

     o   Alternately you could use CDEF and alter unknown data to
         zero.

     Both methods have their pros and cons. The first method is
     troublesome and if you want to do that you have to figure it
     out yourself. It is not possible to create a database filled
     with zeros; you have to put them in on purpose. Implementing
     the second method is described next:

     What we want is: "if the value is unknown, replace it with
     zero". This could be written in pseudo-code as:  if (value
     is unknown) then (zero) else (value). When reading the
     rrdgraph manual you notice the "UN" function that returns
     zero or one. You also notice the "IF" function that takes
     zero or one as input.

     First look at the "IF" function. It takes three values from
     the stack, the first value is the decision point, the second
     value is returned to the stack if the evaluation is "true"
     and if not, the third value is returned to the stack. We
     want the "UN" function to decide what happens so we combine
     those two functions in one CDEF.

     Let's write down the two possible paths for the "IF"
     function:

        if true  return a
        if false return b

     In RPN:  `result=x,a,b,IF' where "x" is either true or
     false.

     Now we have to fill in "x", this should be the "(value is
     unknown)" part and this is in RPN:  `result=value,UN'

     We now combine them: `result=value,UN,a,b,IF' and when we
     fill in the appropriate things for "a" and "b" we're
     finished:

     `CDEF:result=value,UN,0,value,IF'

     You may want to read Steve Rader's RPN guide if you have
     difficulties with the way I explained this last example.

     If you want to check this RPN expression, just mimic
     rrdtool's behavior:

        For any known value, the expression evaluates as follows:
        CDEF:result=value,UN,0,value,IF  (value,UN) is not true so it becomes 0
        CDEF:result=0,0,value,IF         "IF" will return the 3rd value
        CDEF:result=value                The known value is returned

        For the unknown value, this happens:
        CDEF:result=value,UN,0,value,IF  (value,UN) is true so it becomes 1
        CDEF:result=1,0,value,IF         "IF" sees 1 and returns the 2nd value
        CDEF:result=0                    Zero is returned

     Of course, if you would like to see another value instead of
     zero, you can use that other value.

     Eventually, when all unknown data is removed from the RRD,
     you may want to remove this rule so that unknown data is
     properly displayed.
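
     Mimicking rrdtool's behavior for this expression can also be
     done in Python. The sketch below is an illustration only:
     NaN plays the role of the unknown value, and the tiny rpn
     function (an invented name) handles just "UN", "IF",
     variables and numbers in the way this tutorial describes
     them.

```python
import math

UNKNOWN = float("nan")   # stand-in for rrdtool's unknown value

def rpn(expression, variables):
    """Evaluate an RPN expression that may use UN and IF."""
    stack = []
    for token in expression.split(","):
        if token == "UN":
            value = stack.pop()
            stack.append(1 if math.isnan(value) else 0)
        elif token == "IF":
            if_false = stack.pop()       # third value
            if_true = stack.pop()        # second value
            condition = stack.pop()      # first value, the decision point
            stack.append(if_true if condition != 0 else if_false)
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    return stack[0]

expression = "value,UN,0,value,IF"
print(rpn(expression, {"value": 42.0}))      # known value is kept
print(rpn(expression, {"value": UNKNOWN}))   # unknown becomes zero
```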

     EEEExxxxaaaammmmpppplllleeee:::: bbbbeeeetttttttteeeerrrr hhhhaaaannnnddddlllliiiinnnngggg ooooffff uuuunnnnkkkknnnnoooowwwwnnnn ddddaaaattttaaaa,,,, bbbbyyyy uuuussssiiiinnnngggg ttttiiiimmmmeeee

     The above example has one drawback. If you do log unknown
     data in your database after installing your new equipment,
     it will also be translated into zero and therefore you won't
     see that there was a problem. This is not good and what you
     really want to do is:

     o   If there is unknown data, look at the time that this
         sample was taken

     o   If the unknown value is before time xxx, make it zero

     o   If it is after time xxx, leave it as unknown data

     This is doable: you can compare the time that the sample was
     taken to some known time. Assume you started to monitor your
     device on Friday September 17, 00:35:57 MET DST. Translate
     this time into seconds since 1970-01-01 and it becomes
     937521357. If you process unknown values that were received
     after this time, you want to leave them unknown and if they
     were "received" before this time, you want to translate them
     into zero (so you can effectively ignore them while adding
     them to your other router's counters).

     Translating Friday September 17, 00:35:57 MET DST into
     937521357 can be done by, for instance, using GNU date:

        date -d "19990917 00:35:57" +%s

     You could also dump the database and see where the data
     starts to be known. There are several other ways of doing
     this, just pick one.
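
     One of those other ways is a few lines of Python. This is
     just a sketch; it assumes that MET DST means a fixed offset
     of two hours ahead of UTC and that the year is 1999, as in
     the date command above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: MET DST (Central European Summer Time) is UTC+2.
met_dst = timezone(timedelta(hours=2))
start = datetime(1999, 9, 17, 0, 35, 57, tzinfo=met_dst)

print(int(start.timestamp()))   # seconds since 1970-01-01 00:00:00 UTC
```

     Unlike the date command, this does not depend on the time
     zone of the machine it runs on.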

     Now we have to create the magic that allows us to process
     unknown values differently depending on the time that the
     sample was taken.  This is a three-step process:

     1.  If the timestamp of the value is after 937521357, leave
         it as is

     2.  If the value is a known value, leave it as is

     3.  Change the unknown value into zero.

     Let's look at part one:

         if (true) return the original value

     We rewrite this:

         if (true) return "a"
         if (false) return "b"

     We need to calculate true or false from step 1. There is a
     function available that returns the timestamp for the
     current sample. It is called, unsurprisingly, "TIME". This
     time has to be compared to a constant number, so we need
     "GT". The output of "GT" is true or false and this is good
     input to "IF". We want "if (time > 937521357) then (return
     a) else (return b)".

     This process was already described thoroughly in the
     previous chapter so let's do it quickly:

        if (x) then a else b
           where x represents "time>937521357"
           where a represents the original value
           where b represents the outcome of the previous example

        time>937521357       --> TIME,937521357,GT

        if (x) then a else b --> x,a,b,IF
        substitute x         --> TIME,937521357,GT,a,b,IF
        substitute a         --> TIME,937521357,GT,value,b,IF
        substitute b         --> TIME,937521357,GT,value,value,UN,0,value,IF,IF

     We end up with:
     `CDEF:result=TIME,937521357,GT,value,value,UN,0,value,IF,IF'

     This looks very complex; however, as you can see, it was not
     too hard to come up with.
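
     To convince yourself that the nested IFs do what the three
     steps ask for, it can help to write the same decision as
     ordinary code. The Python function below is a plain reading
     of the CDEF, not something rrdtool runs; the name result is
     borrowed from the CDEF and NaN again stands in for unknown.

```python
import math

UNKNOWN = float("nan")      # stand-in for rrdtool's unknown value
START = 937521357           # monitoring started at this timestamp

def result(time, value):
    """Plain-Python reading of
    TIME,937521357,GT,value,value,UN,0,value,IF,IF"""
    if time > START:                 # TIME,937521357,GT
        return value                 # outer IF, "a": leave it as is
    if math.isnan(value):            # value,UN
        return 0                     # inner IF: unknown becomes zero
    return value                     # inner IF: known value is kept

print(result(START - 60, UNKNOWN))   # before the start: zero
print(result(START + 60, UNKNOWN))   # after the start: still unknown
print(result(START - 60, 5.0))       # known values pass through
```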

     EEEExxxxaaaammmmpppplllleeee:::: PPPPrrrreeeetttteeeennnnddddiiiinnnngggg wwwweeeeiiiirrrrdddd ddddaaaattttaaaa iiiissssnnnn''''tttt tttthhhheeeerrrreeee

     Suppose you have a problem that shows up as huge spikes in
     your graph.  You know this happens and why, so you decide to
     work around the problem.  Perhaps you're using your network
     to do a backup at night and by doing so you get almost
     10Mb/s while the rest of your network activity does not
     produce numbers higher than 100kb/s.

     There are two options:

     1.  If the number exceeds 100kb/s it is wrong and you want
         it masked out by changing it into unknown

     2.  You don't want the graph to show more than 100kb/s

     Pseudo code: if (number > 100kb/s) then unknown else number
     or
     Pseudo code: if (number > 100kb/s) then 100kb/s else number.

     The second "problem" may also be solved by using the rigid
     option of rrdtool graph; however, this does not have the
     same result. In this example you can end up with a graph
     that does autoscaling. Also, if you use the numbers to
     display maxima they will be set to 100kb/s.

     We use "IF" and "GT" again. "if (x) then (y) else (z)" is
     written down as "CDEF:result=x,y,z,IF"; now fill in x, y and
     z.  For x you fill in "number greater than 100kb/s" becoming
     "number,100000,GT" (kilo is 1000 and b/s is what we
     measure!).  The "z" part is "number" in both cases and the
     "y" part is either "UNKN" for unknown or "100000" for
     100kb/s.

     The two CDEF expressions would be:

         CDEF:result=number,100000,GT,UNKN,number,IF
         CDEF:result=number,100000,GT,100000,number,IF
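
     The difference between the two expressions is easy to see
     when they are applied to a few numbers. The Python sketch
     below only mirrors their logic; the function names mask and
     clip and the sample values are made up, and NaN stands in
     for unknown.

```python
UNKNOWN = float("nan")   # stand-in for rrdtool's unknown value
LIMIT = 100000           # 100kb/s

def mask(number):
    """number,100000,GT,UNKN,number,IF"""
    return UNKNOWN if number > LIMIT else number

def clip(number):
    """number,100000,GT,100000,number,IF"""
    return LIMIT if number > LIMIT else number

# Made-up samples; the third one is the nightly backup.
samples = [20000, 95000, 10000000, 40000]

print([mask(n) for n in samples])   # the spike turns into unknown
print([clip(n) for n in samples])   # the spike is cut off at the limit
```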


     EEEExxxxaaaammmmpppplllleeee:::: wwwwoooorrrrkkkkiiiinnnngggg oooonnnn aaaa cccceeeerrrrttttaaaaiiiinnnn ttttiiiimmmmeeee ssssppppaaaannnn

     If you want a graph that spans a few weeks, but only want
     to see some router's data for one week, you need to "hide"
     the rest of the time frame. Don't ask me when this would be
     useful, it's just here for the example :)

     We need to compare the time stamp to a begin date and an end
     date.  Comparing isn't difficult:

             TIME,begintime,GE
             TIME,endtime,LE

     These two parts of the CDEF produce either 0 for false or 1
     for true.  We could now check if they are both 0 (or 1)
     using a few IF statements but, as Wataru Satoh pointed out,
     we can use the "*" or "+" functions as logical AND and
     logical OR.

     For "*", the result will be zero (false) if either one of
     the two operands is zero.  For "+", the result will only be
     false (0) when two false (0) operands are added.  Warning:
     *any* number not equal to 0 will be considered "true". This
     means that, for instance, "-1,1,+" (which should be "true or
     true") will become FALSE ...  In other words, use "+" only
     if you know for sure that you have positive numbers (or
     zero) only.

     Let's compile the complete CDEF:

             DEF:ds0=router1.rrd:ds_name:AVERAGE
             CDEF:ds0modified=TIME,begintime,GE,TIME,endtime,LE,*,ds0,UNKN,IF

     This will return the value of ds0 if both comparisons return
     true. You could also do it the other way around:

             DEF:ds0=router1.rrd:ds_name:AVERAGE
             CDEF:ds0modified=TIME,begintime,LT,TIME,endtime,GT,+,UNKN,ds0,IF

     This will return an UNKNOWN if either comparison returns
     true.
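
     The use of "*" and "+" as logical AND and OR, and the trap
     with negative numbers, can be checked with a short Python
     sketch. The helper names and the timestamps are invented for
     the illustration; the only rule taken from the text is that
     any number other than 0 counts as true.

```python
def truth(number):
    """Anything other than 0 counts as true."""
    return number != 0

def in_window(time, begintime, endtime):
    """TIME,begintime,GE,TIME,endtime,LE,*"""
    ge = 1 if time >= begintime else 0
    le = 1 if time <= endtime else 0
    return ge * le                      # "*" acts as logical AND

def outside_window(time, begintime, endtime):
    """TIME,begintime,LT,TIME,endtime,GT,+"""
    lt = 1 if time < begintime else 0
    gt = 1 if time > endtime else 0
    return lt + gt                      # "+" acts as logical OR

begin, end = 1000, 2000                 # made-up timestamps
for t in (500, 1000, 1500, 2000, 2500):
    print(t, truth(in_window(t, begin, end)),
          truth(outside_window(t, begin, end)))

print(truth(-1 + 1))    # two "true" inputs, yet their sum is 0
```

     For every timestamp exactly one of the two tests is true,
     which is why the two CDEFs select the same samples.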

     EEEExxxxaaaammmmpppplllleeee:::: YYYYoooouuuu ssssuuuussssppppeeeecccctttt ttttoooo hhhhaaaavvvveeee pppprrrroooobbbblllleeeemmmmssss aaaannnndddd wwwwaaaannnntttt ttttoooo sssseeeeeeee
     uuuunnnnkkkknnnnoooowwwwnnnn ddddaaaattttaaaa....

     Suppose you add up the number of active users on several
     terminal servers.  If one of them doesn't give an answer (or
     an incorrect one) you get "NaN" in the database ("Not a
     Number") and NaN is evaluated as Unknown.

     In this case, you would like to be alerted to it and the sum
     of the remaining values is of no value to you.

     It would be something like:

         DEF:users1=location1.rrd:onlineTS1:LAST
         DEF:users2=location1.rrd:onlineTS2:LAST
         DEF:users3=location2.rrd:onlineTS1:LAST
         DEF:users4=location2.rrd:onlineTS2:LAST
         CDEF:allusers=users1,users2,users3,users4,+,+,+

     If you now plot allusers, unknown data in one of
     users1..users4 will show up as a gap in your graph. You want
     to modify this to show a bright red line, not a gap.

     Define an extra CDEF that is unknown if all is okay and is
     infinite if there is an unknown value:

         CDEF:wrongdata=allusers,UN,INF,UNKN,IF

     "allusers,UN" will evaluate to either true or false, it is
     the (x) part of the "IF" function and it checks if allusers
     is unknown.  The (y) part of the "IF" function is set to
     "INF" (which means infinity) and the (z) part of the
     function returns "UNKN".

     The logic is: if (allusers == unknown) then return INF else
     return UNKN.

     You can now use AREA to display this "wrongdata" in bright
     red. If it is unknown (because allusers is known) then the
     red AREA won't show up.  If the value is INF (because
     allusers is unknown) then the red AREA will be filled in on
     the graph at that particular time.

        AREA:allusers#0000FF:combined user count
        AREA:wrongdata#FF0000:unknown data
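
     What the two CDEFs hand to the graph can be sketched in
     Python. This is only an illustration with made-up user
     counts; NaN stands in for unknown, and the functions are
     named after the CDEF variables for readability.

```python
import math

UNKNOWN = float("nan")   # stand-in for rrdtool's unknown value
INF = float("inf")

def allusers(*counts):
    """users1,users2,users3,users4,+,+,+ : one unknown spoils the sum."""
    return sum(counts)

def wrongdata(total):
    """allusers,UN,INF,UNKN,IF"""
    return INF if math.isnan(total) else UNKNOWN

good = allusers(12, 7, 9, 4)
bad = allusers(12, 7, UNKNOWN, 4)

print(good, wrongdata(good))   # known sum, nothing for the red AREA
print(bad, wrongdata(bad))     # unknown sum, red AREA of infinite height
```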


     SSSSaaaammmmeeee eeeexxxxaaaammmmpppplllleeee uuuusssseeeeffffuuuullll wwwwiiiitttthhhh SSSSTTTTAAAACCCCKKKKeeeedddd ddddaaaattttaaaa::::

     If you use stack in the previous example (as I would do)
     then you don't add up the values. Therefore, there is no
     relationship between the four values and you don't get a
     single value to test.  Suppose users3 would be unknown at
     one point in time: users1 is plotted, users2 is stacked on
     top of users1, users3 is unknown and therefore nothing
     happens, users4 is stacked on top of users2.  Add the extra
     CDEFs anyway and use them to overlay the "normal" graph:

        DEF:users1=location1.rrd:onlineTS1:LAST
        DEF:users2=location1.rrd:onlineTS2:LAST
        DEF:users3=location2.rrd:onlineTS1:LAST
        DEF:users4=location2.rrd:onlineTS2:LAST
        CDEF:allusers=users1,users2,users3,users4,+,+,+
        CDEF:wrongdata=allusers,UN,INF,UNKN,IF
        AREA:users1#0000FF:users at ts1
        STACK:users2#00FF00:users at ts2
        STACK:users3#00FFFF:users at ts3
        STACK:users4#FFFF00:users at ts4
        AREA:wrongdata#FF0000:unknown data

     If there is unknown data in one of users1..users4, the
     "wrongdata" AREA will be drawn and because it starts at the
     X-axis and has infinite height it will effectively overwrite
     the STACKed parts.
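
     Why a single test on the sum is enough can be sketched in a
     few lines of Python. This is an illustration only: it
     assumes that unknown behaves like a floating point NaN, so
     that one unknown input makes the whole sum unknown, which
     is the behaviour described above. The function names and
     the user counts are invented for the example.

```python
import math

UNKN = math.nan  # assumption: NaN stands in for "unknown"


def allusers(users1, users2, users3, users4):
    """Model of "users1,users2,users3,users4,+,+,+" for one sample."""
    return users1 + users2 + users3 + users4


def overlay_drawn(total):
    """True where "allusers,UN,INF,UNKN,IF" would return INF."""
    return math.isnan(total)


complete = allusers(3.0, 5.0, 2.0, 4.0)
partial = allusers(3.0, 5.0, UNKN, 4.0)   # users3 is unknown

print(complete, overlay_drawn(complete))
print(partial, overlay_drawn(partial))
```

     With all four values known the sum is a normal number and
     no overlay is drawn. With users3 unknown the sum is unknown
     as well, so the overlay covers the three remaining bars.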

     You could combine the two CDEF lines into one (we don't use
     "allusers") if you like.  But there are good reasons for
     writing two CDEFs:

     o   It improves the readability of the script

     o   "allusers" can be used inside GPRINT to display the
         total number of users

     If you choose to combine them, you can substitute the
     "allusers" in the second CDEF with the part after the equal
     sign from the first line:

        CDEF:wrongdata=users1,users2,users3,users4,+,+,+,UN,INF,UNKN,IF

     If you do so, you won't be able to use these next GPRINTs:

        COMMENT:"Total number of users seen"
        GPRINT:allusers:MAX:"Maximum: %6.0lf"
        GPRINT:allusers:MIN:"Minimum: %6.0lf"
        GPRINT:allusers:AVERAGE:"Average: %6.0lf"
        GPRINT:allusers:LAST:"Current: %6.0lf\n"


TTTThhhheeee eeeexxxxaaaammmmpppplllleeeessss ffffrrrroooommmm tttthhhheeee rrrrrrrrdddd ggggrrrraaaapppphhhh mmmmaaaannnnuuuuaaaallll ppppaaaaggggeeee
     DDDDeeeeggggrrrreeeeeeeessss CCCCeeeellllcccciiiiuuuussss vvvvssss.... DDDDeeeeggggrrrreeeeeeeessss FFFFaaaahhhhrrrreeeennnnhhhheeeeiiiitttt

        rrdtool graph demo.gif --title="Demo Graph" \
           DEF:cel=demo.rrd:exhaust:AVERAGE \
           CDEF:far=cel,9,*,5,/,32,+ \
           LINE2:cel#00a000:"D. Celsius" \
           LINE2:far#ff0000:"D. Fahrenheit\c"

     This example gets the DS called "exhaust" from database
     "demo.rrd" and puts the values in variable "cel". The CDEF
     used is evaluated as follows:

        CDEF:far=cel,9,*,5,/,32,+
        1. push variable "cel"
        2. push 9
        3. push function "multiply" and process it
           The stack now contains values that are 9 times "cel"
        4. push 5
        5. push function "divide" and process it
           The stack now contains "cel*9/5"
        6. push 32
        7. push function "plus" and process it
        8. the resulting value is now "cel*9/5+32"

     This is the Celsius to Fahrenheit formula, F = 9/5*C+32.
     The opposite direction, Fahrenheit to Celsius, is
     "5/9*(far-32)", which in RPN would be "far,32,-,5,*,9,/".
     Writing 5/9 as a constant such as 0.55555 saves a
     calculation but is not exactly correct.
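
     The push-and-process steps can be followed with a very
     small RPN evaluator. The Python sketch below is an
     illustration and not how RRDtool is implemented: it only
     knows the four arithmetic operators and it works on a
     single number, where RRDtool applies the expression to
     every sample. The function name rpn and the test
     temperature are assumptions made for the example. The two
     expressions are the standard conversions in both
     directions.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def rpn(expression, variables):
    """Evaluate a comma-separated RPN string for one sample."""
    stack = []
    for token in expression.split(","):
        if token in OPERATORS:
            right = stack.pop()            # the value pushed last
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(variables[token])  # push a variable
        else:
            stack.append(float(token))      # push a constant
    return stack.pop()


# Celsius to Fahrenheit, then back again as a check.
fahrenheit = rpn("cel,9,*,5,/,32,+", {"cel": 100.0})
celsius = rpn("far,32,-,5,*,9,/", {"far": fahrenheit})
print(fahrenheit, celsius)
```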

     CCCChhhhaaaannnnggggiiiinnnngggg uuuunnnnkkkknnnnoooowwwwnnnn iiiinnnnttttoooo zzzzeeeerrrroooo

        rrdtool graph demo.gif --title="Demo Graph" \
           DEF:idat1=interface1.rrd:ds0:AVERAGE \
           DEF:idat2=interface2.rrd:ds0:AVERAGE \
           DEF:odat1=interface1.rrd:ds1:AVERAGE \
           DEF:odat2=interface2.rrd:ds1:AVERAGE \
           CDEF:agginput=idat1,UN,0,idat1,IF,idat2,UN,0,idat2,IF,+,8,* \
           CDEF:aggoutput=odat1,UN,0,odat1,IF,odat2,UN,0,odat2,IF,+,8,* \
           AREA:agginput#00cc00:"Input Aggregate" \
           LINE1:aggoutput#0000FF:"Output Aggregate"

     These two CDEFs are built from several functions. It helps
     to split them when viewing what they do.  Starting with the
     first CDEF we would get:

           idat1,UN --> a
           0        --> b
           idat1    --> c
           if (a) then (b) else (c)

     The result is therefore "0" if it is true that "idat1" is
     unknown.  If not, the original value of "idat1" is put back
     on the stack.  Let's call this answer "d". The next five
     items in the expression are processed in the same way and
     return answer "h". The resulting stack is therefore "d,h".
     The expression has been simplified to "d,h,+,8,*" and it is
     now easy to see that we add "d" and "h", and multiply the
     result by eight.

     The end result is that we have added "idat1" and "idat2" and
     in the process we effectively ignored unknown values. The
     result is multiplied by eight, most likely to convert
     bytes/s to bits/s.
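
     The same simplification can be written out in Python. This
     is only a sketch under the assumption that unknown is
     modelled as NaN; the helper names and the byte rates are
     hypothetical. The intermediate names d and h match the
     answers used in the explanation above.

```python
import math

UNKN = math.nan  # assumption: NaN stands in for "unknown"


def zero_if_unknown(value):
    """Model of "x,UN,0,x,IF": 0 if x is unknown, else x."""
    return 0.0 if math.isnan(value) else value


def aggregate_bits(idat1, idat2):
    """Model of the "agginput" CDEF for one sample."""
    d = zero_if_unknown(idat1)   # idat1,UN,0,idat1,IF
    h = zero_if_unknown(idat2)   # idat2,UN,0,idat2,IF
    return (d + h) * 8           # d,h,+,8,*


print(aggregate_bits(1000.0, 250.0))  # both interfaces known
print(aggregate_bits(1000.0, UNKN))   # second interface unknown
print(aggregate_bits(UNKN, UNKN))     # nothing known
```

     Note that when both inputs are unknown the result is zero,
     not unknown, so the graph cannot tell "no traffic" from "no
     data".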

     IIIInnnnffffiiiinnnniiiittttyyyy ddddeeeemmmmoooo

        rrdtool graph example.png --title="INF demo" \
           DEF:val1=some.rrd:ds0:AVERAGE \
           DEF:val2=some.rrd:ds1:AVERAGE \
           DEF:val3=some.rrd:ds2:AVERAGE \
           DEF:val4=other.rrd:ds0:AVERAGE \
           CDEF:background=val4,POP,TIME,7200,%,3600,LE,INF,UNKN,IF \
           CDEF:wipeout=val1,val2,val3,val4,+,+,+,UN,INF,UNKN,IF \
           AREA:background#F0F0F0 \
           AREA:val1#0000FF:Value1 \
           STACK:val2#00C000:Value2 \
           STACK:val3#FFFF00:Value3 \
           STACK:val4#FFC000:Value4 \
           AREA:wipeout#FF0000:Unknown

     This demo demonstrates two ways to use infinity. It is a bit
     tricky to see what happens in the "background" CDEF.

        "val4,POP,TIME,7200,%,3600,LE,INF,UNKN,IF"

     This RPN takes the value of "val4" as input and then
     immediately removes it from the stack using "POP". The stack
     is now empty but as a side effect we now know the time that
     this sample was taken.  This time is put on the stack by the
     "TIME" function.

     "TIME,7200,%" takes the modulo of time and 7200 (which is
     two hours).  The resulting value on the stack will be a
     number in the range from 0 to 7199.

     For people who don't know the modulo function: it is the
     remainder after an integer division. If you divide 16 by 3,
     the answer would be 5 and the remainder would be 1. So,
     "16,3,%" returns 1.

     We have the result of "TIME,7200,%" on the stack, let's call
     this "a". The start of the RPN has become "a,3600,LE" and
     this checks if "a" is less than or equal to "3600". It is
     true half of the time.  We now have to process the rest of
     the RPN and this is only a simple "IF" function that returns
     either "INF" or "UNKN" depending on the time. This is
     returned to variable "background".
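
     The time-dependent part of the "background" CDEF can be
     tried out in Python. This is a sketch, not RRDtool code:
     it assumes timestamps are whole seconds and models unknown
     as NaN. The function name background_value is made up for
     the example.

```python
import math

UNKN = math.nan  # assumption: NaN stands in for "unknown"
INF = math.inf


def background_value(timestamp):
    """Model of "TIME,7200,%,3600,LE,INF,UNKN,IF" for one sample."""
    a = timestamp % 7200                 # TIME,7200,%
    return INF if a <= 3600 else UNKN    # a,3600,LE,INF,UNKN,IF


# The modulo example from the text: 16 divided by 3 leaves 1.
remainder = 16 % 3

# Count the seconds in one two-hour period that get a background.
filled = sum(1 for t in range(7200) if math.isinf(background_value(t)))
print(remainder, filled)
```

     Because "LE" also accepts 3600 itself, 3601 of the 7200
     seconds in a period are filled. That is half of the time
     for all practical purposes.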

     The second CDEF has been discussed earlier in this document
     so we won't do that here.

     Now you can draw the different layers. Start with the
     background that is either unknown (nothing to see) or
     infinite (the whole positive part of the graph gets filled).
     Next you draw the data on top of this background. It will
     overlay the background. Suppose one of val1..val4 is
     unknown: in that case you end up with only three bars
     stacked on top of each other.  You don't want to see this
     because the data is only valid when all four variables are
     valid. This is why you use the second CDEF: it overlays
     the data with an AREA so the data cannot be seen anymore.

     If your data can also have negative values you also need to
     overwrite the other half of your graph. This can be done in
     a relatively simple way: take the "wipeout" variable and
     multiply it by -1, which turns infinity into negative
     infinity and leaves unknown as it is:

        CDEF:wipeout2=wipeout,-1,*

OOOOuuuutttt ooooffff iiiiddddeeeeaaaassss ffffoooorrrr nnnnoooowwww
     This document was created from questions asked by either
     myself or by other people on the list. Please let me know if
     you find errors in it or if you have trouble understanding
     it. If you think there should be an addition, mail me:
     <alex@ergens.op.het.net>

     Remember: NNNNoooo ffffeeeeeeeeddddbbbbaaaacccckkkk eeeeqqqquuuuaaaallllssss nnnnoooo cccchhhhaaaannnnggggeeeessss!!!!

SSSSEEEEEEEE AAAALLLLSSSSOOOO
     The RRDtool manpages

AAAAUUUUTTTTHHHHOOOORRRR
     Alex van den Bogaerdt <alex@ergens.op.het.net>