- implement remote cases of unw_resume()

- implement unw_resume() for the case where the current register frame is split
  across multiple backing stores

- document restrictions on using unw_resume():
	- can resume only to routines expecting it (such as routines with
	  exception handlers)
	- cannot set non-preserved state, other than "exception argument"
	  registers => cannot unw_resume() into routines using special
	  register-usage conventions, such as routines using an agreed-upon
	  scratch reg for storing some value

- allow region-length (insn_count) in unw_dyn_region_info_t to be negative
  to indicate counting from the end of the procedure (to make it possible
  for differently-sized procedures to share the same region list if they
  share the same prologue/epilogue).
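
A minimal sketch of how a negative region length could be resolved, under one
assumed reading of the item above: a length of -n means "this region ends n
instructions before the end of the procedure". The helper name
resolve_region_len() and its parameters are hypothetical and not part of
libunwind; only the signed insn_count field is taken from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libunwind function): turn a possibly negative
   region length into an absolute instruction count.

   insn_count  the region length as stored in the region list
   remaining   number of instructions from the start of this region to the
               end of the procedure

   Assumption: a negative insn_count of -n means the region ends n
   instructions before the end of the procedure.  */
static int32_t
resolve_region_len (int32_t insn_count, int32_t remaining)
{
  assert (remaining >= 0);

  if (insn_count >= 0)
    return insn_count;		/* ordinary, absolute length */

  /* Count back from the end of the procedure; clamp so that a procedure
     shorter than the shared epilogue yields an empty region.  */
  int32_t len = remaining + insn_count;
  return len < 0 ? 0 : len;
}
```

With a shared list of { 5, -3, 3 } (prologue, body, epilogue), the body region
of a 100-instruction procedure starts with 95 instructions remaining and
resolves to 92, while in a 40-instruction procedure the same entry resolves to
32.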

- it appears that it is currently not possible to read register UNW_IA64_TP;
  fix that

- use pthread-mutexes where necessary, atomic ops where possible
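
One way to read the split between the two, sketched with made-up data: a
single-word counter can be updated with an atomic op, while a list that
several threads walk and modify is kept under a pthread mutex. All names
below (struct node, generation, list_head, list_lock, list_insert,
current_generation) are hypothetical and do not describe libunwind's actual
state.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical data, for illustration only.  */
struct node
  {
    struct node *next;
    int id;
  };

static atomic_uint generation;	/* single word: an atomic op is enough */
static struct node *list_head;	/* shared list: kept under a lock here */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Push N onto the shared list, then bump the generation counter so that
   readers can notice their cached view is stale.  */
static void
list_insert (struct node *n)
{
  pthread_mutex_lock (&list_lock);
  n->next = list_head;
  list_head = n;
  pthread_mutex_unlock (&list_lock);

  atomic_fetch_add (&generation, 1);
}

/* Lock-free read of the current generation.  */
static unsigned int
current_generation (void)
{
  return atomic_load (&generation);
}
```

Readers that only need to know whether anything changed can compare
current_generation() against a saved value without taking the lock.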

Testing:
	- ensure that saving r4-r7 in a stacked register properly preserves
	  the NaT bit, even in the face of register-rotation
	- ensure that IA64_INSN_MOVE_STACKED works correctly in the face of
	  register rotation

=== taken care of:

+ cache the value of *cfm_loc; each rotate_FOO() call needs it!
+ implement the remote-lookup of the dynamic registration list
+ when doing sigreturn, must restore fp regs (and perhaps other regs) the same
  way as the (user-level) gate.S sigreturn path does!
+ unw_resume() must at least restore gp (r1)!  consider restoring all
  scratch regs (but what's the performance impact on exception handling?);
  alternative: restore scratch regs that may be used during procedure
  call/return (e.g., r8-r11, f8-f11)
