
List:       orocos-dev
Subject:    [Orocos-Dev] Towards an UML 2.0 compliant state machine implementation
From:       Herman.Bruyninckx () mech ! kuleuven ! be (Herman Bruyninckx)
Date:       2009-02-26 8:17:05
Message-ID: alpine.DEB.2.00.0902260916460.6229 () roble
[Download RAW message or body]

On Thu, 26 Feb 2009, Markus Klotzbücher wrote:

> On Wed, Feb 25, 2009 at 09:07:53PM +0100, Herman Bruyninckx wrote:
>> On Tue, 24 Feb 2009, Markus Klotzbücher wrote:
>>
>>> Please find my suggestions and thoughts regarding the Orocos state
>>> machine redesign below. Apologies for the long email, but the topic
>>> has become quite involved :-)
>>
>> Indeed. I recently read a bit about the Computation, Communication,
>> Configuration and Coordination concerns for complex software systems, and I
>> think this is the conceptual framework we are missing in Orocos design to
>> make our ideas clear, and to provide the appropriate decoupling!
>> (It will also make Klaas happy, because his favourite "models of
>> computation" concept fits naturally within these 4Cs...)
>> I am referring to @incollection{      RadestockEisenbach1996,
>>   author          = {Radestock, Matthias and Eisenbach, Susan},
>>   title           = {Coordination in Evolving Systems},
>>   booktitle       = {Trends in Distributed Systems: CORBA and Beyond},
>>   pages           = {162--176},
>>   year            = {1996}
>> }
>>
>> So, in my answers and remarks below, I will refer to one or more of these
>> 4Cs quite often.
>>
>>> Summary:
>>> --------
>>>
>>> We want UML 2.0 style state machines for various reasons. TimeEvents,
>>> real composite states and AND-states are powerful mechanisms;
>>> furthermore we are now starting to work on integrating with various
>>> model-driven engineering tools, so being (reasonably) in line with the
>>> major standards will make this much easier.
>>>
>>> These are the major questions:
>>>
>>> - How to integrate a state machine into an Orocos component
>>> - How to deal with UML 2.0 variation points in a flexible way
>>> - How to go about scripting
>>>
>>>
>>> Integration into Orocos components
>>> -----------------------------------
>>>
>>> First of all (as discussed before) a state machine will consist of
>>> two parts: a synchronous core and an asynchronous layer around it:
>>>
>>> 1) The core layer supports a minimal interface for controlling a state
>>> machine: starting, stopping, resetting, querying status, queuing an
>>> event and requesting a step. The step call will only return once a
>>> run-to-completion step (or the maximum allowed number of supersteps)
>>> has been executed.
>>
>> The concept of a "step" is not clear, in my opinion. It does not belong to
>> the traditional (and UML standardized) FSM concepts, so it should not be in
>> the core!
>
> I disagree! Although I should rather have written "run-to-completion
> step", which is the smallest execution unit of a UML state machine
> (see UML Superstructure spec, pg. 563). Quoting from that page:
>
> "The processing of a single event occurrence by a state machine is
> known as a run-to-completion step."

OK, I stand corrected! So, please always use the official terminology _if_
it exists, in order to avoid confusion.

What does UML tell us about the computational aspects of "run-to-completion
steps"?

>>> 2) The asynchronous layer can be seen as the glue which ties a state
>>> machine to an Orocos component. It defines which events are delivered
>>> to a state machine queue, and when and how the state machine is
>>> advanced.
>>
>> Similar remark: "asynchronous layer" is not a well-defined concept.
>
> OK, agreed. The name might be a bad choice, but what would you call
> this? Integration layer?

"Coordination" (as one of the four Cs, together with Computation,
Communication and Configuration). Communication and Coordination deal with
asynchronicity, the others shouldn't (if this is possible...).

>>> I identified three major use cases:
>>>
>>> 1) Synchronous tracking: a state machine is to be used for tracking
>>> the state of a component and is advanced on each CallEvent
>>> received.
>> OK.
>>
>>> The state machine needs to be stepped after each method
>>> invocation.
>> Needs clarification, because of the "step" feature.
>
> A method invocation generates a UML::CallEvent. By stepping I mean
> executing one run-to-completion step, thereby processing this
> event. Of course it is not guaranteed that this event will be
> processed, as this depends on the amount of events in the queue and
> on how supersteps are dealt with.

Is this "Configuration"? What does UML say about which policies are
valid/allowed/...? Or is this a "semantic variation point"?

>>> 2) Synchronous constraining: a state machine is stepped before the
>>> requested operation is executed and is in control of actually allowing
>>> or denying it. This is the typical use case of a protocol state
>>> machine.
>> Please explain.
>
> A UML protocol state machine serves to define which operations of an
> object are allowed to be executed in which state, in which order, and
> which pre- and postconditions must hold. Transitions are labeled with
> precondition, CallEvent, postcondition.

OK, this is all standard UML-speak? Do the precondition and postcondition
also involve their own states/transitions/steps/...?
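From that description, a minimal protocol state machine gating operations
could be sketched like this (all names are invented for illustration;
nothing here is existing UML metamodel or RTT code): an operation is denied
unless the current state has a transition labeled with it.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical protocol state machine: (state, operation) -> next state.
class ProtocolSm {
public:
    void allow(const std::string& state, const std::string& op,
               const std::string& next) {
        table_[{state, op}] = next;
    }
    void start(const std::string& initial) { state_ = initial; }

    // Called before the real operation runs: returns false to deny it.
    bool request(const std::string& op) {
        auto it = table_.find({state_, op});
        if (it == table_.end()) return false;  // operation not legal here
        state_ = it->second;                   // the CallEvent fires the transition
        return true;
    }
    const std::string& state() const { return state_; }
private:
    std::string state_;
    std::map<std::pair<std::string, std::string>, std::string> table_;
};
```

A full protocol state machine would also evaluate the pre- and
postcondition expressions around the operation; this sketch keeps only the
"which operation, in which state, in which order" part.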

> The goal is of course separation of concerns: separate the mechanisms an
> object provides from the usage policy. Implementation-wise this could be
> generated statically or, more configurably, be evaluated at
> runtime. Above I described the latter.
>
>>> 3) Asynchronous: a state machine never gives up control, reacts
>>> immediately to events and therefore requires its own thread of
>>> execution.
>> This is a tricky "feature": asynchronicity and state machines don't go
>> together, by definition, because a state machine is about changing the
>> behaviour of _one single activity_, in a synchronous way...
>
> I get your point, and I think my choice of "asynchronous" might be
> unfortunate here again. Maybe "master" would be better.

That terminology doesn't make my understanding any deeper... :-)

> The use case I
> imagined here is a state machine in its own thread

...by definition...

> which reacts to
> events as they arrive. For instance when data arrives on a port and an
> event is generated, the state machine could read it, invoke some
> processing and write it to some out port. That would qualify as
> synchronous, wouldn't it?

Yes! And maybe we should make it _atomic_ too...? (The "Genom" line of work
from LAAS in Toulouse has taken atomic execution (of so-called "codels") as
their main Computation primitive; and I start to like it more and more, in
some contexts, such as this one of state machine execution.)
The question then is: atomic with respect to which other activities...
Probably it's enough to have atomicity with respect to processing the other
events in the same state machine: a "guaranteed commit" processing, in
database speak. Or in other words: for the event handling infrastructure, a
state machine is always in exactly one determined state, and time is
"frozen" for the event handling infrastructure while a transition is being
made.
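A sketch of this "frozen time" idea (an assumed design, not existing code):
the transition and the observable state share one mutex, so an observer
only ever sees a settled state, never one that is mid-transition.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Hypothetical atomic state holder: transitions are "guaranteed commits".
class AtomicState {
public:
    explicit AtomicState(std::string initial) : state_(std::move(initial)) {}

    // The whole transition happens inside the lock; observers block until
    // it has committed.
    void transition(const std::string& to) {
        std::lock_guard<std::mutex> lock(mtx_);
        state_ = to;
    }
    std::string current() const {
        std::lock_guard<std::mutex> lock(mtx_);
        return state_;
    }
private:
    mutable std::mutex mtx_;
    std::string state_;
};
```

In a real machine the lock would cover the exit, effect and entry
behaviours of the transition as well, which is exactly where the atomicity
question gets interesting.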


> The only difference to the other use cases
> is that the behavior of the component is determined by the set of
> events reacted to by its state machine, and not by the external trigger.

I don't understand...

> But maybe this is misuse of a state machine and really the use case
> for a script (activity diagram).

Interesting remark! I don't have a clear answer, however...

>> I do see asynchronous behaviour coordination needs in many applications,
>> but I have lost the conviction that FSMs are the way to go here.
>
> I'm curious, what would you consider better suited? Petri nets?

That's the simplest concept I know of that explicitly carries the concept
of activities, hence of asynchronicity. Like state machines, Petri nets
come in many colours and flavours, so there too we have to be able to sort
out what the 4Cs are, and what the (Communication) mechanism and
(Coordination) policy is.

>>> 1) requires the following to happen in a TaskContext:
>>> - generate a (local) CallEvent when a method is invoked and store it
>>>   into the state machine event queue.
>>> - step the state machine.
>> Similar remarks as above.
>>
>>> Probably a state machine will not be interested in _all_ CallEvents,
>>> so adding and stepping at each CallEvent would create needless
>>> overhead. A mechanism which allows a state machine to subscribe to
>>> the events of interest would solve this.
>> That's a Configuration responsibility, not something the state machine
>
> I think this depends on how tight the coupling between a component and
> a state machine is.

That should be a 100% tight coupling: the state machine defines the
behaviour of the component! Unless we discriminate between single-activity
components and multi-activity components... The latter should get a
different name, I think.
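That automatic subscription could be sketched as follows (an assumed
design, not existing code): the set of events the machine needs is exactly
the set of events naming its transitions, so the component can forward only
those and skip the rest.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical machine that derives its subscription set from its own
// transition table.
class SubscribingSm {
public:
    void addTransition(const std::string& from, const std::string& ev,
                       const std::string& to) {
        table_[{from, ev}] = to;
        subscribed_.insert(ev);  // subscribe automatically
    }
    // The component consults this before queuing a CallEvent.
    bool wants(const std::string& ev) const {
        return subscribed_.count(ev) != 0;
    }
private:
    std::map<std::pair<std::string, std::string>, std::string> table_;
    std::set<std::string> subscribed_;
};
```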

> If it's loose I would agree; one could think of
> deploying and configuring a state machine "on to" a component.
>
>> itself should be responsible for. And in addition: the state machine
>> should be automatically subscribed to the events that correspond to its
>> state transition events, shouldn't it?
>
> Yes, absolutely. My fault for not differentiating between state
> machine instance and state machine implementation. The state machine
> implementation will subscribe to the events required for processing a
> certain instance of a state machine. Or, as you suggest above, a state
> machine instance could be configured to run on a certain component.
>
>>> 2) This is a special case of subscribing to a CallEvent, but one which
>>> causes the actual invocation of the requested operation to be
>>> omitted. The state machine then determines whether or not this
>>> operation is executed, and executes it itself.
>>
>> The exact meaning of "execution" in a state machine should be defined
>> better... It's not a clear concept to me (anymore).
>
> See above.

OK.

>>> 3) Run the state machine in a TaskContext with a NonPeriodicActivity
>>> attached.
>>
>> We should get rid of this "feature": "NonPeriodicActivity" is a wrong
>> concept (I was partially responsible for its creation, sorry about that!),
>> since it (i) is a negative term (saying what something is _not_ is always
>> more confusing than saying what it _is_...) and (ii) it defines itself as
>> the negation of something which is basically a special case in itself.
>> This is terminology that opens a can of worms, as the past has proven.
>
> agreed.
>
>>> UML 2.0 variation points
>>> -------------------------
>>>
>>> There is really only one UML 2.0 variation point which is tricky,
>>> namely the ChangeEvent, which triggers a transition when a condition
>>> specified by an expression evaluates to true. Peter has expressed his
>>> concern about whether such an event can be efficiently implemented. I
>>> agree; the required conditions (read: the frequency) for evaluating
>>> this will simply be too conflicting.
>> I don't fully understand what you want to say here...
>
> A ChangeEvent is denoted on a transition as an expression which
> returns a boolean result. When this expression becomes true, the
> transition triggers. However, UML leaves it completely open how and
> when (e.g. at fixed times or continuously) this expression is
> evaluated.

OK, this is an important "semantic variation point"! (I like that name,
although it is way too "learned" :-))

And my feeling is that Klaas' models of computation, or Radestock and
Eisenbach's 4Cs, are the appropriate concepts to use to define mechanism
and policy in this respect.
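One possible ChangeEvent policy could be sketched like this (an assumption
on my part, since UML leaves the evaluation time open; all names invented):
the boolean expression is re-evaluated at explicit poll points, and the
event fires only on a false-to-true edge.

```cpp
#include <functional>
#include <utility>

// Hypothetical ChangeEvent detector: edge-triggered on its expression.
class ChangeEvent {
public:
    ChangeEvent(std::function<bool()> expr, std::function<void()> fire)
        : expr_(std::move(expr)), fire_(std::move(fire)) {}

    // Call this at whatever rate the chosen policy dictates
    // (e.g. once per run-to-completion step, or from a periodic activity).
    void poll() {
        bool now = expr_();
        if (now && !last_) fire_();  // trigger on the rising edge only
        last_ = now;
    }
private:
    std::function<bool()> expr_;
    std::function<void()> fire_;
    bool last_ = false;
};
```

The policy question then reduces to who calls poll() and how often, which
is exactly the Configuration/Coordination decision, not part of the state
machine itself.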

>>> The proposed default therefore is to evaluate this condition before
>>> the do behavior is executed. The rationale behind this is to allow the
>>> ChangeEvent to trigger a transition before the do behavior is executed
>>> (and thereby skipping its execution) while at the same time creating
>>> as little overhead as possible.
>>>
>>> For conditions that require more frequent evaluation, this
>>> event type must be realized in terms of an individual component, to
>>> which a periodic activity with the desired frequency and priority can
>>> be attached.
>> Oops, I didn't realize we had made Orocos' runtime behaviour (and
>> its Coordination) that difficult to understand... :-)
>
> Huh, what am I missing?
What am _I_ missing? :-) I really didn't (and still don't) understand that
last paragraph of yours :-)

>>> Most other variation points are rather trivial:
>>>
>>> - Order of event dequeuing: default to FIFO.
>> Why is this "trivial"? Because only FIFO makes sense?
>
> No, because only FIFO makes sense *as a default*. But for instance
> when dealing with large amounts of events (so that the event queue is
> constantly filled) some prioritization scheme could make sense.
OK.
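Such a configurable dequeue policy could look roughly like this (an
invented interface, not RTT code): each event may carry a priority, and
priority 0 for everything degenerates to plain FIFO, which is the default.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical event queue: FIFO by default, priority-aware on demand.
class EventQueue {
public:
    // Among equal priorities the lower sequence number wins, so arrival
    // order is preserved (stable FIFO behaviour).
    void push(const std::string& ev, int priority = 0) {
        q_.push(Entry(-priority, seq_++, ev));  // negate: higher priority first
    }
    std::string pop() {
        Entry top = q_.top();
        q_.pop();
        return std::get<2>(top);
    }
    bool empty() const { return q_.empty(); }
private:
    using Entry = std::tuple<int, unsigned long, std::string>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> q_;
    unsigned long seq_ = 0;
};
```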

>>> - Allow ChangeEvents to persist even after they evaluate to false:
>>>   default to true. (We don't start searching and removing events from
>>>   the queue - double checks etc. can be performed in the next state)
>> _If_ queueing takes place, it's not the state machine that should be doing
>> it.
>
> As the detection of ChangeEvents involves evaluation of expressions
> which are part of the state machine description, I assume it to be a
> reasonable choice to evaluate these by the state machine
> implementation, which in turn would be responsible for queuing the
> event. Whoever queues the event should remove it too.
Yes!

>>> Supersteps
>>> ----------
>>>
>>> Supersteps are referred to as the behavior of a state machine which
>>> transitions across multiple states by being externally triggered
>>> once. This can happen e.g. if there are multiple events in a queue.
>>
>> I don't like this concept of "supersteps"... At least not for defining the
>> behaviour and implementation of state machines: the state machine is a
>> thing that changes state when events arrive. The scheduling of the code
>> that does this is not something that belongs to the specification of the
>> state machine. But "somewhere" else... in the Coordination part of RTT
>> (which has not yet been identified and designed very well, yet...).
>
> I agree.
>
>>> There are good reasons to allow or forbid this behavior in an
>>> application. For more deterministic execution it would certainly be
>>> better to forbid supersteps. But in a different scenario it might be
>>> desirable to process a full event queue and thereby "catch up" on the
>>> work to be done.
>>
>> These are _Configuration_, which should be done outside of state machine
>> definition and specification.
>
> Yes I agree. I don't see your last two paragraphs conflicting with
> what I wrote?
Then that's fine! :-)

>>> It is therefore proposed to forbid supersteps by
>>> default, but to allow this to be overridden with a configurable maximum.
>> I don't think "forbidding" things is a good policy... We should support
>> _any_ policy on an equal basis, and maybe provide "OCL" templates that
>> motivate why certain policies are good in certain contexts.
>
> That's what I wrote, isn't it? Disallow by default but allow for
> configuration?
Okay, I now understand the nuance in your statement :-)
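The proposed policy could be sketched as follows (a hypothetical interface,
under the assumption above that superstep limits are a Configuration
concern): by default one queued event is processed per external trigger,
and a configurable maximum lets the machine "catch up" on a full queue.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical superstep-limited event processor.
class SteppedQueue {
public:
    void setMaxSupersteps(unsigned n) { max_ = n; }
    void queue(const std::string& ev) { events_.push_back(ev); }

    // Process at most max_ events; returns how many were actually handled.
    unsigned trigger() {
        unsigned handled = 0;
        while (handled < max_ && !events_.empty()) {
            events_.pop_front();  // a real machine would run one RTC step here
            ++handled;
        }
        return handled;
    }
    std::size_t pending() const { return events_.size(); }
private:
    unsigned max_ = 1;  // supersteps disallowed by default
    std::deque<std::string> events_;
};
```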

>>> Scripting
>>> ---------
>>>
>>> As explained in previous mails a state machine consists mostly of
>>> scripts (in the meaning of statements interpreted at run time).
>>> Therefore I've reached the conclusion that it would be very beneficial
>>
>> Do the scripts determine the _specification_ of the state machine, or the
>> event handling, or....?
>
> Most importantly these scripts determine the behavior of a state
> machine. An example for a script is a guard condition, which when
> evaluated determines whether a transition will be enabled or not. In a
> way the scripts also affect the specification, as the latter will have
> to adhere to the syntax of the first.
>
>>> to implement the underlying state machine logic itself in the
>>> scripting language.
>> This is a strange sentence: logic is not an "executable" concept...
>> State machines can describe Coordination, which should be decoupled from
>> Computation...
>
> Yes, but describing this coordination requires computation, namely
> computing the change in state given a certain input!
Agreed. But it would be very good if we could indefinitely keep the
abstract concept of a state machine separate from the implementation of its
(atomic?) execution.

> And while it is
> an implementation detail of whether this computation is implemented in
> Lua, C++ or whatever, this choice will have major impact on
> configurability, usability and also reusability!
Yes. Things will only be reusable if their meaning and implementation
effects are fully documented (and predictable).

>>> The main advantage of this would be a high level
>>> of flexibility for overriding and configuring semantic variation
>>> points. The main drawback is that a light-weight, hard real-time
>>> scripting language expressive enough to make this feasible currently
>>> does not exist. The problem is of course due to the inherent
>>> difficulty of hard real-time garbage collection.
>>>
>>> As lightweight, hard real-time scripting languages do not exist, an
>>> intermediate solution is required. One approach is the use of a soft
>>> real-time scripting language. Herman suggested some time ago to take a
>>> look at Lua [1], which I did. Lua is a mature, lightweight, MIT-licensed
>>> programming language especially targeted at being embedded. A key
>>> feature of Lua is the recently added soft real-time incremental garbage
>>> collector that allows fine-grained control of the collection
>>> process. Tuning these parameters makes it possible to balance temporal
>>> behavior against overall performance.
>>>
>>> Adopting Lua as the next-generation scripting language would have
>>> several benefits. First of all, it provides a rather non-intrusive
>>> migration path towards a more powerful extension language. Secondly
>>> state machines could be initially implemented in Lua, profiled (Lua
>>> has support for this) and then optimized by moving parts of the
>>> implementation to C/C++. The foreign function interface for this is
>>> very efficient and simple to use.
>>>
>>> Of course the downside of this approach would be an (initial) decrease
>>> in hard real-time capabilities. However I believe this would be
>>> acceptable because
>>>
>>> - an application that has really critical real-time requirements will
>>>   implement this particular part in C/C++ anyway.
>>>
>>> - The amount of such critical code is relatively small compared to
>>>   the amount of code for which slightly softer real time constraints
>>>   would suffice
>>>
>>> - There is more optimization possible for periodically executing
>>>   scripts and state machines. Adaptive garbage collection algorithms
>>>   could be developed for learning to precisely schedule garbage
>>>   collection within the idle period.
>>>
>>>
>>> That's it!
>>>
>>> I'm sure there are many more issues yet to be discovered, but I think
>>> it's time to cut loose. Am I given the green light?
>>
>> No! :-) These issues should be _absolutely_ clear, and currently our
>> "insights" are still coupling the 4Cs too much...
>
> I knew it ;-)
>
> Best regards
> Markus

Herman

--
   K.U.Leuven, Mechanical Eng., Mechatronics & Robotics Research Group
     <http://people.mech.kuleuven.be/~bruyninc> Tel: +32 16 322480
   Coordinator of EURON (European Robotics Research Network)
     <http://www.euron.org>
   Open Realtime Control Services <http://www.orocos.org>
