[prev in list] [next in list] [prev in thread] [next in thread] 

List:       jakarta-commons-dev
Subject:    Re: [math] major problem with new released version 3.1
From:       Gilles Sadowski <gilles () harfang ! homelinux ! org>
Date:       2012-12-30 22:26:53
Message-ID: 20121230222652.GP24843 () dusk ! harfang ! homelinux ! org
[Download RAW message or body]

Hello.

[As Phil suggested, please stop extending this thread; and rather open new
ones on specific points in order to focus the discussion(s).]

> > Yes, abd I agree with that. However, I found the javadoc to be
> > ambiguous. It says "The following data will be looked for:" followed by
> > a list. There is no distinction between optional and required
> > parameters. As an example, simple bounds are optional whereas initial
> > guess or weight are required, but there is nothing to tell it to user.
> > So in this case, either we should provide proper exception or proper
> > documentation. I am OK with both.
> >
> >> It is however straightforward to add a "checkParameters()" method that would
> >> raise more specific exceptions. [Although that would contradict the
> >> conclusion of the previous discussion about NPE in CM. And restart it,
> >> without even getting a chance to go forward with what had been decided!]
> > As long as we identify the parameters that are optional (and hence user
> > can deduce the other one are mandatory and will raise an NPE), this
> > would be fine. I don't ask to restart this tiring discussion, just make
> > sure users have the proper way to understand why they get an NPE when
> > the forget weight and why they don't get one when the forget simple bounds.
> 
> +1 - can we just make it clear in the javadoc what is optional, what
> is required and add tests to illustrate?  I assume it is intractable
> to just add full signatures for the activations that "work?"

In that context, "optional" actually means that "optimize" can be called a
_second_ time (or more) without specifying arguments that were given in a
previous call. I'll add the clarification in the Javadoc.
[I'm aware that this is not a common behaviour in CM, or in Java for that
matter, or in the usual way to code in a strongly typed language, but it
greatly simplifies the code when API tries to represent (too?) many things
at the same time. This is (more) commonly used in other languages e.g. to
allow default values for many parameters. Also, I don't claim that we should
stick with that, but I was led to this design by the goal to stay relatively
close to the existing classes and features while removing many of the
confusing multiplication of interfaces and classes. More on that below...]

No argument is optional in the sense that if the actual optimizer
implementation will be using the argument, it must of course be provided by
the caller.
For example, if "CMAESOptimizer" is used, then the "SimpleBounds" _must_ be
provided (or an NPE will be raised when "CMAESOptimizer" tries to retrieve
th bound values). If "PowellOptimizer" is used, specifying "SimpleBounds"
will have no effect (because "PowellOptimizer" does not call the accessor to
the bound values).

> > Also weight should really be optional and have a fallback to identity
> > matrix, but this is another story.
> 
> +1 - and while I have not yet really penetrated this code,

Given what I wrote above, this should be -1. I think that it is simply
dangerous to have defaults (in the sense that users are not made aware that
an algorithms _need_ and _use_ user input) on low-level functions. This can
be misleading. Anyways, I hope that this point will not need further
discussion since we all seem to agree that the "weight" feature should be
dropped ASAP.

> I have
> sympathy for Konstantin's views that it might be better to actually
> separate the impls.  I don't want to open a can of worms forcing us
> to add yet more complexity by merging impls; but we implemented a
> special case of least squares in the stat.regression, where we
> separated GLS from OLS, allowing the OLS code to be simpler. 
> Separation turned out to make things easier both for users and us
> there.  The setup is not perfect, but it is documentable and
> maintainable.  Separation also had the benefit that bugs and lack of
> specification clarity in early versions of GLS did not impact OLS.

As described above in the little rationale for why we are where we are now
in the design of the "optim" package, the crux of the problem was stressed
by Dimitri; and if we were to indeed separate "optimization" (minimizing a
cost) from "fitting" (matching a set of model functions), it would unwind a
tangle of partly conflicting functionalities, that led to e.g. the current
need to have generics classes depending on the "optimize" method's return
value. [This type either contains the (scalar) cost, or the (vector) model
function values.]

Hence I propose that we first change what is actually broken (the basics fo
the problems' description) before focusing on supposedly "bad OO
implementation" or implementation details.

Sometimes we have to start from a clean slate, instead of piling up
workarounds.
So, I'm requesting advice on a _set of assumptions_ on which to base a new
design of both the "optim" and "fitting" packages.
[IIUC, everything currently under "optim.nonlinear.vector" should go to
"fitting". Please confirm (in another ML thread).]

> Can we agree at this point that Gilles' commit in r1426758 resolves
> the issue that started this thread (MATH-924)?

I agree. :-)

> I am also +1 on the following:
> 
> 0) continuing in the direction started in r1426758, adding
> specialized matrix impls

+1
That would also enable to make progress in the cleaning up of
"BOBYQAOptimizer" (that uses explicit loops actually representing
operations on e.g. symmetric matrices).

> 1) "undeprecating" the sparse impls

+0
Simply undeprecating is not enough. As we discussed, several times,
the current matrix interface is too complicated/verbose (cf. summary
assembled by Sébastrien on JIRA) so that existing implementations
are sometimes inefficient. Creating new specialized implementations is
more cumbersome than necessary.

> 2) moving to in-place vector/matrix operations within
> implementations as much as possible

-1
As I was first pointed at here, on this mailing list, mutable classes are
not necessarily more efficient; rather they were demonstrated to be less so
in a number of cases. Unless there is at least one realistic benchmark that
indicates a huge efficiency gain, we should not shoot ourselves in the foot
by having to maintain classes that are, by definition, more fragile.
[E.g. instead of shiny words like "GPU", it would be rather more intructive
if we could _see_ code making such calls...]

> Item 2) creates some tension with the objective of having the impls
> reflect the mathematics (Gilles good point about "self-documenting
> code"), but there may be clever ways to maintain readability / good
> OO structure while achieving large efficiency gains using the
> visitor pattern or Konstantin's suggestion of an InPlace interface.
> Lets look at the Mahout classes for some inspiration there.  IIUC
> the proposals, this also creates tension with the objective to favor
> immutability.

Cf. my previous point.
And also: "First make a code work; then make it faster, if really
necessary."

Then, if we have _actual_ use cases, we should lay out what options we have
to achieve the "faster code" goal.
As I pointed out previously, some of the CM functionalities could certainly
take advantage of the concurrency features in Java 6 and 7. Thoses tools are
there, maintained and improved by a community probably several order of
magnitudes larger than CM; it would be insane to embark on reinventing those
wheels because we would want to take advantage of multi-threading while
maintaining Java 5 compatibility.

> Regarding 1), any suggestions on which issues to work on /
> approaches to try first?

What about starting from scratch? Define a bare-bones interface, and develop
specialized implementations in parallel, adding benchmark test cases for all
the "critical" methods. [The situation is not that bad, since
"Array2DRowRealMatrix" (for small to midsize number of entries), and
"BlockRealMatrix (for larger numbers of entries) work pretty decently, I
think. However it would be worth separating "basic" (aka efficiency
critical) operations from higher level ones (to be implemented in other
classes (using inheritance or composition/delegation).]


I look forward to reading from you in other threads, ;-)
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic