Re: [RT] Fallback-enabled data directory per publication?

List:       lenya-dev
Subject:    Re: [RT] Fallback-enabled data directory per publication?
From:       solprovider@apache.org
Date:       2008-06-30 16:51:54
Message-ID: dfe834320806300951q559ff948p2d59293ead6069bb@mail.gmail.com

On 6/30/08, Andreas Hartmann <andreas@apache.org> wrote:
> solprovider@apache.org schrieb:
> > On 6/27/08, Andreas Hartmann <andreas@apache.org> wrote:
> > > solprovider@apache.org schrieb:
> > allowing the repository to exist anywhere means allowing the
> > repository to exist anywhere.  The configuration would be one line:
> >   <content type="flat" location="somewhere"/>
>  How about <data location="somewhere"/> for each publication, maybe with
> fallback to a global setting? That would at least reduce the number of
> configuration options.

I like the idea of moving Publication configuration to a file in the
pubs directory.  The configuration system may (someday) check:
   pubs/{mypub}.xml
   pubs/{mypub}/config/publication.xconf
   pubs/{mypub}/publication.xconf
And default to Lenya 1.2's pubs/{mypub}/content if nothing is specified.
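
A minimal sketch of that lookup, in Python for brevity rather than
Lenya's Java.  The function names are hypothetical, and treating the
list as a priority order (first match wins) is an assumption; the
proposal only says these locations may be checked.

```python
from pathlib import Path

# Candidate locations, relative to the Lenya root, in the order listed above.
# Treating this as a priority order is an assumption of this sketch.
CANDIDATES = (
    "pubs/{pub}.xml",
    "pubs/{pub}/config/publication.xconf",
    "pubs/{pub}/publication.xconf",
)


def find_publication_config(root, pub):
    """Return the first candidate config file that exists, or None."""
    for template in CANDIDATES:
        candidate = Path(root) / template.format(pub=pub)
        if candidate.is_file():
            return candidate
    return None


def default_content_dir(root, pub):
    """Lenya 1.2 style default, used when nothing is specified."""
    return Path(root) / "pubs" / pub / "content"
```

A caller would try find_publication_config() first and fall back to
default_content_dir() when it returns None.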

> > My definition of a Publication is the intersection of content,
> > security, and code.
> > Other combinations should allow for any possible business
> > requirements.  Does a reason exist for Publications to NOT become more
> > flexible?
>  If we can achieve this flexibility without making the configuration more
> difficult, I'm much in favor of it. Actually I think that matters will
> become more obvious with a clearer separation of aspects (content, layout,
> access control).
>  Regarding layout, I think we do quite well. IMO the current concept allows
> for quite a lot of flexibility:
>  - layout can be put in modules
>  - shareable across multiple publications
>  - customizable using publication templating
Same in Lenya 1.3.

>  Regarding access control, the situation is not so clear. Policies should
> IMO be tied to the content. Principals (users, groups) can be shared across
> publications, but the GUI is still publication-specific, which is IMO a
> legacy that should be overcome.

The new security system (to be programmed any day now) will assume
that any Resource could be accessed through multiple Publications.
Every resource will specify none/read/write by any Publication-based
User/Group/Role and be able to inherit from another Resource using a
Publication-based Structure.  Does a possible security model exist
that cannot be configured using this system?
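
Since that system is not written yet, the following is only a sketch
of the idea, in Python for brevity.  The class, the method names, and
the resolution rule (an explicit entry on a Resource wins over
anything inherited, and the strongest explicit entry wins among a
user's principals) are all assumptions, not the actual design.

```python
LEVELS = {"none": 0, "read": 1, "write": 2}


class Resource:
    """Hypothetical Resource carrying per-Publication access entries."""

    def __init__(self, unid):
        self.unid = unid
        # (publication, principal) -> "none" | "read" | "write"
        self.acl = {}
        # publication -> parent Resource in that Publication's Structure
        self.parents = {}

    def grant(self, publication, principal, level):
        if level not in LEVELS:
            raise ValueError("unknown access level: " + level)
        self.acl[(publication, principal)] = level

    def access(self, publication, principals):
        """Resolve access for a set of principals (user, groups, roles)."""
        explicit = [
            self.acl[(publication, p)]
            for p in principals
            if (publication, p) in self.acl
        ]
        if explicit:
            # Assumption: the strongest explicit entry applies.
            return max(explicit, key=LEVELS.get)
        parent = self.parents.get(publication)
        if parent is not None:
            # Inherit through the Publication-based Structure (assumed acyclic).
            return parent.access(publication, principals)
        return "none"
```

Because both the entries and the inheritance chain are keyed by
Publication, the same Resource can be writable through one Publication
and invisible through another.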

>  Regarding content, we still have a clear separation between publications.
> IMO a global content store would be an interesting option. The scenario you
> described above could be achieved by exporting and importing the content
> used by a single publication. But I see some organizational issues (content
> ownership etc.) if a content item isn't assigned to a particular
> publication, which would have to be discussed.

Lenya 1.3 uses Indexes.  Indexes are configurable to select based on
Structures and the Resources.  Implementing an Index that selects
based on whether a Resource has a property MYPUB="1" should be simple
(although I have not tried this).  The only issue is that the "ALL" Index
(used by the "edit" Module) includes every Resource.  Changing that is
simple if we can standardize how to distinguish which Resources belong
to a Publication.  We also need administration screens for adding
existing Resources to a Publication and displaying orphaned Resources.
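
A sketch of such a property-based Index, in Python for brevity.  This
is not Lenya 1.3's Index API; Resources are modelled as a plain
mapping from UNID to properties, and the function names are
hypothetical.

```python
def make_property_index(name, value):
    """Build an index that selects Resources whose property equals value."""
    def select(resources):
        return [unid for unid, props in resources.items()
                if props.get(name) == value]
    return select


def all_index(resources):
    """The "ALL" behaviour: every Resource, whatever Publication it is in."""
    return list(resources)


def orphans(resources, publication_properties):
    """Resources not marked as belonging to any known Publication."""
    return [unid for unid, props in resources.items()
            if not any(props.get(p) == "1" for p in publication_properties)]
```

Here make_property_index("MYPUB", "1") gives the per-Publication
selection, and orphans() is the list an administration screen for
orphaned Resources would display.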

> > > > - have multiple repository types: Lenya XML, JCR, traditional
> > > > relational database, etc..
> > > >
> > >  We once started with this approach. It would almost have destroyed
> > > the project if we hadn't pulled the handbrake in time. The
> > > maintenance costs for one repository implementation are high; for
> > > multiple repository implementations they are not bearable by a
> > > small community like ours.
> > I watched this process.  I think the design methodology caused more
> > issues than maintaining multiple repositories.  Lenya 1.3 abstracted
> > the Content API and added the content: protocol BEFORE adding a new
> > repository type; see below.
>  We also introduced a repository protocol (lenya://) before adding the
> repository type. A major problem was that the publication and repository
> APIs weren't stable yet, which caused frequent changes to the repository
> implementations. On the other hand, it was still possible to influence the
> API based on the capabilities of the repository. But in the end it was just
> too much work.

I defined the protocols, then implemented the APIs to support the
protocols.  Technically, the Content API is not finished because it
still supports Lenya 1.2's hierarchical content for
backwards-compatibility.  I will be very happy when I can develop
without testing that Lenya 1.2 Publications still work.  The original
protocols are defined in 13HELP.txt; content: and module: handle
everything a normal developer needs.  (The new protocols design: and
structure: for the new "Design Resources" will be in the next commit.)

> > I feel keeping an XML file-based repository is important for two reasons:
> >
> > 1. Troubleshooting - Is a bug in Lenya or in the repository?   Having
> > a Lenya XML file repository eliminates or confirms most of Lenya's
> > code as the source of the bug; does the bug affect the file-based
> > repository?  If yes, the bug is in Lenya's core.  If no, then the bug
> > is either in the JCR implementation or Lenya's JCR Content package.
> > Without control of one repository, this is not possible.
>  IMO it will be possible to locate the causes of bugs when only a single
> repository is used. And I think that Lenya's JCR Content package should be
> rather minimal, i.e. we should access the repository directly or use an
> out-of-the-box object JCR mapping framework.

Many functions previously dispersed throughout many packages are now
implemented in the Content.Flat package.  Some (most?) of this will
move to the Content package once Content.Hierarchical disappears.
Whatever remains will be repository-specific and should live in the
repository-specific packages BECAUSE we believe some repositories
have features that improve on the base code.

> > 2. Low entry barrier - Anybody with a text editor can easily learn the
> > internal representation of Lenya's content.
>  IMO this shouldn't be necessary. No user should have to bother with the
> internal representation of content. If this is not the case, we have to
> improve the documentation of our API or make it more self-descriptive.

My perspective is that potential users will want the platform they can
understand.  They should not need to look at the internal
representation, but they should be able to look at it.  Think about
the difference between MSWord and OpenOffice.  You cannot easily
extract the text from MSWord's DOC format.  Unzip an OpenOffice
Writer document and you get XML.  Being able to understand the data is
important to many people.

> > New potential techs have
> > few systems easy enough to learn and usually quickly run into a wall.
> > I learned programming games in BASIC poking bits into video memory;
> > few systems today allow the same insights into computer internals.
> > Lenya is completely open and could be a very good tool for new
> > programmers (except Java's blackboxed memory management -- great for
> > production systems, still a wall for learning computer internals.)
> > How many posts to the Lenya User ML are from people not understanding
> > the Web, HTML, and CSS?  We WANT these people as
> > customers/users/potential developers; they greatly outnumber Java
> > programmers.
>  I'm not sure if I understand this correctly - do you mean that Lenya could
> be used as a tool to learn about the nature and implementation of, e.g.,
> content management systems? I have never thought about this aspect. If we
> consider this an architectural constraint, it will certainly influence many
> aspects of the product. What do the others think about this?

ASF is known for semi-understandable reference implementations of
software.  Lenya could be the reference CMS used in college classes
about CMSes.  Alfresco has better marketing, but Lenya could be
simpler-to-understand, easier-to-customize, and might even win on
features.

> > Lenya 1.3 also allows identifiers to be specified for documents;
> > "Named Resources" required about two lines of code.  UUIDs are the
> > default, but humans prefer words.  Discouraging named resources seems
> > petty and abusive.  The identifiers must still be unique; Lenya 1.3
> > errors if the requested UNID already exists.
>  A major problem with named entities is that they are not guaranteed to be
> globally unique, which prevents merging of arbitrary repositories. I don't
> object to giving entities a name, but then it should also be possible to
> change this name, and this would again require link rewriting if the name is
> used to reference the entity.

Most of these concerns are covered in the TODO, including changing
IDs.  IDs are not UNIDs.  IDs are what shows in the URL.  Three
documents could have the ID "help", allowing URLs like:
   help
   help/help
   help/help/help
to each open a different document.
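
To illustrate the ID/UNID distinction, here is a sketch in Python.
The nested dictionary stands in for a Structure document, and the
UNID values and the resolve() function are made up for the example.
It accepts either separator between path segments.

```python
import re

# Hypothetical Structure: three documents share the ID "help" but each
# has its own UNID.
structure = {
    "help": {"unid": "0001", "children": {
        "help": {"unid": "0002", "children": {
            "help": {"unid": "0003", "children": {}},
        }},
    }},
}


def resolve(structure, path):
    """Walk the Structure by ID, one path segment at a time."""
    children = structure
    node = None
    for part in (p for p in re.split(r"[\\/]", path) if p):
        node = children.get(part)
        if node is None:
            return None  # no document at this position
        children = node["children"]
    return node["unid"] if node else None
```

The ID only has to be unique among siblings; the position in the
Structure decides which UNID a URL reaches.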

UNIDs must be unique.  UNIDs were not planned to be changeable, but I
did not plan for merging repositories.  Using the flat repository,
merging two repositories could be handled by prefixing a different
character to every UNID in each repository and updating the "unid"
property of every element of the Structure documents.  Everything else
should update itself.  Should be easy, even if the UNIDs are UUIDs.
(Of course the resulting UNIDs would no longer match the UUID format.)
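
The merge could be sketched like this, in Python for brevity.  Each
repository is modelled as a pair (resources keyed by UNID, list of
Structure documents as XML text); that layout, the element names in
the usage example, and the function names are assumptions, with only
the "unid" property taken from the description above.

```python
import xml.etree.ElementTree as ET


def prefix_repository(resources, structures, prefix):
    """Prefix every UNID and every "unid" attribute in the Structures."""
    renamed = {prefix + unid: data for unid, data in resources.items()}
    updated = []
    for xml_text in structures:
        root = ET.fromstring(xml_text)
        for element in root.iter():
            if "unid" in element.attrib:
                element.set("unid", prefix + element.get("unid"))
        updated.append(ET.tostring(root, encoding="unicode"))
    return renamed, updated


def merge(repo_a, repo_b):
    """Merge two repositories after giving each a different prefix."""
    res_a, struct_a = prefix_repository(*repo_a, prefix="a")
    res_b, struct_b = prefix_repository(*repo_b, prefix="b")
    merged = {**res_a, **res_b}
    if len(merged) != len(res_a) + len(res_b):
        raise ValueError("UNID collision after prefixing")
    return merged, struct_a + struct_b
```

Two one-character prefixes that differ cannot produce the same UNID,
so the collision check should never fire; it is there as a guard.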

"Named Resources" name the UNID.  They must be unique in the Flat
repository because they are subdirectory names in a single directory;
Lenya 1.3 errors "Resource already exists" when attempting to
duplicate a UNID.  While Named Resources are allowed for content,
their primary purpose is for Design Resources (which should be
excluded from your content merge process.)
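
As a sketch of why duplicates cannot slip through when each UNID is a
subdirectory name (Python for brevity; create_resource() is a
hypothetical name, not the Lenya 1.3 code, and only the error text is
taken from above):

```python
import uuid
from pathlib import Path


def create_resource(repository_dir, unid=None):
    """Create a Resource directory; a UUID is used when no name is given."""
    unid = unid or str(uuid.uuid4())
    try:
        # One subdirectory per UNID inside a single repository directory.
        (Path(repository_dir) / unid).mkdir()
    except FileExistsError:
        raise ValueError("Resource already exists") from None
    return unid
```

The filesystem enforces the uniqueness, which fits with naming taking
only a couple of lines.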

> > > > Adding a global data directory (internal or external to the Lenya
> > > > build directory) cannot lead to a solution meeting these goals.  The
> > > > fallback protocols used by Lenya 1.2 and 2.x also cause problems with
> > > > distributed content and design
> > >  Would you mind elaborating? I didn't notice any problems by now.
> > Based on reading the MLs, you (Andreas) are constantly figuring out
> > how to make the next function work with the current fallback system;
> > see the second sentence of the original post in this thread.  Not
> > using file-based fallback has simplified many improvements.
> > Developers do not care whether a file is overridden in the repository,
> > in a Publication-specific Module, an inherited Module from another
> > Publication, or a Global Module, they just call the module: protocol
> > and let Lenya 1.3 decide what file to use.
>  Unfortunately I can't comment on this since I'm not familiar with the
> internals of 1.3. I'm looking forward to an upcoming version - with the last
> version I tried, I didn't really know where to start.

Soon.  I have some minor tasks to finish before I return to Lenya
development -- fix one bug and the next commit will be much better
than the March version.  (The interim commits made a mess and will be
reverted.)

> > Would you please post a few examples where adding compiled Java via
> > Modules is useful in 2.x?  I want to better understand the benefits.
>  IMO there are some general benefits:
>  - maybe most important: separation of API and implementation
>  - very good IDE support
>  - unit testing is very simple and offers good IDE integration
>  - make use of OO concepts like inheritance and encapsulation
>  Some examples are:
>  - sitemap components (MetaDataTransformer in the metadata module)
>  - services (Notifier in the notification module)
>  - reusable convenience classes (ResourceWrapper in the resource module)
>  - implementations of core services (sitetree module)
>  - adapter code for external services (SVNKit in the forrest module)
>  -- Andreas

We will need a long thread to discuss integration.  Some of the
examples sound useful.  I am interested in how Lenya 2.0 handles
naming, versioning, and duplicate libraries e.g. having different or
same versions of SVNKit included in several Modules.  That discussion
can wait.  This thread is about fallback with content.  Please see
whether Lenya 2.0 might benefit from Lenya 1.3's content: protocol.

solprovider
