List: kfm-devel
Subject: Re: Konqueror outline view plugin
From: Peter Kelly <pmk () post ! com>
Date: 2001-11-03 11:39:19
On Fri, 2 Nov 2001, Dirk Mueller wrote:
> On Fre, 02 Nov 2001, Rob Kaper wrote:
>
> > Could this be based on the domtreeviewer plugin for Konqueror? (which, by
> > the way, should be included into kdebase and made the default view for XML
> > pages)
>
> from the idea, yes. from the implementation: no.
>
> <h1> heading </h1>
> <p>
> lotsa text
>
>
> <p> is not nested in <h1>, hence you wouldn't hide with <h1>. DOM structure
> doesn't help you much.
I won't have time to code this, but here are some ideas.
What would be needed instead of a DOM-based tree would be a "structural"
tree of the page. This could be built from the DOM tree, but with
certain elements having special semantics about how they fit into the
structure of the page. For example, if you have a page such as:
<h1>Section 1</h1>
<p>text...</p>
<h2>Section 1.1</h2>
<p>text...</p>
<h2>Section 1.2</h2>
<p>text...</p>
<h1>Section 2</h1>
<p>text...</p>
<h2>Section 2.1</h2>
<p>text...</p>
In this case, you would treat <h1> as starting a level 1 "section", which
continues until the next <h1> tag or the end of the page. Likewise, the
<h2> tag would indicate the start of a level 2 section, which continues
until the next <h2> or <h1>, or the end of the page. You could build up a
structure tree in this way, and provide a mechanism similar to the DOM
tree viewer to view it.
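A rough sketch of the heading rule above, in Python rather than as a
Konqueror plugin. It assumes only the standard html.parser module; the
names Section, OutlineBuilder and build_outline are made up for
illustration, and it only looks at <h1>..<h6>, nothing else.

```python
from html.parser import HTMLParser


class Section:
    """One node of the structural tree (hypothetical helper type)."""

    def __init__(self, level, title=""):
        self.level = level      # 0 for the page itself, 1 for <h1>, ...
        self.title = title
        self.children = []


class OutlineBuilder(HTMLParser):
    HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}

    def __init__(self):
        super().__init__()
        self.root = Section(0, "(page)")
        self.stack = [self.root]    # currently open sections, outermost first
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        level = self.HEADINGS.get(tag)
        if level is None:
            return
        # A heading ends every open section at the same or a deeper level.
        while self.stack[-1].level >= level:
            self.stack.pop()
        section = Section(level)
        self.stack[-1].children.append(section)
        self.stack.append(section)
        self.in_heading = True

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self.in_heading:
            self.stack[-1].title = self.stack[-1].title.strip()
            self.in_heading = False

    def handle_data(self, data):
        # Text inside the heading tag becomes the section title.
        if self.in_heading:
            self.stack[-1].title += data


def build_outline(html):
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

Feeding the example page through build_outline() should give a root with
two level 1 children, each holding its own level 2 sections. Body text is
ignored here; a real viewer would also need to attach it to the section
it falls in.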
Other elements could also have meaning in the structure. In the case of
tables, each new cell could start a different sub-section. This would work
well in the case of your typical portal/news site like our favourite
msn.com, which has links down the left and right columns, with content in
the center column. This would give us four main sections: top navigation
bar, left link bar, content and right link bar. Then the content section
would have other sub-sections based on what headings, etc. it contains.
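As a very simplified illustration of the table idea, the sketch below
treats each cell of an outermost table as one region and collects its
text. It is only an assumption about how this might be started: the
names CellRegions and cell_regions are hypothetical, nested tables are
folded into the cell that contains them, and real layout tables would
need a lot more care.

```python
from html.parser import HTMLParser


class CellRegions(HTMLParser):
    def __init__(self):
        super().__init__()
        self.regions = []       # text of each outermost-table cell
        self.table_depth = 0    # how many <table> tags are currently open
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag in ("td", "th") and self.table_depth == 1:
            # A cell of the outermost table starts a new region.
            self.regions.append("")
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "table":
            self.table_depth = max(0, self.table_depth - 1)
        elif tag in ("td", "th") and self.table_depth == 1:
            self.in_cell = False

    def handle_data(self, data):
        # Text in nested tables is kept in the enclosing outer cell.
        if self.in_cell:
            self.regions[-1] += data


def cell_regions(html):
    parser = CellRegions()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so each region is a single line.
    return [" ".join(text.split()) for text in parser.regions]
```

Each region could then be passed to the heading-based pass to find the
sub-sections inside it.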
Analysis of the CSS styles applied to different elements may also come
into play here. Text which is in a different/larger font, or a different
colour could also affect how it fits in with the structure, in the case
where the page author has used <div style="..."> or <font> to display
headings instead of the <h*> tags.
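One possible starting point for the style heuristic, again only a
sketch. It looks at inline style="..." and <font size="..."> only, not
at stylesheets, and the base size of 16px, the 1.25 factor and the
function names are all arbitrary assumptions that would need tuning
against real pages.

```python
import re


def parse_inline_style(style):
    """Split 'a: b; c: d' into a dict of lower-cased property/value pairs."""
    props = {}
    for decl in style.split(";"):
        name, sep, value = decl.partition(":")
        if sep:
            props[name.strip().lower()] = value.strip().lower()
    return props


_SIZE = re.compile(r"^(\d+(?:\.\d+)?)(px|pt)$")


def looks_like_heading(tag, attrs, base_px=16.0):
    """Guess whether an element is being used as a heading.

    attrs is a plain dict of the element's attributes.
    """
    if tag == "font":
        size = attrs.get("size", "")
        if size.startswith("+"):
            return True                       # relative increase, e.g. "+2"
        return size.isdigit() and int(size) >= 4   # default font size is 3
    props = parse_inline_style(attrs.get("style", ""))
    match = _SIZE.match(props.get("font-size", ""))
    if not match:
        return False                          # no usable size: do not guess
    value = float(match.group(1))
    # CSS defines 1pt as 4/3 px.
    px = value * 4 / 3 if match.group(2) == "pt" else value
    return px >= 1.25 * base_px
```

Elements that pass this test could be fed into the structure tree as if
they were headings, with the level chosen from the relative font size.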
Of course, working out the logic to do this for the large variety of pages
out there would not be straightforward, but I would start thinking about
what types of things would be needed in the structure (i.e. headings,
types of navigation elements), and then work on and refine algorithms for
creating that structure based on DOM trees. It would be mostly along the
lines of a "best effort" approach which works pretty well for the
majority of typical pages, but is not necessarily perfect at inferring the
semantic structure of every page (as this is often fairly subjective).
One other thing that could be put into the navigation GUI is links: you
could search through the document to find all of the links, and add them
as a section or pull-down menu or similar for the user to select from. So
if they want to navigate around based on what links are on the page,
without having to read through the page, they can do this from the
navigation bar. The links could even be presented hierarchically based on
what sections they appear in.
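Collecting the links could look something like the sketch below. It
groups each link under the nearest preceding heading, which is flatter
than the full hierarchy described above; LinkCollector, collect_links
and the "(top)" label for links before the first heading are made-up
names for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")

    def __init__(self):
        super().__init__()
        self.links = {}             # section title -> [(href, link text)]
        self.section = "(top)"      # links before any heading go here
        self.heading_text = None    # text of the heading being read, if any
        self.link = None            # [href, text] of the open <a>, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.heading_text = ""
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.link = [href, ""]

    def handle_data(self, data):
        if self.heading_text is not None:
            self.heading_text += data
        if self.link is not None:
            self.link[1] += data

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self.heading_text is not None:
            # The finished heading names the section for later links.
            self.section = self.heading_text.strip()
            self.heading_text = None
        elif tag == "a" and self.link is not None:
            entry = (self.link[0], self.link[1].strip())
            self.links.setdefault(self.section, []).append(entry)
            self.link = None


def collect_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

The resulting dict maps section titles to their links, which is enough to
fill a pull-down menu per section.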
>
>
> Dirk
>
--
Peter Kelly
pmk@post.com