List:       kfm-devel
Subject:    Re: Konqueror outline view plugin
From:       Peter Kelly <pmk () post ! com>
Date:       2001-11-03 11:39:19

On Fri, 2 Nov 2001, Dirk Mueller wrote:

> On Fre, 02 Nov 2001, Rob Kaper wrote:
> 
> > Could this be based on the domtreeviewer plugin for Konqueror? (which, by
> > the way, should be included into kdebase and made the default view for XML
> > pages)
> 
> from the idea, yes. from the implementation: no. 
> 
> <h1> heading </h1>
> <p>
> lotsa text
> 
> 
> <p> is not nested in <h1>, hence you wouldn't hide with <h1>. DOM structure 
> doesn't help you much. 

I won't have time to code this, but here are some ideas...

What would be needed instead of a DOM-based tree is a "structural" 
tree of the page. This could be built from the DOM tree, but with 
certain elements having special semantics about how they fit into the 
page structure. For example, if you have a page hierarchy such as:

<h1>Section 1</h1>
  <p>text...</p>

  <h2>Section 1.1</h2>
    <p>text...</p>

  <h2>Section 1.2</h2>
    <p>text...</p>

<h1>Section 2</h1>
  <p>text...</p>

  <h2>Section 2.1</h2>
    <p>text...</p>

In this case, you would treat <h1> as starting a level 1 "section", which 
continues until the next <h1> tag, or the end of the page. Likewise, the 
<h2> tag would indicate the start of a level 2 section, which continues 
until the next <h2> or <h1>, or the end of the page. You could build up a 
structure tree in this way, and provide a mechanism similar to the DOM 
tree viewer to display it.
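
A rough sketch of the heading rule above, in Python rather than as a 
Konqueror plugin. It assumes the standard library's html.parser is good 
enough for the page, and the names Section, OutlineBuilder, build_outline 
and dump are made up for illustration:

```python
from html.parser import HTMLParser

# Map "h1".."h6" to their numeric level.
HEADING_LEVELS = {f"h{n}": n for n in range(1, 7)}


class Section:
    def __init__(self, level, title=""):
        self.level = level
        self.title = title
        self.text = []       # text chunks that belong directly to this section
        self.children = []   # nested sub-sections


class OutlineBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Section(0, "(page)")
        self.stack = [self.root]   # chain of currently open sections
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        level = HEADING_LEVELS.get(tag)
        if level is None:
            return
        # A heading closes every open section at the same or a deeper level.
        # The root has level 0, so it is never popped.
        while self.stack[-1].level >= level:
            self.stack.pop()
        section = Section(level)
        self.stack[-1].children.append(section)
        self.stack.append(section)
        self.in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_LEVELS:
            self.in_heading = False

    def handle_data(self, data):
        data = data.strip()
        if not data:
            return
        current = self.stack[-1]
        if self.in_heading:
            current.title += data
        else:
            current.text.append(data)


def build_outline(html):
    builder = OutlineBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def dump(section, depth=0):
    """Return the section titles as indented lines, one per section."""
    lines = []
    for child in section.children:
        lines.append("  " * depth + child.title)
        lines.extend(dump(child, depth + 1))
    return lines
```

Calling dump(build_outline(page)) on the example page gives an indented 
list of section titles, and hiding a section then means hiding its text 
and all of its children.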

Other elements could also have meaning in the structure. In the case of 
tables, each new cell could start a different sub-section. This would work 
well in the case of your typical portal/news site like our favourite 
msn.com, which has links down the left and right columns, with content in 
the center column. This would give us four main sections: top navigation 
bar, left link bar, content and right link bar. Then the content section 
would have other sub-sections based on what headings, etc. it contains.
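
As a sketch of the table idea, again in Python with the standard 
html.parser: this assumes the page layout is one outermost table and 
treats each of its cells as a region, folding any nested tables into the 
cell that contains them. CellRegions and split_regions are hypothetical 
names:

```python
from html.parser import HTMLParser


class CellRegions(HTMLParser):
    """Collect the text of each cell of an outermost table as one region."""

    def __init__(self):
        super().__init__()
        self.table_depth = 0
        self.regions = []     # one list of text chunks per outermost cell
        self.current = None   # region currently being filled, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
        elif tag in ("td", "th") and self.table_depth == 1:
            # A new outermost cell starts a new region, even if the
            # previous cell was never explicitly closed.
            self.current = []
            self.regions.append(self.current)

    def handle_endtag(self, tag):
        if tag == "table":
            # Guard against stray </table> tags in sloppy markup.
            self.table_depth = max(0, self.table_depth - 1)
            if self.table_depth == 0:
                self.current = None
        elif tag in ("td", "th") and self.table_depth == 1:
            self.current = None

    def handle_data(self, data):
        data = data.strip()
        if data and self.current is not None:
            self.current.append(data)


def split_regions(html):
    parser = CellRegions()
    parser.feed(html)
    parser.close()
    return parser.regions
```

Each region could then be run through the heading-based pass to find its 
own sub-sections. Deciding which region is navigation and which is content 
is left open here.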

Analysis of the CSS styles applied to different elements may also come 
into play here. Text which is in a different or larger font, or a 
different colour, could also affect how it fits in with the structure, in 
the case where the page author has used <div style="..."> or <font> to 
display headings instead of the <h*> tags.
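
One very simple version of such a check, as a Python sketch. It only looks 
at inline style attributes and <font size>, not at external stylesheets, 
and the thresholds (a 16px base size, 1.25 times that for a "heading", 
font size 3 as the default) are guesses that would need tuning. The 
function name looks_like_heading is made up:

```python
import re

# Matches e.g. "font-size: 24px" inside an inline style attribute.
FONT_SIZE_PX = re.compile(r"font-size\s*:\s*(\d+(?:\.\d+)?)\s*px", re.I)


def looks_like_heading(tag, attrs, base_px=16.0):
    """Guess whether an element is being used as a heading.

    attrs is a list of (name, value) pairs, as html.parser delivers them.
    base_px is an assumed body text size, not something read from the page.
    """
    attrs = dict(attrs)
    if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
        return True
    if tag == "font":
        size = attrs.get("size") or ""
        if size.startswith("+"):
            # Relative size: anything above +0 is larger than normal.
            return size[1:].isdigit() and int(size[1:]) > 0
        # Absolute size: 3 is the conventional default.
        return size.isdigit() and int(size) > 3
    style = attrs.get("style") or ""
    match = FONT_SIZE_PX.search(style)
    if match and float(match.group(1)) >= 1.25 * base_px:
        return True
    return False
```

A real implementation would want the computed style from the renderer 
rather than the raw attribute, and would probably compare against the 
surrounding text instead of a fixed base size.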

Of course, working out the logic to do this for the large variety of pages 
out there would not be straightforward, but I would start by thinking 
about what types of things would be needed in the structure (e.g. 
headings, types of navigation elements), and then work on and refine 
algorithms for creating that structure based on DOM trees. It would be 
mostly along the lines of a "best effort" approach which works pretty well 
for the majority of typical pages, but is not necessarily perfect at 
inferring the semantic structure of every page (as this is often fairly 
subjective).

One other thing that could be put into the navigation GUI is links. You 
could search through the document to find all of the links, and add them 
as a section or pull-down menu or similar for the user to select from. So 
if they want to navigate around based on what links are on the page, 
without having to read through the page, they can do this from the 
navigation bar. The links could even be presented hierarchically based on 
what sections they appear in.
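
A sketch of collecting links together with the heading path they appear 
under, in Python with the standard html.parser. LinkCollector and 
collect_links are illustrative names, and sections here are defined by 
<h1>..<h6> only:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.path = []              # (level, title) for each open section
        self.links = []             # (section titles, href, link text)
        self.heading_level = None   # set while inside an <hN> element
        self.heading_text = []
        self.href = None            # set while inside an <a href=...>
        self.link_text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_level = int(tag[1])
            self.heading_text = []
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.href = href
                self.link_text = []

    def handle_endtag(self, tag):
        if self.heading_level is not None and tag == f"h{self.heading_level}":
            # The new heading closes sections at the same or a deeper level.
            while self.path and self.path[-1][0] >= self.heading_level:
                self.path.pop()
            title = "".join(self.heading_text).strip()
            self.path.append((self.heading_level, title))
            self.heading_level = None
        elif tag == "a" and self.href is not None:
            titles = tuple(title for _, title in self.path)
            text = "".join(self.link_text).strip()
            self.links.append((titles, self.href, text))
            self.href = None

    def handle_data(self, data):
        if self.heading_level is not None:
            self.heading_text.append(data)
        if self.href is not None:
            self.link_text.append(data)


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Each entry carries the tuple of section titles above the link, so a menu 
can group links by section, or flatten them into a single list by ignoring 
the tuple.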

> 
> 
> Dirk
> 

-- 
Peter Kelly
pmk@post.com
