List:       xsl-list
Subject:    Re: [xsl] Mixed content, separation
From:       Wendell Piez <wapiez () mulberrytech ! com>
Date:       2012-02-29 20:09:37
Message-ID: 4F4E8601.3040608 () mulberrytech ! com

Hi,

On 2/29/2012 4:58 AM, David Carlisle wrote:
> The right thing is to have a list as part of the paragraph if it is part
> of the sentence. It is HTML's content model for p that is wrong.

HTML seems to think that a "paragraph" (or a "p", if "p" does not stand 
for "paragraph") is some chunk of text distinguished by vertical 
whitespace, as opposed to anything the rhetoricians or composition 
instructors call a paragraph.

Interestingly, Strunk and White (still widely respected authorities on 
composition in American English) include list structures and block 
quotes in the midst of paragraphs, even in their own explanation of what 
a paragraph is and how to use it.

http://www.bartleby.com/141/strunk5.html#9

All the mainstream documentary XML formats, including TEI, DocBook, 
NLM/NISO, and DITA, permit paragraphs to contain elements that will 
format as blocks, including tables, lists, code blocks and so forth. 
HTML does not.

The problem goes away if you regard HTML "p" as something other than a 
paragraph (perhaps a "block fragment"). Generalizing, it is apparent 
that HTML's semantics are presentational, not really descriptive in any 
reliable way of the source data, even when used well, and hence not 
really application-independent.

To bring this back on topic, this is a big reason why transforming out 
of HTML can be such a beast, as opposed to using it as a transformation 
target. In other words -- yes, it's hard to split DocBook "para" (say) 
around lists and tables to get valid HTML. But what's really hard is to 
transform back into DocBook from HTML and expect to get paragraphs back 
and not just paragraph fragments marked as "para".
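
To make the forward direction concrete, here is a minimal sketch of 
splitting a "para" around its block children. It is written in Python 
with the standard library's ElementTree rather than XSLT, purely so it 
can be run as-is. The function name split_para and the set BLOCK_TAGS 
are hypothetical, the choice of which elements count as blocks is an 
assumption, and the block elements are passed through untransformed 
(mapping them to HTML lists and tables is left out).

```python
import copy
import xml.etree.ElementTree as ET

# Assumption: the DocBook children that HTML would not allow inside <p>.
BLOCK_TAGS = {"itemizedlist", "orderedlist", "table", "programlisting"}


def split_para(para):
    """Return a flat list of elements for one <para>: a <p> fragment for each
    run of text and inline children, with block children placed between them."""
    out = []
    current = None  # the <p> fragment currently being filled, if any

    def fragment():
        # Open a new <p> fragment on demand.
        nonlocal current
        if current is None:
            current = ET.Element("p")
            out.append(current)
        return current

    def add_text(text):
        if not text:
            return
        if current is None and not text.strip():
            return  # do not open a fragment for whitespace alone
        p = fragment()
        if len(p):
            p[-1].tail = (p[-1].tail or "") + text
        else:
            p.text = (p.text or "") + text

    add_text(para.text)
    for child in para:
        node = copy.deepcopy(child)
        tail, node.tail = node.tail, None  # trailing text is handled separately
        if child.tag in BLOCK_TAGS:
            current = None    # close the open fragment
            out.append(node)  # the block becomes a sibling of the fragments
        else:
            fragment().append(node)
        add_text(tail)
    return out


src = ("<para>Bring:<itemizedlist><listitem>rope</listitem></itemizedlist>"
       "and <emphasis>also</emphasis> water.</para>")
parts = split_para(ET.fromstring(src))
print([e.tag for e in parts])
```

Nothing in the output records that the two "p" fragments came from the 
same "para", which is the information a transformation back out of HTML 
would need and cannot recover.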

Cheers,
Wendell

======================================================================
Wendell Piez                            mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
   Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
