List: kde-commits
Subject: [khtml] /: Remove unnecessary Mainpage.dox
From: Alex Merry <kde () randomguy3 ! me ! uk>
Date: 2014-01-23 15:49:36
Message-ID: E1W6MXA-0000Mk-LH () scm ! kde ! org
Git commit d222bf3808490c6a34f8ecc17d43bb4133697e38 by Alex Merry.
Committed on 20/01/2014 at 22:16.
Pushed by alexmerry into branch 'master'.
Remove unnecessary Mainpage.dox
D +0 -360 Mainpage.dox
http://commits.kde.org/khtml/d222bf3808490c6a34f8ecc17d43bb4133697e38
diff --git a/Mainpage.dox b/Mainpage.dox
deleted file mode 100644
index 4af5b0b..0000000
--- a/Mainpage.dox
+++ /dev/null
@@ -1,360 +0,0 @@
-/** @mainpage KDE HTML Parser and Widget
-
-If you want a fully-fledged HTML browser widget in your application,
-you can use KHTMLPart to do so.
-
-@code
-QUrl url("http://www.kde.org");
-KHTMLPart *w = new KHTMLPart();
-w->openUrl(url);
-w->view()->resize(500, 400);
-w->show();
-@endcode
-
-For more information, see the documentation for KHTMLPart.
-
-Note that using KHTMLPart may introduce security vulnerabilities and
-unnecessary bloat to your application. Qt's text widgets are rich-text
-capable, and will interpret a limited subset of HTML.
-
-Another option is to use KDEWebKit; WebKit is a fork of KHTML with substantial
-industry support.
-
-For details on the internals of KHTML see the @ref design "Design document".
-
-@authors
-Torben Weis \<weis@stud.uni-frankfurt.de\><br>
-Josip A. Gracin \<grac@fly.cc.fer.hr\><br>
-Martin Jones \<mjones@kde.org\><br>
-Waldo Bastian \<bastian@kde.org\><br>
-Lars Knoll \<knoll@kde.org\><br>
-Antti Koivisto \<koivisto@iki.fi\><br>
-Dirk Mueller \<mueller@kde.org\><br>
-Peter Kelly \<pmk@post.com\><br>
-George Staikos \<staikos@kde.org\><br>
-Allan Sandfeld Jensen \<kde@carewolf.com\><br>
-Germain Garand \<germain@ebooksfrance.org\><br>
-Maksim Orlovich \<maksim@kde.org\><br>
-KHTML has also heavily benefited from the work of Apple Computer, Inc.
-@maintainers
-Allan Sandfeld Jensen <br>
-Germain Garand <br>
-Maksim Orlovich <br>
-Martin Sandsmark <br>
-
-@licenses
-@lgpl
-
-*/
-
-/** @page design Internal design of KHTML
-
-This document tries to give a short overview about the internal design of the
-khtml library. I've written this, because the lib has gotten quite big, and it
-is hard at first to find your way in the source code. This doesn't mean that
-you'll understand khtml after reading this document, but it'll hopefully make
-it easier for you to read the source code.
-
-@section overview High-level overview
-
-The library is built up out of several different parts. Basically, when you use
-the lib, you create an instance of a KHTMLPart, and feed data to it. That's
-more or less all you need to know if you want to use khtml for another
-application. If you want to start hacking khtml, here's a sketch of the objects
-that will get constructed when, e.g., running testkhtml with a url argument.
-
-
-@section thesaurus Concepts
-In the following I'll assume that you're familiar with all the buzzwords used
-in current web technology. In case you aren't, here's a more or less complete
-list of references:
-
-- <b>Document Object model (DOM):</b> <a href="http://www.w3.org/DOM/">DOM Level1 and 2</a>: We support DOM Level2 except for the events model at the moment.
-- <b>HTML:</b><a href="http://www.w3.org/TR/html4/">HTML4 specs</a> and <a href="http://www.w3.org/TR/xhtml1/">xhtml specs</a>: We support almost all of HTML4 and xhtml.
-- <b>Cascading style sheets (CSS):</b><a href="http://www.w3.org/TR/REC-CSS2/">CSS2 specs</a>: We support almost all of CSS1, and most parts of CSS2.
-- <b>Javascript:</b><a href="http://msdn.microsoft.com/workshop/author/dhtml/reference/objects.asp">Microsoft javascript bindings</a><a href="http://docs.sun.com/source/816-6408-10/index.html">Netscape javascript reference</a><a href="http://mozilla.org/docs/dom/domref/">Mozilla JS/DOM reference</a>
-
-@section example Example
-KHTMLPart creates one instance of a KHTMLView (derived from QScrollView), the
-widget showing the whole thing. At the same time a DOM tree is built up from
-the HTML or XML found in the specified file.
-
-Let me describe this with an example.
-
-KHTML makes use of the document object model (DOM) for storing the document in
-a tree like structure. Imagine some html like
-@code
-<html>
- <head>
- <style>
- h1 { color: red; }
- </style>
- </head>
- <body>
- <H1>
- some red text
- </h1>
- more text
- <p>
- a paragraph with an
- <img src="foo.png">
- embedded image.
- </p>
- </body>
-</html>
-@endcode
-
-In the following I'll show how this input will be processed step by step to
-generate the visible output you will finally see on your screen. I'm describing
-the things as if they happen one after the other, to make the principle more
-clear. In reality, to get visible output on the screen as soon as possible, all
-these things (from tokenization to the build up and layouting of the rendering
-tree) happen more or less in parallel.
-
-@subsection tokenizerandparser Tokenizer and parser
-
-The first thing that happens when you start parsing a new document is that a
-DocumentImpl* (for XML documents) or an HTMLDocumentImpl* object will get
-created by the Part (in khtml_part.cpp::begin()). A Tokenizer* object is
-created as soon as DocumentImpl::open() is called by the part, also in begin()
-(can be either an XMLTokenizer or an HTMLTokenizer).
-
-The XMLTokenizer uses the QXML classes in Qt to parse the document, and its
-SAX interface to parse the stuff into khtml's DOM.
-
-For HTML, the tokenizer is located in khtmltokenizer.cpp. The tokenizer uses
-the contents of an HTML-file as input and breaks this content up into a linked
-list of tokens. The tokenizer recognizes HTML-entities and HTML-tags. Text
-between begin- and end-tags is handled distinctly for several tags. The
-distinctions are in the way how spaces, linefeeds, HTML-entities and other tags
-are handled.
-
-The tokenizer is completely state-driven on a character by character basis.
-All text passed over to the tokenizer is directly tokenized. A complete
-HTML-file can be passed to the tokenizer as a whole, character by character
-(not very efficient) or in blocks of any (variable) size.
-
-The HTMLTokenizer creates an HTMLParser which interprets the stream of tokens
-provided by the tokenizer and constructs the tree of Nodes representing the
-document according to the Document Object Model.
-
-@subsection dom The DOM in khtml
-
-Parsing the document given above gives the following DOM tree:
-
-@code
-HTMLDocumentElement
- |--> HTMLHeadElement
- | \--> HTMLStyleElement
- | \--> CSSStyleSheet
- \--> HTMLBodyElement
- |--> HTMLHeadingElement
- | \--> Text
- |--> Text
- \--> HTMLParagraphElement
- |--> Text
- |--> HTMLImageElement
- \--> Text
-@endcode
-
-Actually, the classes mentioned above are the interfaces for accessing the DOM.
-The actual data is stored in *Impl classes, providing the implementation for
-all of the above mentioned elements. So internally we have a tree looking like:
-
-@code
-HTMLDocumentElementImpl*
- |--> HTMLHeadElementImpl*
- | \--> HTMLStyleElementImpl*
- | \--> CSSStyleSheetImpl*
- \--> HTMLBodyElementImpl*
- |--> HTMLHeadingElementImpl*
- | \--> TextImpl*
- |--> TextImpl*
- \--> HTMLParagraphElementImpl*
- |--> TextImpl*
- |--> HTMLImageElementImpl*
- \--> TextImpl*
-@endcode
-
-We use a refcounting scheme to ensure that all the objects get deleted in case
-the root element gets deleted (as long as there's no interface class holding a
-pointer to the Implementation).
-
-The interface classes (the ones without the Impl) are defined in the dom/
-subdirectory, and are not used by khtml itself at all. The only place they are
-used is in the javascript bindings, which use them to access the DOM tree.
-The big advantage of having this separation between interface classes and
-implementation classes is that we can have several interface objects pointing
-to the same implementation. This implements the requirement of explicit sharing
-of the DOM specs.
-
-Another advantage is that (as the implementation classes are not exported) it
-gives us a lot more freedom to make changes in the implementation without
-breaking binary compatibility.
-
-You will find almost a one to one correspondence between the interface classes
-and the implementation classes. In the implementation classes we have added a
-few more intermediate classes, that can not be seen from the outside for
-various reasons (make implementation of shared features easier or to reduce
-memory consumption).
-
-In C++, you can access the whole DOM tree from outside KHTML by using the
-interface classes. For a description see the <a
-href="http://developer.kde.org/documentation/library/kdeqt/kde3arch/khtml/index.html">introduction
-to khtml</a> on <a href="http://developer.kde.org/">developer.kde.org</a>.
-
-One thing that has been omitted in the discussion above is the style sheet
-defined inside the <code><style></code> element (as an example of a style
-sheet) and the image element (as an example of an external resource that needs
-to be loaded). This will be done in the following two sections.
-
-@subsection css CSS
-
-The contents of the <code><style></code> element (in this case the
-<code>h1 { color: red; }</code> rule) will get passed to the
-HTMLStyleElementImpl object. This object creates a CSSStyleSheetImpl object
-and passes the data to it. The CSS parser will take the data, and parse it into
-a DOM structure for CSS (similar to the one for HTML, see also the DOM level 2
-specs). This will be later on used to define the look of the HTML elements in
-the DOM tree.
-
-Actually "later on" is relative, as we will see later, that this happens partly in parallel to
-the build up of the DOM tree.
-
-@subsection external Loading external objects
-
-Some HTML elements (as <code><img>, <link>, <object>,
-etc.</code>) contain references to external objects, that have to be loaded.
-This is done by the Loader and related classes (misc/loader.*). Objects that
-might need to load external objects inherit from CachedObjectClient, and can
-ask the loader (that also acts as a memory cache) to download the object they
-need for them from the web.
-
-Once the loader has the requested object ready, it will notify the
-CachedObjectClient of this, and the client can then process the received data.
-
-@subsection showing Making it visible
-
-Now once we have the DOM tree, and the associated style sheets and external
-objects, how do we get the stuff actually displayed on the screen?
-
-For this we have a rendering engine, that is completely based on CSS. The first
-thing that is done is to collect all style sheets that apply to the document
-and create a nice list of style rules that need to be applied to the elements.
-This is done in the CSSStyleSelector class. It takes the default HTML style
-sheet (defined in css/html4.css), an optional user defined style sheet, and all
-style sheets from the document, and combines them to a nice list of parsed
-style rules (optimised for fast lookup). The exact rules of how these style
-sheets should get applied to HTML or XML documents can be found in the CSS2
-specs.
-
-Once we have this list, we can get a RenderStyle object for every DOM element
-from the CSSStyleSelector by calling "styleForElement(DOM::ElementImpl *)".
-The style object describes in a compact form all the CSS properties that should
-get applied to the Node.
-
-After that, a rendering tree gets built up. Using the style object, the DOM
-Node creates an appropriate render object (all these are defined in the
-rendering subdirectory) and adds it to the rendering tree. This will give
-another tree like structure, that resembles in its general structure the DOM
-tree, but might have some significant differences too. First of all, so called
-<a
-href="http://www.w3.org/TR/REC-CSS2/visuren.html#anonymous-block-level">anonymous
-boxes</a> - (see <a href="http://www.w3.org/TR/REC-CSS2/">CSS specs</a>) that
-have no DOM counterpart might get inserted into the rendering tree to satisfy
-CSS requirements. Second, the display property of the style affects which type
-of rendering object is chosen to represent the current DOM object.
-
-In the above example we would get the following rendering tree:
-@code
-RenderRoot*
- \--> RenderBody*
- |--> RenderFlow* (<H1>)
- | \--> RenderText* ("some red text")
- |--> RenderFlow* (anonymous box)
- | \--> RenderText* ("more text")
- \--> RenderFlow* (<P>)
- |--> RenderText* ("a paragraph with an")
- |--> RenderImage*
- \--> RenderText* ("embedded image.")
-@endcode
-
-A call to layout() on the RenderRoot (the root of the rendering tree) object
-causes the rendering tree to lay itself out into the available space (width)
-given by the KHTMLView. After that, the drawContents() method of KHTMLView can
-call RenderRoot->print() with appropriate parameters to actually paint the
-document. This is not 100% correct, when parsing incrementally, but is exactly
-what happens when you resize the document.
-
-As you can see, the conversion to the rendering tree removed the head part of
-the HTML code, and inserted an anonymous render object around the string "more
-text". For an explanation why this is done, see the CSS specs.
-
-@subsection structure Source code directory structure
-
-A short explanation of the subdirectories in khtml.
-- css/: Contains all the stuff relevant to the CSS part of DOM Level2
- (implementation classes only), the CSS parser, and the stuff to create
- RenderStyle objects out of Nodes and the CSS style sheets.
-- dom/: Contains the external DOM API (the DOM interface classes) for all of the DOM.
-- ecma/: The javascript bindings to the DOM and khtml.
-- html/: The html subpart of the DOM (implementation only), the HTML tokenizer
- and parser and a class that defines the DTD to use for HTML (used mainly in
- the parser).
-- java/: Java related stuff.
-- misc/: Some misc stuff needed in khtml. Contains the image loader, some misc
- definitions and the decoder class that converts the incoming stream to
- unicode.
-- rendering/: Everything that's related to bringing a DOM tree with CSS
- declarations to the screen. Contains the definition of the objects used in
- the rendering tree, the layouting code, and the RenderStyle objects.
-- xml/: The XML part of the DOM implementation, the xml tokenizer.
-
-@subsection exceptions Exception handling
-
-To save on library size, C++-exceptions are only enabled in the dom/ subdirectory,
-since exceptions are mandated by the DOM API. In the rest of KHTML's code,
-we pass an error flag (usually called "exceptionCode"), and the class that
-is part of dom/* checks for this flag and throws the exception.
-
-@subsection epilogue Final words...
-
-All the above is to give you a quick introduction into the way khtml brings an HTML/XML file to the screen.
-It is by no means complete or even 100% correct. I left out many problems, which I will perhaps add either on request
-or when I find some time to do so. Let me name some of the missing things:
-
-- The decoder to convert the incoming stream to Unicode
-- interaction with konqueror/applications
-- javascript
-- dynamic reflow and how to use the DOM to manipulate khtml's visual output
-- mouse/event handling
-- real interactions when parsing incrementally
-- java
-
-Still I hope that this short introduction will make it easier for you to get a first hold of khtml and the way it works.
-
-Now before I finish let me add a small <b>warning</b> and <b>advice</b> to all of you who plan hacking khtml themselves:
-
-khtml is by now a quite big library and it takes some time to understand how it works. Don't let yourself get frustrated
-if you don't immediately understand how it works. On the other hand, it is by now one of the libraries that
-get used a lot, that probably has the biggest number of remaining bugs (even though it's sometimes hard to
-know if some behavior is really a bug).
-
-Some parts of its code are however <b>extremely touchy</b> (especially the layouting algorithms),
-and making changes there (that might fix a bug on one web page) might introduce severe bugs.
-All the people developing khtml have already spent huge amounts of time searching for such bugs,
-that only showed up on some web pages, and thus were found only a week after the change that
-introduced the bug was made. This can be very frustrating for us, and we'd appreciate it if people
-that are not completely familiar with khtml post changes touching these critical regions to kfm-devel
-for review before applying them.
-
-And now have fun hacking khtml.
-
-Lars
-
-*/
-
-// DOXYGEN_REFERENCES = kdecore kdeui kio kparts kjs
-// DOXYGEN_EXCLUDE = test*.* html rendering xml misc ecma css imload test
-// DOXYGEN_SET_PROJECT_NAME = KHTML
-// vim:ts=4:sw=4:expandtab:filetype=doxygen