
List:       freetype
Subject:    Re: freetype & Xfree86 ?
From:       Robert Wilhelm <robert () physiol ! med ! tu-muenchen ! de>
Date:       1997-06-18 5:50:29


Hi David!

> Did you receive it ? I thought the mail had bounced because it was too
> large ?

I have resent it.

> 
> I was speaking about the draft on memory and object management that I sent
> about two weeks ago.. I just installed cvs 1.9 on my account. Unfortunately,
> it seems even slower than the previous version !! How is this possible ?
> I have erased my local repository to get a fresh version, but it seems that
> it will need more than 3 hours to get it ( it's the time it took the last
> time I tried it !! )..
>

I have added the draft memory.txt to the repository. Can you please
check whether this is the latest version?
 
Your cvs problems are probably due to the slow connection.
Did you use -z9 to do the transfers compressed?

> I will send a diff tomorrow anyway, ( with some other improvements.. )
> 

Just drop me a line and I will apply your patches to the repository.
My cvs access is very fast :-)


Robert

============================================================================
----------------------------------------------------------------------------


Proposal for a coherent and extensible object memory management model for
the FreeType engine. Draft 0.

-------------------------------------------------------------------------

Introduction :

  FreeType's engine's current implementation uses a  simple  growing
  heap, called the  "Font  Pool".   This  pool  allows easy but very
  limited memory management.  Now that we have  introduced  features
  to  support  multiple  opened fonts and even re-entrant builds, it
  becomes important to be able  to  manage several objects with much
  more flexibility.   This  text  is  an  overview  of  the  current
  problems  that  we're  now faced with, as well as a proposal for a
  new memory model.  This document  is  named 'draft 0', which means
  that you're welcome to add your comments and suggestions..

I. The problems :

  A TrueType font file  is  reflected  in  memory  by a large set of
  structures of various kinds and extents.  Each of these structures
  can be seen as an 'object' which must be managed according to its
  function and some choices we may make in the process of developing
  the library.  Because  the  FreeType  library  should be as easily
  extended as possible, we must provide an object management  scheme
  that is both simple and extensible.

  We'll start by classifying the data that we  manage  into  several
  groups  to present many problems introduced by the requirements of
  extensibility.

  a. According to domain :
  
    It is first important to separate our data depending on a few
    high-level abstractions that reflect the goal of the library.

   - font/face data ( a.k.a resident data ) :

       This  label  groups  all the data that is relative to a given
       face, regardless  of  device  resolutions  or specific sizes.

       In our current implementation, this data is a tree of objects
       whose root is a 'TResident_Record'.  The name 'resident'
       comes from the fact that I thought, initially, that face data
       could  be  created and loaded in a single step, and would not
       require additional  i/o  operations  later  ( unlike instance
       data, which needs much 'on-the-fly loading' ).   Though  this
       is  still the case, some new introductions could require more
       flexibility.

       Examples :  Font Header, Maximum Profile,  OS/2  Table,  Font
                   Program,   CVT   Program,   Control  Value  Table
                   (unscaled),   Kerning,   horizontal   &  vertical
                   headers, etc ...

   - pointsize data ( a.k.a instance data ) :

       This data is  specific  to  a  given device resolution and/or
       point size.  This regroups a lot of data that must be  loaded
       on  demand,  depending  on some attributes like glyph number,
       character mapping table, pixels per  EM, etc ..  The sizes of
       the required instance objects can  change  greatly  from  one
        rendering/loading/computation to another.
       
       A  very  straightforward implementation would lead to a great
       number of calls to the  'malloc' and 'free' functions, with a
       wide distribution of memory blocks sizes.  This usually leads
        to early fragmentation, slightly slower performance, and memory
       'spreading'.

       However, one very nice thing in the TrueType spec is that the
       'Maximum  Profile' table gives a lot of information that help
       in computing the maximum  size  of a face's instance objects.
       It is thus possible to  pre-allocate  enough  space  for  all
       instance   objects  and  re-use  it  for  each  new  instance
       operation.

        In our current implementation, instance data is a tree of
        objects whose root is a 'TInstance_Record'.  We currently use
       the  growing  heap to allocate all instance objects after the
       record, in what is called the 'instance segment'.

       Examples :  scaled CVT,  coordinates and flag arrays, contour
                   arrays, instruction  coderange,  graphics  state,
                   device  resolutions, compensations, pointsize and
                   ppem dependent tables ..
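       The pre-allocation idea above can be sketched as follows.  The
       field names come from the TrueType 'maxp' table, but the sizing
       formula and the function name are assumptions made only for the
       sake of illustration :

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the 'Maximum Profile' (maxp) fields used   */
/* for sizing; the field names follow the TrueType specification.    */
typedef struct
{
  unsigned short  maxPoints;
  unsigned short  maxContours;
  unsigned short  maxTwilightPoints;
  unsigned short  maxStorage;
  unsigned short  maxStackElements;
} TMaxProfile;

/* Illustrative worst-case size of the per-instance work areas:      */
/* coordinates, flags, contour ends, twilight zone, storage, stack.  */
static size_t  Instance_Max_Size( const TMaxProfile*  maxp )
{
  size_t  size = 0;

  /* x/y coordinates ( 2 shorts ) plus one flag byte per point */
  size += (size_t)maxp->maxPoints * ( 2 * sizeof ( short ) + 1 );

  /* one end-point index per contour */
  size += (size_t)maxp->maxContours * sizeof ( unsigned short );

  /* twilight zone points: coordinates only */
  size += (size_t)maxp->maxTwilightPoints * 2 * sizeof ( short );

  /* interpreter storage area and stack */
  size += (size_t)maxp->maxStorage       * sizeof ( long );
  size += (size_t)maxp->maxStackElements * sizeof ( long );

  return size;
}
```

       A buffer of this size, allocated once per instance, could then
       be re-used for every rendering or loading operation.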

   - transient data ( like execution contexts ) :

       Some data is only necessary  for some specific operation, and
       doesn't need to be kept with other font or instance  objects.

       For  example,  the execution stack used to interpret TrueType
       opcodes  during  hinting  typically  needs  about  7  Kb. Now
       imagine  a  desktop  publishing  application,   or   a   word
       processor, that can easily keep more than a dozen opened font
       instances  for  its  work.   If we give each instance its own
       execution stack, we'll take  more  than  80 Kb of memory that
        will be mainly unused, as hinting on all instances will very
        rarely, if ever, be requested at the same moment ( even
        assuming a completely re-entrant build of the library.. ).

       It is much better in this case, to be  able  to  ask  for  an
       execution  stack  on-demand,  when a new hinting operation is
       queried.

        On a larger scope, this sort of data is called 'transient',
        though transience doesn't define a domain by itself.
        Transient data is used only for a specific purpose.  It should
        be possible to manage it efficiently, with schemes like
        caching.

       That's the reason why the 'execution context' record has been
       recently  introduced  in  the  implementation.   Though   all
       transient  data is currently allocated with the instance one,
       it should be later possible to change its management.


  b. According to size/storage :

    - static data :

      These are structs usually  defined  in  the TrueType specs, or
      for our own purposes.  Their size and layout is fixed  and  is
      the same for all font objects that use them.  They may contain
      pointers to some  other  tables,  or  simply some other static
      structures..

      Examples : Font Header, Maximum Profile, Horizontal Header,
                 Graphics State, Execution Context, Resident Record,
                 Instance Record ...


    - linear data :

      These  are  arrays  of  simple  static  data.   Their  size is
      proportional to one of the font's properties, like the  number
      of glyphs, the maximum number of points, etc..

      Example : Glyph Locations, Coordinates and Flags arrays,
                Contour array, ...


    - attribute data :

      Data found in the TrueType file in a simple form, but whose
      size varies with the nature and origin of the font.  Attribute
      data  is  usually  put  in  the  resident  data.   Note  :  It
      *doesn't* contain pointers !!  Attribute data can be shared by
      instances, or replicated ( e.g. the CVT :  while the  original
      is  kept  in  the resident segment, scaled tables are found in
      the instance data..  ).

      Note that the objects that are created to store all iterations
      of  instance data operations ( like glyph instructions arrays,
      execution stacks,  etc  ..   )  are  classified as 'attribute'
      objects too.
        However, the data that usually comes from the font file, and
      is stored in the attribute objects ( e.g. glyph instructions ),
      is not an attribute.

      'Attribute' is a common label for all properties shared by all
      instances of a font, though their size is neither static nor
      linear.  A glyph's instruction set is not a common property (
      it's even glyph dependent,  not instance dependent ), however,
      the maximum size of the glyph code range _is_ ..

      Example : Font program, CVT program, Control Value Table, ...


    - hybrid data :

      All data that doesn't come in one of the three previous forms.
      Usually, hybrid data are made of a static part and one or more
      linear parts of varying sizes.

      Examples : HMTX Table, Character Mappings, Embedded bitmaps...

      It is sometimes difficult to pre-compute the maximum  size  of
      some   of   the  hybrid  data.   The  memory  structures  that
      manipulate them must then be coded with greater care.


  c. According to management :
     
    Several different management  issues  arise,  depending  on  the
    nature and use of some of our objects :

    - Shared objects :

      We need a scheme that lets us share some objects easily.  For
      example, all instances of a same face point to the same parent
      resident  record.   It  is important to see that we don't know
      completely how the FreeType engine  is going to be extended in
      the future; this means that among all of the well-known
      techniques used to manage sharing, one should focus on those
      that can provide a good separation between implementation and
      semantics.
      
    - Cached objects :

      As  seen  previously  in the description of transient data, it
      may be  necessary  or  useful  to  cache  some  important data
      structures.  As for sharing, it should be noted that the  most
      interesting  techniques  are those that allow us to extend and
      change the management of various objects in the simplest way.
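
      As an illustration, a minimal reference-counting sketch for a
      shared face object could look like this ( the structure and
      function names are stand-ins, and a thread-safe build would
      protect the counter with a mutex ) :

```c
#include <assert.h>

/* Stand-in for a shared face object; only the fields needed for    */
/* the example are shown.                                           */
typedef struct
{
  int  ref_count;   /* number of live users of this face        */
  int  destroyed;   /* set when the face data has been released */
} TShared_Face;

static void  Use_Face( TShared_Face*  face )
{
  face->ref_count++;
}

static void  Release_Face( TShared_Face*  face )
{
  if ( --face->ref_count == 0 )
    face->destroyed = 1;   /* a real engine would free the object tree */
}
```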


II. Some proposed solutions :

  1. Separating implementation from use whenever possible :

    It is, IMHO, very important  to  be  able to separate the way we
    use our objects from the way they are actually implemented.

    For example, as we're heading towards  multiple  builds  of  the
    same library ( singly-threaded, thread-safe and re-entrant )
    through the use of macros, it is  much better to be able to have
    as few #ifdefs as possible.  Not only  do  they  make  the  code
    difficult  to read and understand, they also make it much harder
    to maintain and extend..  Only by carefully selecting our macros
    and their use can we craft a convenient engine.  It is probably
    a bad thing to scatter details about sharing and caching across a
    lot of routines when it's perfectly possible to gather them in
    specific portions of the code that are called by the rest of the
    library. The following proposals are inspired by this idea :

  a. Implementing sharing and caching of objects :

    Consider the case of tracking the instances of a given face.  We
    could be content with managing a simple reference counter in each
    TResident_Record.  The counter would be incremented on instance
    creation,  and  decremented  on  instance  deletion.   A  zeroed
    counter would imply an  unused  face  object  that could then be
    discarded.  A re-entrant build  would  require  the  use  of  an
    additional mutex to protect the counter, but this wouldn't
    require much work.

    However, we may also need to maintain a list of these instances,
    to be able to close all  of  them  at once when desired ( either
    through an API call  like  "Close_Face",  or  when  the  library
    detects  a  broken  or corrupt font file ).  This is a different
    management choice.  Both of these choices have their benefits and
    costs, and could be required by the nature of the system we want
    to write a font server for..

    The idea is to be able to change the implemented scheme  easily.

    In order to do that, I recommend that we define some simple
    specific  components,  called  'managers', that could be used to
    perform the  low-level  tracking  operation,  while  providing a
    simple API to the rest of the library, e.g. an 'instance' manager
    could provide the following functions :

       PInstance_Record  New_Instance( PResident_Record face );
       // Return a new instance record for a given face
       // The record is the root of a tree of instance objects
       // that are allocated/recycled by the manager..

       void  Done_Instance( PInstance_Record  this );
        // Discards or recycles an instance's data. The owning
        // resident data can be discarded when its reference count
        // reaches zero, depending on the implementation chosen
        // within the manager

       PExecution_Context New_Execution( PInstance_Record  instance );
       // Queries a new execution context for a given instance object
       // This should be called before a new execution. The execution
       // context could be allocated on the fly

       void  Done_Execution( PInstance_Record  instance );
       // Discards or recycles an execution context previously
       // queried through 'New_Execution'.

   The implementation of these functions could  deal  with  all  the
   details  regarding  sharing  and  caching,  while the rest of the
   library would be clear of these considerations.

   Note that these functions have nothing to do with the loading  or
   interpretation of TrueType data.  The managers' functionality
   should focus on the sole purpose of object management
   (allocation/sharing/caching/recycling).
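
   As an example of what such a manager could hide, here is a hedged
   sketch of an execution context manager that recycles contexts
   through a free list instead of calling malloc/free for each hinting
   operation.  All names and sizes here are illustrative, not the
   engine's actual declarations :

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative execution context; a reduced stand-in for the ~7 Kb */
/* execution stack mentioned above.                                 */
typedef struct TExec_Context_
{
  struct TExec_Context_*  next;         /* link while on the free list */
  long                    stack[1024];  /* interpreter stack           */
} TExec_Context;

static TExec_Context*  free_contexts = NULL;

/* Query a context: recycle a cached one when available. */
static TExec_Context*  New_Execution( void )
{
  TExec_Context*  exec = free_contexts;

  if ( exec )
    free_contexts = exec->next;
  else
    exec = (TExec_Context*)malloc( sizeof ( *exec ) );

  return exec;
}

/* Return a context to the cache for later reuse. */
static void  Done_Execution( TExec_Context*  exec )
{
  exec->next    = free_contexts;
  free_contexts = exec;
}
```

   Only this manager would need to change if we later decided to cap
   the cache size or to allocate contexts statically.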


  b. Using generic lists :

    The previous example has shown that we may not know in advance
    whether we'll need to track some structures closely.  That's
    why I recommend the use of generic containers whenever possible.

    When putting objects in lists, either singly  or  doubly  linked
    ones,  one  can  either  insert  some  link  pointer  within the
    objects, or use generic  link  elements.  The latter solution is
    preferred, as it eases the extension of the library.   Moreover,
    it  becomes possible to use a single list manager, independently
    from the linked objects' types and functions.

           ______      ______      ______
          |      |    |      |    |      |
       -->| Next----->| Next----->| Next---|0 (NULL)
          |      |    |      |    |      |
          | Data |    | Data |    | Data |
          |__|___|    |__|___|    |__|___|
             |           |           |
             |           |           |
             v           v           v
        __________   __________    __________
       |          | |          |  |          |
       | Object   | | Object   |  | Object   |
       |          | |          |  |          |
       |__________| |__________|  |__________|

     Example : Use of a generic list. The list is made of simple
               linked 'List_Element' structures, containing fields
               like a 'Next' link, and a 'Data' field used to  point
               to the actual object.

     It  is  important that only the manager modifies or even parses
     list elements (this issue is critical for reentrant builds).
 
     The objects do not  necessarily  need  a  pointer to their list
     element ( which  could  be  added  in  the  object's  structure
     definitions,  though  modified  by  the  manager  only  ).  For
     example, these objects could be instance records, which already
     contain a pointer to their  common resident record.  The latter
     could then contain the head of the linked list..

     Finally, it should be noted that we may introduce some
     additional fields into the list elements to hold some extra
     information, like flags or optional destructor function
     pointers for the pointed objects..

     One should be able to switch between singly and doubly linked
     lists, if required, without touching anything other than the
     manager ( and possibly the objects' type declarations, to add
     or remove some fields.. ).

   c. Add your own suggestions here..


  2. Memory Allocation and Object Initialisation/Destruction :

    The data that is created and used to manage a specific domain
    concept ( like a face, or an instance ), is represented by a tree
    of objects. For example, the resident data is made of a root
    TResident_Record, which contains pointers to some other tables,
    which may in turn contain pointers to some sub-tables
    ( like TResident_Record->CMap_Directory->CMap_Table.. ).
    
    When we want to delete a face, we have to parse the tree in
    a depth-first order to delete all sub-tables, which means that
    destruction is not a trivial operation. One must know the structure
    and layout of the implemented hierarchy of objects to do it right.

    If we want to be able to extend the hierarchy easily, it is a good
    idea to isolate 'destructor' functions.

    Now, consider the fact that, as we may load a lot of information
    on-demand, some tables may, or may not, be present in memory.
    Consider also the case where we discover that a font file is broken
    while loading one table, and must then decide to destroy what was
    previously built. There are many ways to do that. Here is one :

    ( Note : this is fictitious code.. )

    Bool  Load_Resident_Table( TT_Stream stream, PResident_Record* res )
    {
      ....

      if ( !Load_TrueType_Directory( resident ) )
          goto Error_1;

      if ( !Load_TrueType_Header( resident ) )
          goto Error_2;

      if ( !Load_TrueType_MaxProfile( resident ) )
          goto Error_3;

      if ( !Load_TrueType_CMaps( resident ) )
          goto Error_4;

      ....

      // success, do the following work

      *res = resident;
      return SUCCESS;

    Error_n:
       ...

    Error_5:
       Free_CMaps( resident );

    Error_4:
       Free_MaxProfile( resident );

    Error_3:
       Free_Header( resident );

    Error_2:
       Free_Directory( resident );

    Error_1:
       // nothing to free

       *res = NULL;
       return FAILURE;
    }       
   
    One can see that, in this example, allocations are 'unrolled' in
    the case of an error. Though this is elegant, it requires an additional
    function to destroy the resident data normally when a face's life-cycle
    ends. This means another routine, that should look like :

    void Free_Resident( PResident_Record res )
    {
      ....
      Free_CMaps     ( res );
      Free_MaxProfile( res );
      Free_Header    ( res );
      Free_Directory ( res );
    }

   This means that we now have two copies of similar code. If we now
   decide to add a new table to the resident record, we'll have to add
   a call to its loader in 'Load_Resident_Table', plus a call to its
   destructor in both the loader's error path and 'Free_Resident'.
   That isn't really a good idea.

   Now imagine the case of more dynamic data, where some sub-tables may,
   or may not be there, depending on various and unpredictable factors
   ( font file contents, user calls, etc .. ). This introduces some
   complexities in the destructor that are not necessarily duplicated
   in the loader. Now, we would have two different destruction schemes
   for the same data. This is still a bad thing.

   That's why I think that each table/structure that is not static needs
   a single destructor, that must be able to handle partially built data.
   
   Something that should look like :

   void Free_Resident( PResident_Record res )
   {
     ...

     if ( res->cmaps_table )
     {
       Free_CMaps( res->cmaps_table );
       res->cmaps_table = NULL;
       res->numCMaps    = 0;
     }

     if ( res->maxProfile )
     {
       Free_MaxProfile( res->maxProfile );
       res->maxProfile = NULL;
     }

     if ( res->fontHeader )
     {
       Free_Header( res->fontHeader );
       res->fontHeader = NULL;
     }

     ...

   }

   Using a simple convention : 
      a NULL pointer indicates that the table isn't there. Otherwise,
      it should be destroyed, and other fields associated with the table
      should be set to 0 ( like 'res->numCMaps' ).

   The Loader could also become simpler :

   {
     ...

     resident = New_Resident_Header();
     // Get a new ( fresh or recycled ) resident header

     Init_Resident_Header( resident );
     // Inits the resident header

     if ( ! Load_Table1( resident )  ||  // try to load the sub-tables
          ! Load_Table2( resident )  ||
          ! Load_Table3( resident ) )
     {
        // an error occurred
       Free_Resident_Header( resident );
       *res = NULL;
       return FAILURE;
     }

     // success

     *res = resident;
     return SUCCESS;
   }


   Here, we can see that :

     1. We obtain a new resident header through a function call
        ( or a macro ), which is better than using ALLOC or (malloc)
        directly.

     2. We used a function 'Init_xx', that I would call a 'constructor',
        whose main role is to clear all fields of the resident record,
        especially those that contain pointers. The constructor can also
        contain some other functionality, like creating a new mutex
        for the object in a reentrant build. These are important tasks,
        but I don't think that they should be performed by the loader
        function directly.

   So the idea is to declare, for every table that might contain pointers
   to an owned subtable, several functions :

   - an Init_xxxxxx constructor, used to set all fields to 0 or NULL.

     The constructor can be a macro that casts a simple
     'memset( table, 0, sizeof ( *table ) )'. But it's good to have
     the ability to change it for the sake of extensibility.

     In some cases, it's simply better to allocate and load the tables
     at the same time, in the same routine (that's what we're currently
     doing). If we decide to keep this technique for some of our functions,
     we'll still need a decent destructor..

   - a Free_xxxxx destructor, that should test the availability of
     each sub-table pointer before freeing them (this issue is important
     in the cases of partial and dynamic allocations).

   and possibly :

   - some life-cycle functions like New_xxxxx and Done_xxxxx that would
     allow later changes in the way we manage some of these tables.
     They are not a requirement for all tables but can be very useful.
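
   As a purely illustrative sketch of this convention ( the table and
   field names below are hypothetical, not the engine's actual ones ) :

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table owning one sub-table. */
typedef struct
{
  void*  cmaps_table;
  int    numCMaps;
} TSample_Record;

/* Constructor: clear all fields, especially the pointers. */
#define Init_Sample( table )  memset( (table), 0, sizeof ( *(table) ) )

/* Destructor: must handle partially built data, hence the NULL     */
/* test before freeing, and the reset of the associated fields.     */
static void  Free_Sample( TSample_Record*  rec )
{
  if ( rec->cmaps_table )
  {
    free( rec->cmaps_table );
    rec->cmaps_table = NULL;
    rec->numCMaps    = 0;
  }
}
```

   With this convention, calling the destructor on a half-loaded or
   already-freed record is always safe.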

   I think that the names of these functions are quite expressive, but we
   could also find some other ones.

   Talking about migration, I think it should be fairly easy to begin
   writing the Init_xxx and Free_xxx functions for our common non-static
   tables. That would probably be a great step in the right direction.

   I don't think we'll need to touch a lot of code. We already have some
   ALLOC macros that can be changed very easily to get rid of the growing
   heap..
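
   For example, such a macro could funnel every allocation through a
   single definition ( a sketch only : the real ALLOC macro's exact
   convention may differ ) :

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative ALLOC-style macro: every allocation goes through    */
/* this one definition, so replacing the growing heap by plain      */
/* 'calloc' ( or any custom allocator ) touches only this line.     */
/* Returning non-zero on failure is an assumption, not necessarily  */
/* the engine's actual convention.                                  */
#define ALLOC( ptr, size )  ( ( (ptr) = calloc( 1, (size) ) ) == NULL )
```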


To be continued ..





    

