On Thursday 25 October 2012 12:20:42 Mark wrote:
> So i would end up with a:
> struct SortEntry
> {
>     QByteArray collatingKey;
>     KFileItem *fileItem;
> };
>
> Where the collatingKey is meant to be what?

Let's start with comparing using strcmp. It works like this, omitting
details:

    while (*s1 && *s2 && *s1 == *s2) {
        ++s1;
        ++s2;
    }
    return *s1 - *s2;

As you see, the code comparing each character is very fast, but it
does not handle anything special. In particular, it does not

- compare case-insensitively
- sort German 'ä' after 'a' but before 'b' ("locale aware")
- sort 'a12' after 'a2' ("natural sorting")

The solution to the first inability is known to you: you convert each
string to lower case before comparing. This converted string is the
actual "sort key", i.e. what you use for the sort's "lessThan"
functor:

    'A' -> 'a'
    'B' -> 'b'
    'a' -> 'a'

Now if you sort ('A', 'B', 'a') by these keys, you get
('A', 'a', 'B') after looking up the actual item which you stored
within the "SortEntry".

The very same idea can be applied to tackle the second inability. You
convert each string to its locale-dependent "collating" string, often
just called a "sequence", because it usually contains non-characters.
Here are example keys that sort 'a' < 'ä' < 'b':

    'a' -> 'a'
    'b' -> 'b'
    'ä' -> 'a' '\377'

The appended 0xFF character ensures that 'ä' always sorts after 'a',
regardless of which other character follows. But it will never sort
after 'b', because of the leading 'a' in the sort key.

If you look up the Unicode collation algorithm, you will see that it
is much more complicated, but the basic idea is the same. It should
not bother you for the initial version; you can simply use the glibc
function "strxfrm()" to get the collating sequence for the string
parts on which you would otherwise call "localeAwareCompare".

The last inability can be solved in the same way.
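
To make the first two cases concrete, here is a minimal sketch of the
sort-key idea. It is only an illustration under some assumptions:
std::string stands in for both QByteArray and KFileItem, and
makeCollatingKey() and sortedByKey() are hypothetical helper names,
not existing KDE or Qt API. The key is lower-cased (single-byte
characters only) and then passed through strxfrm(); unless the program
has called setlocale(LC_COLLATE, "") beforehand, strxfrm() works in
the "C" locale and only the case folding has a visible effect.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the SortEntry from the quoted mail; std::string replaces
// QByteArray and KFileItem so that the sketch is self-contained.
struct SortEntry {
    std::string collatingKey;   // computed once per item
    const std::string *item;    // the actual item to look up after sorting
};

// Hypothetical helper: lower-case the text, then ask strxfrm() for the
// collating sequence of the current LC_COLLATE locale.
std::string makeCollatingKey(const std::string &text)
{
    std::string lowered(text);
    for (char &c : lowered)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));

    // The first call only reports the required length (without the NUL).
    const std::size_t needed = std::strxfrm(nullptr, lowered.c_str(), 0);
    std::string key(needed + 1, '\0');
    std::strxfrm(&key[0], lowered.c_str(), key.size());
    key.resize(needed);
    return key;
}

// Hypothetical helper: sort items by their precomputed keys and return
// the items in sorted order. Comparing keys is a plain byte comparison.
std::vector<std::string> sortedByKey(const std::vector<std::string> &items)
{
    std::vector<SortEntry> entries;
    entries.reserve(items.size());
    for (const std::string &item : items)
        entries.push_back({makeCollatingKey(item), &item});

    // stable_sort keeps the input order of items with equal keys.
    std::stable_sort(entries.begin(), entries.end(),
                     [](const SortEntry &a, const SortEntry &b) {
                         return a.collatingKey < b.collatingKey;
                     });

    std::vector<std::string> result;
    result.reserve(entries.size());
    for (const SortEntry &entry : entries)
        result.push_back(*entry.item);
    return result;
}
```

The point of the design is that the expensive conversion runs once per
item, while the comparison inside the sort stays as cheap as strcmp.
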
For natural sorting, simply _prepend_ to each number a code which
states its magnitude (number of digits). Example:

    'a1234' -> 'a' '\004' '1234'
    'a56'   -> 'a' '\002' '56'
    'a8'    -> 'a' '\001' '8'

A combined algorithm could, for example, return these collating
sequences as sort keys:

    'a12' -> 'a' '\002' '12'
    'ä8'  -> 'a' '\377' '\001' '8'
    'A6'  -> 'a' '\001' '6'
    'Ä56' -> 'a' '\377' '\002' '56'

If you sort by those keys, you get ('A6', 'a12', 'ä8', 'Ä56'), which
might be what you want, but you could also want ('A6', 'ä8', 'a12',
'Ä56'), which would require a different algorithm for generating the
sort key.

Christoph Feck (kdepepo)
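
P.S. A minimal sketch of the magnitude prefix described above, in case
it helps. naturalKey() is a hypothetical name, not an existing
function. The sketch makes several assumptions: it handles ASCII only,
lower-cases letters, clamps the length byte at 255 for very long digit
runs, does not treat leading zeros specially, and leaves out the
locale-aware part entirely.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: build a sort key in which every run of digits is
// preceded by one byte holding the number of digits in that run.
std::string naturalKey(const std::string &text)
{
    std::string key;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isdigit(c)) {
            // Find the end of this run of digits.
            std::size_t end = i;
            while (end < text.size()
                   && std::isdigit(static_cast<unsigned char>(text[end])))
                ++end;
            const std::size_t digits = end - i;
            // Magnitude byte first (assumption: clamp at 255), then digits.
            key += static_cast<char>(std::min<std::size_t>(digits, 255));
            key.append(text, i, digits);
            i = end;
        } else {
            // Non-digits are only case-folded in this sketch.
            key += static_cast<char>(std::tolower(c));
            ++i;
        }
    }
    return key;
}
```

With such keys, 'a2' sorts before 'a12' because the comparison is
already decided at the magnitude byte ('\001' < '\002'), before any
digit is looked at.
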