'Re: The next file format'

List:       kde-edu-devel
Subject:    Re: The next file format
From:       Bruno Coudoin <bruno.coudoin () gcompris ! net>
Date:       2014-08-26 23:54:30
Message-ID: 53FD1E36.2050603 () gcompris ! net

On 26/08/2014 21:51, Andreas Xavier wrote:
> Hello Bruno,
>
> I am working on the coding of our new common file-handling library.
> I have read the two websites that you referenced and I will comment
> on your email below.
>
> I browsed some of the code at https://git-next.kde.org/kde/gcompris.
> There are many activities (~100), congratulations.
Thanks, we have ported 79 of the 140 activities in the Gtk+ version.
>   They are ported to
> QT5, double congratulations.  I looked at 3, penalty, missing letter
> and clockgame to try and understand your requirements.
>
> It looks like gcompris is looking for a common method to store
> semantically disparate resources to provide a uniform interface to
> the activities resources and common distribution.  Judging only
> from the resources that are designated qrc:/location, you will be
> storing activity backgrounds and source code files etc.  If I have
> misunderstood what you want to put in the data repository, then
> some of my concerns below are inappropriate.
>
> I think we are trying to do something slightly different.
> We are trying to store information that has semantic meaning
> common to all the applications.  We are not trying to
> store application specific information like backgrounds, cursors etc.
> We expect the information to be re-usable or of interest to more than
> one application.
Well, what you saw are the activities themselves. Each one is bundled in
an rcc file that contains a manifest (ActivityInfo.qml) and a set of QML,
JavaScript, image and audio files. They are then loaded by the GCompris
binary at runtime. On Linux, with the 79 activities, the GCompris binary
is only 300KB and we have 79 rcc files, one per activity, that take 19MB.
BTW, these activity rcc files could easily be distributed through a web
server, either for updates or for the initial version. We could bypass the
'slow update' cycle of Linux distributions, but this is another story.

The subject is about sharing and distributing the content of some
activities. If you look at the 'missing letter' activity, there is a
JavaScript file, missing-letter.js, that contains the dataset of the
activity in JSON. We don't have this yet, but the goal is to let a
teacher create and share a dataset and assign it to their students:
https://git-next.kde.org/kde/gcompris/blob/master/src/activities/missing-letter/missing-letter.js

>
> I do think that we are overlooking some of our own application
> specific differences particularly in the definition of the courses with
> lessons/units.  Perhaps a method to designate application specific
> information, that is blackbox, handled by a application provided
> editor and otherwise ignored is a solution.
Hum, if we try to make a dataset format that suits all the needs, it will
be at the expense of its expressiveness.

>
> From the terminology that you use on the data handling page,
> "Dataset editors are not forcibly only activity-specific." I think that you
> are well aware of these issues.
>
> Anyway, if we proceed to merge these it would be helpful if you could
> pick out an application to use as a target.  I was planning on using
> KAnagram, Artikulate, Parley and Parley's editor as targets of increasing
> feature richness.  Ideally, a good target would be a superset of the
> features gcompris expects from the new library.
You can take 'missing letter' as an example, but I like this one, which is
more JavaScript than JSON:
https://git-next.kde.org/kde/gcompris/blob/master/src/activities/memory-wordnumber/dataset.js

>
>> This may be list of words for a hangman, letters for a typing tutor,
>> images and voices for language learning tools, a text with holes for a
>> reading exercises, ...
>>
>> As you can see the type of exercises are very different and we cannot
>> end up with a dataset structure common to all of them. Also, an
>> important part of the task is to provide a way for teachers to create
>> datasets, assign them to children and if they want share them.
>>
>> Based on our requirements we ended up with a different proposal than
>> yours but we are also in the early stage on it, Holger just wrote what
>> we came up with in Randa on our wiki:
>> http://gcompris.net/wiki/Dataset_handling
>>
>> As you can see in our idea we define a 'datatype' which would be common
>> to all and a 'payload' which would be readable only by a given activity
>> and an editor following its mime type. Thus the whole infrastructure we
>> can set up to manage datasets is not specific to a given type of exercise.
> I have a concern here, that I will gently raise.
>
> As you pointed out, some data types have natural semantics, which makes
> generalizing them into a type that can be re-used by many applications easy:
> Alphabets, words, grammar, spoken words, sets of things.  My question is
> if mixing application specific information with more general semantically
> useful information is what people want.  I think this is also CoLa's concern
> with my desire to include vocabulary structure (i.e. grammar) in the file format.
I agree that there is important design work to do on a language dataset
format. We may also want to create a dataset for a geometry activity where
a teacher asks children to create specific shapes. It will be hard to come
up with a single dataset format.

That is why I am more interested in a dataset container that we can all
share and a dataset format specific to a set of activities.
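
To make the container idea concrete, here is a rough sketch, not something we have implemented. Only the 'datatype' and 'payload' fields come from our wiki proposal; the function names, the datatype string and the payload layout are made up for illustration. The shared code only looks at the datatype and hands the payload, untouched, to whichever activity registered for it.

```javascript
// Hypothetical sketch: a generic dataset container. Only `datatype` is
// interpreted by the shared code; `payload` stays opaque to it.
const handlers = new Map();

// An activity (or its editor) registers the datatype it understands.
function registerHandler(datatype, handler) {
  handlers.set(datatype, handler);
}

// Shared infrastructure: route a container to the matching activity.
function openDataset(container) {
  const handler = handlers.get(container.datatype);
  if (!handler) {
    throw new Error("No activity registered for datatype: " + container.datatype);
  }
  return handler(container.payload);
}

// Made-up datatype name and payload layout, for illustration only.
registerHandler("application/x-example-missing-letter", function (payload) {
  return payload.words.map(function (w) { return w.toUpperCase(); });
});

const example = {
  datatype: "application/x-example-missing-letter",
  payload: { words: ["cat", "dog"] }
};
```

With this split, openDataset(example) would be answered by the registered activity, and a container with an unknown datatype is rejected without the shared code ever reading its payload.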
>> Also we have not mentioned it in this wiki page but we are already
>> distributing in the new GCompris voice files as Qt qrc files. They are
>> Qt specific but very easy to manage because you can load them
>> dynamically and then access their content through qrc:// url anywhere in
>> Qml. To us, 'qrc' is good candidate for the container of the datasets as
>> it is Qt native.
> Qrc works well. If the data is intended to be re-used by multiple applications
> it needs to be external to the application, perhaps in the zip.
Yes, of course, we are talking about external binary resources:
http://qt-project.org/doc/qt-5/resources.html#external-binary-resources
>
>> Some feedback on your proposal, I am confused by the 'confidence level'.
>> If it is a student mark, it may not be desirable to put it in the
>> dataset itself because it makes sense to have it on a read-only storage
>> area (most distros will do that). On this topic at GCompris we are
>> interested in a teacher specific tool to help them in their daily usage,
>> we started specifying it here:
>> http://gcompris.net/wiki/Administration_design
>>
> I think Inge explained this elsewhere but I will elaborate.  We plan to overlay
> files to allow vocabulary building, lesson planning and training to be
> separate stages. A single user might most conveniently use a monolithic file
> for all stages. But in other contexts a student might reference read-only
> files for different sections of the data. For example this overlay stack:
>
> (Words and Grammar) Read - Only, system file
> (Course Plan) Read - Only,  different source, perhaps teacher editable
> (Student Goals and Training Data)  Editable per user.
>
I share your concern here. It is true that it may be desirable to have one
dataset with content, voices and images, and another dataset with a course
plan that references the data in the first one. Is this what you have in mind?
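
To check that I read your overlay stack correctly, here is a minimal sketch of how I picture it. It is an assumption on my side, not your design: the layer names, the keys and the makeStack helper are all hypothetical. Lookups go from the top layer down, and writes only ever land in the editable per-user layer.

```javascript
// Hypothetical sketch of an overlay stack: read-only lower layers,
// one editable layer on top. Layer names and keys are made up.
function makeStack(layers) {
  return {
    // Look a key up from the top layer down; first hit wins.
    get: function (key) {
      for (let i = layers.length - 1; i >= 0; i--) {
        if (Object.prototype.hasOwnProperty.call(layers[i].data, key)) {
          return layers[i].data[key];
        }
      }
      return undefined;
    },
    // Writes only ever touch the top layer, and only if it is editable.
    set: function (key, value) {
      const top = layers[layers.length - 1];
      if (top.readOnly) {
        throw new Error("Top layer is read-only: " + top.name);
      }
      top.data[key] = value;
    }
  };
}

// Words and grammar: read-only system file.
const systemLayer = { name: "words", readOnly: true, data: { "chat": "cat" } };
// Course plan: read-only here, refers to keys of the layer below.
const courseLayer = { name: "course", readOnly: true, data: { "lesson1": ["chat"] } };
// Student goals and training data: editable per user.
const userLayer = { name: "student", readOnly: false, data: {} };

const stack = makeStack([systemLayer, courseLayer, userLayer]);
stack.set("score:chat", 3);
```

Here the student's score is stored only in the per-user layer, while the word itself is still read from the system file underneath.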

Bruno.

_______________________________________________
kde-edu mailing list
kde-edu@mail.kde.org
https://mail.kde.org/mailman/listinfo/kde-edu