List:       xml-dev
Subject:    Re: [xml-dev] Four fine text-based data formats ... liberate yourself from one (silo) data format
From:       Shaun McCance <shaunm () gnome ! org>
Date:       2013-03-24 15:34:22
Message-ID: 1364139263.2224.609.camel () recto

On Sun, 2013-03-24 at 12:54 +0000, Costello, Roger L. wrote:
> Hi Folks,
> 
> Here are four fine text-based data formats. They are all well supported.
> 
> 1. XML: obviously you know about this data format and its support.
> 
> 2. JSON: data that is in this format can be readily queried and
> manipulated in a JavaScript program, and support for JavaScript is
> growing at a breathtaking rate. From Simon St. Laurent: There are also
> piles of public APIs using JSON.  Programmable Web and similar places
> keep showing growth in JSON-based APIs.  See, for example:
> 
> http://blog.programmableweb.com/2012/12/17/leading-apis-say-bye-xml-in-new-versions/ 
> 
> 3. CSV: data in the form of comma-separated-values (CSV) can be
> readily queried and manipulated in Excel. There are many tools that
> support CSV; here's one hosted on Google Code:
> 
> http://code.google.com/p/csvfix/ 
> 
> 4. Plain text: of all the data formats, this one is by far the most
> widely supported. Every computer on the planet has at least one text
> editor (probably several). There are many, many powerful tools, such
> as vi and emacs, that can readily query and manipulate plain text
> files. 

What do you mean by "plain text"? XML, JSON, and CSV are all plain
text. Plain text isn't a syntax. It's an assertion that the file
doesn't contain NUL (0x00) bytes.
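
To make that concrete, here is a rough sketch of the "no NUL byte" test.
The function name and the sample byte strings are made up for
illustration, and real tools use more elaborate heuristics than this.

```python
def looks_like_plain_text(data: bytes) -> bool:
    """Hypothetical helper: treat data as 'plain text' if it has no NUL byte."""
    return b"\x00" not in data

# Illustrative samples, not taken from any real file.
samples = {
    "xml": b"<a><b>1</b></a>",
    "json": b'{"a": {"b": 1}}',
    "csv": b"a,b\n1,2\n",
    "binary": b"\x89PNG\x00\x00",
}

results = {name: looks_like_plain_text(data) for name, data in samples.items()}
for name, is_text in results.items():
    print(name, is_text)
```

XML, JSON, and CSV all pass the same check, which is the point: "plain
text" says nothing about how the bytes are structured.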

> Shouldn't we define standards - using a particular data format - for
> data exchanges? No! Define standards at the semantic level, not the
> syntax level. Let everyone use their own syntax.

There are times when defining the semantics in a syntax-neutral way
is a good idea. Dublin Core does this. As a result, it gets used in
tons of formats in different syntaxes.

But if everybody gets to use their own syntax, we will never have
interoperability, even if the same semantics are encoded in there.
Somebody has to write the code to parse the syntax and extract the
semantic information.
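
As a sketch of what "somebody has to write the code" means: below, the
same two Dublin Core-style fields (title, creator) are carried in three
syntaxes, and each one needs its own extraction function. The record
layouts, field names, and sample values are assumptions invented for
this example; only the Dublin Core element namespace is real.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Dublin Core element namespace, in ElementTree's {uri}name form.
DC = "{http://purl.org/dc/elements/1.1/}"

def from_xml(text):
    """Extract title/creator from a hypothetical XML record."""
    root = ET.fromstring(text)
    return {"title": root.findtext(DC + "title"),
            "creator": root.findtext(DC + "creator")}

def from_json(text):
    """Extract title/creator from a hypothetical JSON record."""
    obj = json.loads(text)
    return {"title": obj["title"], "creator": obj["creator"]}

def from_csv(text):
    """Extract title/creator from the first row of a hypothetical CSV file."""
    row = next(csv.DictReader(io.StringIO(text)))
    return {"title": row["title"], "creator": row["creator"]}

XML_DOC = ('<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
           '<dc:title>Example</dc:title><dc:creator>Jane Doe</dc:creator>'
           '</record>')
JSON_DOC = '{"title": "Example", "creator": "Jane Doe"}'
CSV_DOC = "title,creator\nExample,Jane Doe\n"

records = [from_xml(XML_DOC), from_json(JSON_DOC), from_csv(CSV_DOC)]
print(records)
```

The semantics are identical in all three, but a receiver that only has
from_json gets nothing out of the other two.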

Also, syntax absolutely informs the semantics. For example, you can't
nest things in CSV or INI files. The best you can do is define some
record as a pointer to another. That's not a standard feature of the
syntax though, so there's more code you get to write.
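
Here is roughly what that extra code looks like. This is only a sketch:
the id/parent_id column convention is something I made up for the
example, which is exactly the problem, since nothing in CSV itself
defines it.

```python
import csv
import io

# Hypothetical convention: parent_id points at another row's id.
CSV_TEXT = (
    "id,parent_id,name\n"
    "1,,root\n"
    "2,1,child-a\n"
    "3,1,child-b\n"
    "4,2,grandchild\n"
)

def build_tree(text):
    """Rebuild nesting from flat rows that point at their parent row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # First pass: create a node for every row, keyed by its id.
    nodes = {row["id"]: {"name": row["name"], "children": []} for row in rows}
    roots = []
    # Second pass: attach each node to its parent, or treat it as a root.
    for row in rows:
        node = nodes[row["id"]]
        if row["parent_id"]:
            nodes[row["parent_id"]]["children"].append(node)
        else:
            roots.append(node)
    return roots

tree = build_tree(CSV_TEXT)
print(tree)
```

In XML or JSON the nesting comes for free from the parser. Here every
consumer has to know the convention and reimplement build_tree.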

As with all things, there are trade-offs, and you can't just paint it
with a broad brush of "define semantics, not syntax". If you're dealing
with simple key-value information, maybe it's a good idea. If you want
your semantics embedded in existing host languages, maybe it's a good
idea. But if you want two machines to talk to each other and actually
get something done, you need to define syntax.

--
Shaun



_______________________________________________________________________

XML-DEV is a publicly archived, unmoderated list hosted by OASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.

[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/
Or unsubscribe: xml-dev-unsubscribe@lists.xml.org
subscribe: xml-dev-subscribe@lists.xml.org
List archive: http://lists.xml.org/archives/xml-dev/
List Guidelines: http://www.oasis-open.org/maillists/guidelines.php

