
List:       xerces-c-dev
Subject:    RE: Validation Problem with Foreign Characters
From:       White Daniel E CONT DLVA <WhiteDE () NSWC ! NAVY ! MIL>
Date:       2004-10-25 14:31:47
Message-ID: DD260610C20AD211BC9400805F9FCCA80C53BE2B () nswcdlvaex05 ! nswc ! navy ! mil

Actually, I just found the answer I was looking for:

I read in the XML off of a socket and into a character buffer and then into
a MemBufInputSource object.
I found the "setEncoding" method to that object, and added the line:

memBufIS->setEncoding ( XMLString::transcode( "iso-8859-1" ) ) ;

Now, the foreign character in the generic xs:string field validates, but the
other one fails.
That's expected, because that field is validated against a pattern of [A-Z].

Thanks for the assistance.  Maybe this thread will help someone else.

-----Original Message-----
From: Alberto Massari [mailto:amassari@progress.com]
Sent: Monday, October 25, 2004 10:05 AM
To: xerces-c-dev@xml.apache.org
Subject: RE: Validation Problem with Foreign Characters


If you get a UTFDataFormatException, you are using the UTF-8 transcoder, 
and not the iso-8859 one. Have you tried running the sample DOMPrint on the 
XML file? If it fails, you should attach a test XML to your mail so that we 
can reproduce the problem.
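
To illustrate why ISO-8859-1 bytes can trip a UTF-8 decoder, here is a
minimal, self-contained C++ sketch. It is not Xerces code and does not
reproduce the parser's exact error message: the function names isValidUtf8
and latin1ToUtf8 are hypothetical, and the validity check is deliberately
simplified (it looks only at lead/continuation byte structure). It shows
that the bytes e5, e6 and f8 from the example are not well-formed UTF-8
as sent, but become well-formed once each one is re-encoded as a two-byte
sequence.

```cpp
#include <cassert>   // used by the checks in the usage note below
#include <cstddef>
#include <string>

// Returns true if `bytes` has the lead/continuation structure of UTF-8.
// Simplified sketch: overlong forms and surrogates are NOT rejected.
bool isValidUtf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;                      // continuation bytes expected
        if (lead < 0x80)                extra = 0;  // plain ASCII
        else if ((lead & 0xE0) == 0xC0) extra = 1;  // 2-byte sequence
        else if ((lead & 0xF0) == 0xE0) extra = 2;  // 3-byte sequence (0xE5, 0xE6 land here)
        else if ((lead & 0xF8) == 0xF0) extra = 3;  // 4-byte sequence
        else return false;                          // invalid lead byte (0xF8 lands here)

        // The sequence must not run past the end of the buffer.
        if (extra > bytes.size() - i - 1) return false;

        // Every continuation byte must look like 10xxxxxx.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}

// Re-encodes ISO-8859-1 bytes as UTF-8. Each Latin-1 byte is the code point
// of the same value, so bytes >= 0x80 become a two-byte sequence.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string out;
    out.reserve(latin1.size() * 2);
    for (std::size_t i = 0; i < latin1.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(latin1[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With these helpers, assert(!isValidUtf8("H\xE5ll\xF8 Th\xE6re")) and
assert(isValidUtf8(latin1ToUtf8("H\xE5ll\xF8 Th\xE6re"))) both hold. A
parser that is told the correct encoding does the equivalent conversion
internally, so the data itself does not need to be rewritten.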

Alberto

At 09.52 25/10/2004 -0400, White Daniel E CONT DLVA wrote:
>Yes it does, and I have some more info:
>
>I got the error "Invalid or incomplete multibyte or wide character" when it
>was in a field that was validated by a pattern of [A-Z]
>
>I tried it again in a field of just xs:string -- a description field -- and
>I got this error:
>
>Fatal Error: Type: UTFDataFormatException    Message: invalid byte 2 () of a
>2-byte sequence
>
>-----Original Message-----
>From: Alberto Massari [mailto:amassari@progress.com]
>Sent: Monday, October 25, 2004 9:42 AM
>To: xerces-c-dev@xml.apache.org
>Subject: Re: Validation Problem with Foreign Characters
>
>
>At 09.36 25/10/2004 -0400, White Daniel E CONT DLVA wrote:
> >I am trying to accept XML with some foreign characters in it:
> >
> >Like this: Hållø Thære.  These are 8-bit, iso-8859-1 characters, ( e5, e6,
> >and f8 )
> >
> >The XercesDOMParser's validator throws an exception that says "Invalid or
> >incomplete multibyte or wide character"
> >
> >Any clues out there ?
>
>Does your XML file start with <?xml version="1.0" encoding="iso-8859-1"?> ?
>
>Alberto
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
>For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


