List:       xerces-j-dev
Subject:    SV: SAX parser anomaly
From:       Martin_Gülich <martin.gulich () foi ! se>
Date:       2004-09-22 7:12:55
Message-ID: 000401c4a073$9542e950$293ee396 () win ! foi ! se

Hello!
This is a trivial workaround, but if you open a file in Emacs (for Windows) and
type some non-ASCII letter, like ü, å, ä, ö etc., and try to save it, Emacs will
ask whether you want to save it in ISO-8859-1 (close to, but not identical
with, Cp1252 on Windows). Just hit Enter and the file is saved in ISO-8859-1.
Then you can remove the extra letter again...

/Martin


Old Message: 

hi Daryl,

thanks for your help.  the VMS xml file declares this at the beginning:

<?xml version="1.0" encoding="ISO-8859-1" ?>

the VMS system is really old, how can i verify what encoding the file
it creates really is?... 
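
One way to narrow it down, as a rough sketch only: decode the raw bytes strictly and see which charsets reject them. The class name EncodingCheck and the method decodesCleanly are made up for illustration. Note the limits: ISO-8859-1 assigns a character to every byte value, so it never fails this check. The test can rule out US-ASCII or UTF-8, but it cannot prove a file really is ISO-8859-1.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EncodingCheck {  // hypothetical helper, not part of Xerces

    /** Returns true if all of data is valid input for the named charset. */
    static boolean decodesCleanly(byte[] data, String charsetName) {
        try {
            Charset.forName(charsetName).newDecoder()
                   // fail instead of silently substituting U+FFFD
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: java EncodingCheck <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String cs : new String[] {"US-ASCII", "UTF-8", "ISO-8859-1"}) {
            System.out.println(cs + ": " + decodesCleanly(data, cs));
        }
    }
}
```

If the file passes as US-ASCII, the declared encoding is unlikely to be the problem, and it is worth looking at stray control bytes or record terminators instead.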

is there a way to take any file (not knowing what encoding it is in) and
convert it into an encoding that i specify?
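
A minimal sketch of the conversion half of that question, assuming the source encoding is already known or has been guessed; Java cannot reliably detect it. The class name Transcode and its method are hypothetical names for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

public class Transcode {  // hypothetical helper name

    /** Reads bytes as fromCharset and writes the same characters as toCharset. */
    static void transcode(InputStream in, String fromCharset,
                          OutputStream out, String toCharset) throws IOException {
        Reader reader = new InputStreamReader(in, Charset.forName(fromCharset));
        Writer writer = new OutputStreamWriter(out, Charset.forName(toCharset));
        char[] buf = new char[8192];
        int n;
        while ((n = reader.read(buf)) != -1) {
            writer.write(buf, 0, n);
        }
        writer.flush();  // push buffered characters through the encoder
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.out.println("usage: java Transcode <in> <fromCharset> <out> <toCharset>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0]);
             OutputStream out = new FileOutputStream(args[2])) {
            transcode(in, args[1], out, args[3]);
        }
    }
}
```

This copies the text as-is, so the encoding named in the XML declaration on the first line still has to be edited by hand to match the new encoding.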

my current java code is the following:

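// needs: import java.io.File; import javax.xml.parsers.*;
// "this" is assumed to extend org.xml.sax.helpers.DefaultHandler
mySAXFactory = SAXParserFactory.newInstance();
mySAXParser = mySAXFactory.newSAXParser();
mySAXParser.parse(new File("VMS_export_data.xml"), this);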

i've been doing some research and it seems i can use InputStreamReader
to convert a file into Unicode.  BUT you need to tell it what encoding
the file is currently in (which i don't know)...
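
For what it is worth, here is a sketch of how the InputStreamReader idea could be wired into SAX, assuming you have a candidate encoding to try. The class name ParseWithEncoding and its parse method are hypothetical. It relies on the documented behaviour of org.xml.sax.InputSource: when a character stream is supplied, the parser reads it directly and disregards the encoding declaration in the document.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ParseWithEncoding {  // hypothetical helper name

    /** Parses XML bytes, decoding them with the charset we choose ourselves. */
    static void parse(InputStream bytes, String charsetName, DefaultHandler handler)
            throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // The Reader does the byte-to-character decoding, not the parser.
        Reader reader = new InputStreamReader(bytes, charsetName);
        parser.parse(new InputSource(reader), handler);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java ParseWithEncoding <file> <charset>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            parse(in, args[1], new DefaultHandler());
            System.out.println("parsed without errors as " + args[1]);
        }
    }
}
```

Trying a few candidate charsets this way shows which ones let the parse succeed, though a successful parse does not by itself prove the characters came out right.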

any ideas?

woodchuck



--- "Conley, Daryl" <Daryl.Conley@ccra-adrc.gc.ca> wrote:

> Ensure that the character encoding is what it says it is at the top of the
> XML file.  VMS may be using a different code page.  When you open it in
> Notepad, Windows may be doing some conversion on it when you save the
> file.
> 
> daryl
> 
> -----Original Message-----
> From: Woodchuck [mailto:woodchuck_5@yahoo.com]
> Sent: September 21, 2004 11:57 AM
> To: xerces
> Subject: SAX parser anomaly
> 
> 
> hihi all,
> 
> i have no clue why this is happening.
> 
> whenever i try to parse an xml file (generated by legacy VMS system) i
> get SAXExceptions thrown.
> 
> HOWEVER, if i copy the contents of that same xml file and paste it into
> a new xml file (using notepad), this new xml file parses no problems
> whatsoever.
> 
> i have analyzed the legacy generated xml file to death (file compare
> utilities, hex editor, line by line diff utilities) but as far as i can
> tell they are the same.
> 
> can anyone shed light on this obscure anomaly?  i want to save what
> little sanity i have left...
> 
> please and thanks,
> woodchuck
> 
> (should i post to the xerces developer list?)
> 
> 
> 		
> _______________________________
> Do you Yahoo!?
> Declare Yourself - Register online to vote today!
> http://vote.yahoo.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
> For additional commands, e-mail: xerces-j-user-help@xml.apache.org
> 



		
