
List:       xerces-c-dev
Subject:    [jira] Updated: (XERCESC-1839) The line number and column number
From:       "Bin Dai (JIRA)" <xerces-c-dev () xml ! apache ! org>
Date:       2008-11-08 1:04:44
Message-ID: 118382849.1226106284158.JavaMail.jira () brutus


     [ https://issues.apache.org/jira/browse/XERCESC-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bin Dai updated XERCESC-1839:
-----------------------------

    Description: 
The XML file (test.xml) below contains two French letters (marked here as ','). 
File test.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<Project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<!-- d'�t� 2008 -->
</Project>

When I ran DOMPrint test.xml or SAXPrint test.xml, a fatal error occurred. Here is
the output:

 DOMCount test.xml

Fatal Error at file /home/bdai/xfndry/HEAD/env/xerces-c-3.0.0-x86_64-linux-gcc-3.4/bin/test.xml, line 1, char 40
  Message: invalid byte 't' at position 2 of a 3-byte sequence

Errors occurred, no output available
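
The wording of the message can be reproduced with a small sketch. It assumes
(this is not stated in the report) that the two French letters were saved as
single ISO-8859-1 bytes such as 0xE9, which a UTF-8 decoder reads as the lead
byte of a 3-byte sequence. The function names are hypothetical and are not
part of the Xerces-C++ API, and the check is simplified (it ignores overlong
forms and surrogates).

```cpp
#include <cstddef>
#include <string>

// Sequence length implied by a UTF-8 lead byte, or 0 if the byte cannot
// start a sequence (a stray continuation byte or an invalid lead byte).
std::size_t utf8SequenceLength(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Returns the 1-based position, within the sequence starting at `start`,
// of the first byte that is invalid or missing, or 0 if the sequence is
// structurally well formed.
std::size_t firstBadPosition(const std::string& bytes, std::size_t start) {
    if (start >= bytes.size()) return 0;  // nothing to check
    const std::size_t len =
        utf8SequenceLength(static_cast<unsigned char>(bytes[start]));
    if (len == 0) return 1;               // the lead byte itself is invalid
    for (std::size_t k = 1; k < len; ++k) {
        if (start + k >= bytes.size()) return k + 1;  // truncated sequence
        const unsigned char b = static_cast<unsigned char>(bytes[start + k]);
        if ((b & 0xC0) != 0x80) return k + 1;         // not 10xxxxxx
    }
    return 0;
}
```

Under that assumption the bytes are E9 74 E9: the lead byte 0xE9 announces a
3-byte sequence, and the byte at position 2 is 't' (0x74), which is not a
continuation byte, so firstBadPosition returns 2.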

Regardless of where I move the guilty line, the reported line number (=1) and column
(=40) do not change. Debugging through the code, I see that XMLReader.cpp keeps track
of the line number and column number, but it calls XMLUTF8Transcoder.cpp, which
examines each byte. When the transcoder finds a byte that is not valid UTF-8, it
throws an exception, and XMLReader never updates its line and column numbers.

If the error occurs after the first 4K (hex) bytes, the reported line number is
updated to a new value, but it then stays unchanged within the second 4K (hex) bytes,
regardless of where the error is.
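
The behaviour described above can be modelled with a toy reader. This is a
sketch of the mechanism only, not the Xerces-C++ code: decodeBlock,
reportedPosition and the blockSize parameter are hypothetical, and for
brevity the stand-in transcoder rejects any byte >= 0x80. The reader decodes
a whole block before consuming any character from it, so when decoding
fails, the position still reflects the end of the previous block.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>

struct Position {
    std::size_t line = 1;
    std::size_t column = 1;
};

// Hypothetical stand-in for a transcoder: decodes one block of raw bytes
// and throws on the first byte it considers invalid.
void decodeBlock(const std::string& raw, std::size_t begin, std::size_t end,
                 std::string& out) {
    for (std::size_t i = begin; i < end; ++i) {
        if (static_cast<unsigned char>(raw[i]) >= 0x80) {
            throw std::runtime_error("invalid byte");
        }
        out.push_back(raw[i]);
    }
}

// Toy reader: refill one block, then consume its characters while updating
// line and column. Returns the position it would report on failure, or the
// end position if the whole input decodes.
Position reportedPosition(const std::string& raw, std::size_t blockSize) {
    Position pos;
    for (std::size_t begin = 0; begin < raw.size(); begin += blockSize) {
        const std::size_t end = std::min(raw.size(), begin + blockSize);
        std::string decoded;
        try {
            decodeBlock(raw, begin, end, decoded);
        } catch (const std::runtime_error&) {
            return pos;  // stale: nothing from this block was consumed
        }
        for (char c : decoded) {
            if (c == '\n') { ++pos.line; pos.column = 1; }
            else           { ++pos.column; }
        }
    }
    return pos;
}
```

For the input "ab\ncd\n\xE9" the bad byte is at line 3, column 1. With a
block size of 100 the toy reader reports line 1, column 1; with a block size
of 4 it reports line 2, column 2, the end of the first block. In the report
itself, column 40 appears consistent with the position just after the
39-character XML declaration, although that is an inference and not
something the report confirms.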

It would be helpful to report the real line number. 
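
Until the parser reports it, the real position can be found by scanning the
raw bytes directly. The helper below is a minimal sketch under stated
assumptions: locateInvalidUtf8 and ErrorLocation are hypothetical names (not
Xerces-C++ API), lines are split on '\n' only, columns count characters and
not bytes, and the check is structural (it does not reject overlong forms or
surrogates).

```cpp
#include <cstddef>
#include <string>

struct ErrorLocation {
    bool found = false;      // true if an invalid sequence was seen
    std::size_t line = 1;    // 1-based line of the first invalid byte
    std::size_t column = 1;  // 1-based column, counted in characters
};

// Scans `raw` as UTF-8 and returns the location of the first sequence that
// is structurally invalid (bad lead byte, bad continuation, or truncation).
ErrorLocation locateInvalidUtf8(const std::string& raw) {
    ErrorLocation loc;
    std::size_t i = 0;
    while (i < raw.size()) {
        const unsigned char lead = static_cast<unsigned char>(raw[i]);
        std::size_t len = 0;                       // 0 means invalid lead
        if (lead < 0x80) len = 1;
        else if ((lead & 0xE0) == 0xC0) len = 2;
        else if ((lead & 0xF0) == 0xE0) len = 3;
        else if ((lead & 0xF8) == 0xF0) len = 4;

        bool ok = (len != 0) && (i + len <= raw.size());
        for (std::size_t k = 1; ok && k < len; ++k) {
            // Every trailing byte must look like 10xxxxxx.
            ok = (static_cast<unsigned char>(raw[i + k]) & 0xC0) == 0x80;
        }
        if (!ok) {
            loc.found = true;
            return loc;
        }
        if (lead == '\n') { ++loc.line; loc.column = 1; }
        else              { ++loc.column; }
        i += len;
    }
    return loc;
}
```

For a three-line input whose last line is "<!-- d'" followed by the bytes
E9 74 E9, the helper reports line 3, column 8, and the line number follows
the offending line wherever it is moved.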





  was:
The XML file (test.xml) below contains two French letters (marked here as ','). 
File test.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<Project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<!-- d'�t� 2008 -->
</Project>

When I ran DOMPrint test.xml or SAXPrint test.xml, a fatal error occurred. Here is
the output:

 DOMCount test.xml

Fatal Error at file /home/bdai/xfndry/HEAD/env/xerces-c-3.0.0-x86_64-linux-gcc-3.4/bin/test.xml, line 1, char 40
  Message: invalid byte 't' at position 2 of a 3-byte sequence

Errors occurred, no output available

Regardless of where I move the guilty line, the reported line number (=1) and column
(=40) do not change. Debugging through the code, I see that XMLReader.cpp keeps track
of the line number and column number, but it calls XMLUTF8Transcoder.cpp, which
examines each byte. When the transcoder finds a byte that is not valid UTF-8, it
throws an exception, and XMLReader never updates its line and column numbers.

If the error occurs after 4000X bytes, the reported line number is updated to a new
value, but it then stays unchanged within the second 4K (hex) bytes, regardless of
where the error is.

It would be helpful to report the real line number. 





     Issue Type: Bug  (was: Improvement)

> The line number and column number issued by fatal error is bogus.
> -----------------------------------------------------------------
> 
> Key: XERCESC-1839
> URL: https://issues.apache.org/jira/browse/XERCESC-1839
> Project: Xerces-C++
> Issue Type: Bug
> Components: DOM, SAX/SAX2, Utilities
> Affects Versions: 2.7.0, 3.0.1
> Environment: I ran the sample program on 64-bit Linux, but the issue is not
>                 platform-, OS-, or compiler-dependent.
> Reporter: Bin Dai
> Priority: Minor
> Fix For: 2.7.0, 3.0.1
> 
> 
> The XML file (test.xml) below contains two French letters (marked here as ','). 
> File test.xml:
> <?xml version="1.0" encoding="UTF-8" ?>
> <Project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
> <!-- d'�t� 2008 -->
> </Project>
> When I ran DOMPrint test.xml or SAXPrint test.xml, a fatal error occurred. Here is
> the output:
> DOMCount test.xml
> Fatal Error at file /home/bdai/xfndry/HEAD/env/xerces-c-3.0.0-x86_64-linux-gcc-3.4/bin/test.xml, line 1, char 40
> Message: invalid byte 't' at position 2 of a 3-byte sequence
> Errors occurred, no output available
> Regardless of where I move the guilty line, the reported line number (=1) and
> column (=40) do not change. Debugging through the code, I see that XMLReader.cpp
> keeps track of the line number and column number, but it calls
> XMLUTF8Transcoder.cpp, which examines each byte. When the transcoder finds a byte
> that is not valid UTF-8, it throws an exception, and XMLReader never updates its
> line and column numbers. If the error occurs after the first 4K (hex) bytes, the
> reported line number is updated to a new value, but it then stays unchanged within
> the second 4K (hex) bytes, regardless of where the error is. It would be helpful
> to report the real line number. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org

