List: poi-dev
Subject: DO NOT REPLY [Bug 52784] SXSSFWorkbook, invalid xml characters, corrupted XLSX
From: bugzilla () apache ! org
Date: 2012-02-28 14:03:47
Message-ID: bug-52784-47293-7809I0l4m9 () https ! issues ! apache ! org/bugzilla/
https://issues.apache.org/bugzilla/show_bug.cgi?id=52784
Yegor Kozlov <yegor@dinom.ru> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |FIXED
--- Comment #2 from Yegor Kozlov <yegor@dinom.ru> 2012-02-28 14:03:47 UTC ---
Should be fixed in r1294657
Your diagnosis is correct: writing an ISO control character (code < 32) resulted
in a corrupted workbook.
I could easily reproduce it with the following simple code:
Workbook wb = new SXSSFWorkbook();
Sheet sh = wb.createSheet();
Cell cell = sh.createRow(0).createCell(0);
// NUL is an ISO control character (code < 32)
cell.setCellValue("\u0000");
XSSF delegates writing XML to XmlBeans, and that framework replaces characters
below 32 with question marks. I changed SXSSF to do the same.
It appears that there are two more special cases where you can't simply write a
char code in XML:
case 1: low and high Unicode surrogates: DC00-DFFF and D800-DBFF
case 2: 'not a character' range: FFFE-FFFF
XmlBeans replaces characters from these ranges with question marks, so I fixed
SXSSF to be consistent.
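The replacement rule described above can be sketched roughly as follows. This
is only an illustration, not the code committed to SXSSF: the class and method
names (XmlCharSanitizer, needsReplacement, sanitize) are hypothetical, and it
applies the "below 32" rule literally, so tab, line feed and carriage return are
replaced too; whether the actual fix treats those separately is not stated here.

```java
// Hypothetical helper, not part of POI: mirrors the three ranges named above.
final class XmlCharSanitizer {
    private XmlCharSanitizer() {}

    // True if the char falls in one of the ranges that get replaced.
    static boolean needsReplacement(char c) {
        return c < 32                        // ISO control characters
            || (c >= 0xD800 && c <= 0xDFFF)  // high (D800-DBFF) and low (DC00-DFFF) surrogates
            || c == 0xFFFE || c == 0xFFFF;   // "not a character" range
    }

    // Returns a copy of s with every such char replaced by a question mark.
    static String sanitize(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(needsReplacement(c) ? '?' : c);
        }
        return sb.toString();
    }
}
```

With this sketch, sanitize("\u0000") gives "?", and a string without any of the
listed characters comes back unchanged. Note that it works per char, so both
halves of a surrogate pair are each turned into a question mark.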
Yegor