List: jakarta-commons-dev
Subject: [jira] [Updated] (CSV-226) Add CSVParser test case for standard charsets
From: "Anson Schwabecher (JIRA)" <jira () apache ! org>
Date: 2018-05-31 1:28:00
Message-ID: JIRA.13162194.1527296750000.70050.1527730080041 () Atlassian ! JIRA
[ https://issues.apache.org/jira/browse/CSV-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anson Schwabecher updated CSV-226:
----------------------------------
Description:
Hello, I'd like to contribute a CSVParser test suite for standard charsets as defined \
in java.nio.charset.StandardCharsets + UTF-32.
This is a standalone test but is also in support of a fix for CSV-107. It also \
refactors and unifies the testing around your established workaround of inserting \
BOMInputStream ahead of the CSVParser.
It will take a single base UTF-8 encoded file (cstest.csv) and copy it to multiple \
output files (in target dir) with differing character sets, similar to the iconv \
tool. Each file will then be fed into the parser to test all the BOM/NOBOM Unicode \
variants. I think a file-based approach is still important here, rather than just \
encoding a character stream inline as a string: if issues develop, it's easy \
to inspect the data.
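As a rough illustration of the copy step described above (not the contributed test itself), the sketch below uses only the JDK. The class name, method names and file naming scheme are hypothetical. It covers only the explicit-endian Unicode charsets, and it strips the BOM by hand as a stand-in for placing BOMInputStream ahead of the CSVParser.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: writes one text in several charsets, with and without a BOM.
class CharsetVariantsSketch {

    // Explicit-endian Unicode charsets mapped to their byte order marks.
    static final Map<Charset, byte[]> BOMS = new LinkedHashMap<>();
    static {
        BOMS.put(StandardCharsets.UTF_8, new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF});
        BOMS.put(StandardCharsets.UTF_16BE, new byte[] {(byte) 0xFE, (byte) 0xFF});
        BOMS.put(StandardCharsets.UTF_16LE, new byte[] {(byte) 0xFF, (byte) 0xFE});
        BOMS.put(Charset.forName("UTF-32BE"), new byte[] {0, 0, (byte) 0xFE, (byte) 0xFF});
        BOMS.put(Charset.forName("UTF-32LE"), new byte[] {(byte) 0xFF, (byte) 0xFE, 0, 0});
    }

    // Encodes the text in the given charset and writes it into dir,
    // optionally prefixed with that charset's BOM. Returns the written file.
    static Path writeVariant(Path dir, String text, Charset cs, boolean withBom)
            throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (withBom) {
            out.write(BOMS.get(cs));
        }
        out.write(text.getBytes(cs));
        String name = "cstest_" + cs.name() + (withBom ? "_BOM" : "_NOBOM") + ".csv";
        return Files.write(dir.resolve(name), out.toByteArray());
    }

    // Reads a file back and decodes it, skipping a leading BOM if one is present.
    static String readVariant(Path file, Charset cs) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        byte[] bom = BOMS.get(cs);
        int offset = startsWith(bytes, bom) ? bom.length : 0;
        return new String(bytes, offset, bytes.length - offset, cs);
    }

    // True if data begins with the given prefix bytes.
    static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) throws IOException {
        String text = "id,name\n1,Zo\u00eb\n";
        Path dir = Files.createTempDirectory("cstest");
        for (Charset cs : BOMS.keySet()) {
            for (boolean withBom : new boolean[] {false, true}) {
                Path file = writeVariant(dir, text, cs, withBom);
                boolean ok = readVariant(file, cs).equals(text);
                System.out.println(file.getFileName() + " round trip ok: " + ok);
            }
        }
    }
}
```

Because every variant is left on disk, a failing case can be opened in a hex viewer and compared against the base file.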
I noticed in the project's pom.xml (rat config) that you are excluding individual \
test resource files by name rather than using a wildcard expression to exclude every \
file in the directory. Is there a reason for this? It's much better if devs do not \
have to maintain this configuration.
{code:language=xml|title=i.e.: switch over to a single exclude expression}
<exclude>src/test/resources/**/*</exclude>
{code}
was:
Hello, I'd like to contribute a CSVParser test suite for standard charsets as defined \
in java.nio.charset.StandardCharsets + UTF-32.
This is a standalone test but is also in support of a fix for CSV-107. It also \
refactors and unifies the testing around your established workaround of inserting \
BOMInputStream ahead of the CSVParser.
It will take a single base UTF-8 encoded file (cstest.csv) and copy it to multiple \
output files (in target dir) with differing character sets, similar to the iconv \
tool. Each file will then be fed into the parser to test all the BOM/NOBOM Unicode \
variants. I think a file-based approach is still important here, rather than just \
encoding a character stream inline as a string: if issues develop, it's easy \
to inspect the data.
I noticed in the project's pom.xml (rat config) that you are excluding individual \
test resource files by name rather than using a wildcard expression to exclude every \
file in the directory. Is there a reason for this? It's much better if devs do not \
have to maintain this configuration.
i.e.: switch over to a single exclude expression:
{{<exclude>src/test/resources/**/*</exclude>}}
> Add CSVParser test case for standard charsets
> ---------------------------------------------
>
> Key: CSV-226
> URL: https://issues.apache.org/jira/browse/CSV-226
> Project: Commons CSV
> Issue Type: Test
> Components: Parser
> Affects Versions: 1.5
> Reporter: Anson Schwabecher
> Priority: Minor
>
> Hello, I'd like to contribute a CSVParser test suite for standard charsets as \
> defined in java.nio.charset.StandardCharsets + UTF-32. This is a standalone test \
> but is also in support of a fix for CSV-107. It also refactors and unifies the \
> testing around your established workaround of inserting BOMInputStream ahead of the \
> CSVParser. It will take a single base UTF-8 encoded file (cstest.csv) and copy it \
> to multiple output files (in target dir) with differing character sets, similar to \
> the iconv tool. Each file will then be fed into the parser to test all the \
> BOM/NOBOM Unicode variants. I think a file-based approach is still important here, \
> rather than just encoding a character stream inline as a string: if issues \
> develop, it's easy to inspect the data. I noticed in the project's pom.xml (rat \
> config) that you are excluding individual test resource files by name rather than \
> using a wildcard expression to exclude every file in the directory. Is there a \
> reason for this? It's much better if devs do not have to maintain this \
> configuration.
> {code:language=xml|title=i.e.: switch over to a single exclude expression}
> <exclude>src/test/resources/**/*</exclude>
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)