
List:       tomcat-user
Subject:    RE:  Problem with file upload corruption.
From:       "Richard Mixon (qwest)" <rnmixon () qwest ! net>
Date:       2005-08-31 5:33:36
Message-ID: 20050831053534.1380A10FB2B0 () asf ! osuosl ! org

OK, my goof. In my frustration and hurry I did not read the RFC well
enough. After re-reading the RFC yet one more time, it finally became
clear. In case it helps anyone else, I'm posting what I learned here. 

Basically the browser is allowed/expected to set the encoding type.
Under section 3.3 of RFC 1867:

  3.3 use of multipart/form-data

     ... Each part should be labelled with an appropriate
     content-type if the media type is known (e.g., inferred from the
     file extension or operating system typing information) or as
     application/octet-stream. 
     ...

I kept expecting there to be some way to designate the file as either
binary (don't change a thing, just upload it) or text (handle CRLFs and
character set translation) - much like one does with an FTP transfer.

So I guess the best answer to my original dilemma is to write a utility
method/filter that inspects the uploaded HTML file for invalid
characters and notifies the user if any are found.
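
A minimal sketch of such a check, assuming "invalid" means any byte that
is not printable ASCII, tab, CR or LF. That definition, the class name
UploadCharCheck and the method name findSuspectOffsets are hypothetical
and not from the application discussed here; a real template may
legitimately contain other bytes depending on its declared charset.

```java
import java.util.ArrayList;
import java.util.List;

public class UploadCharCheck {

    /**
     * Returns the offsets of bytes that are not plain ASCII text.
     * Assumption: tab, LF, CR and 0x20..0x7E are the only acceptable bytes.
     */
    public static List<Integer> findSuspectOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            int b = data[i] & 0xFF; // treat the byte as unsigned
            boolean ok = b == '\t' || b == '\n' || b == '\r'
                    || (b >= 0x20 && b <= 0x7E);
            if (!ok) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // "<p>" followed by byte 0x92, then ordinary text
        byte[] sample = {'<', 'p', '>', (byte) 0x92, 'S', 'o', 'm', 'e'};
        System.out.println(findSuspectOffsets(sample)); // prints [3]
    }
}
```

The returned offsets could be used to tell the user where in the
uploaded file the suspect characters are.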

Hope this helps someone else - Richard

-----Original Message-----
From: Richard Mixon (qwest) [mailto:rnmixon@qwest.net]
Sent: Tuesday, August 30, 2005 7:26 PM
To: 'tomcat-user@jakarta.apache.org'
Subject: Problem with file upload corruption.

Sorry to kick this up. I know it’s a slightly obscure topic, and I'm
hoping it may have rolled by someone knowledgeable.

I just tried using the Jakarta Commons File Upload instead of the
O'Reilly MultiPartRequest. I get the same results.

No matter what kind of file I try uploading - it treats it as text: from
a Windows machine all occurrences of 0x0D0A are converted to 0x0A. So
contrary to the RFC saying it is a binary file upload, it appears to be
doing a text upload - or I am really missing out on something.
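
For comparison, a byte-oriented copy leaves 0x0D0A pairs untouched. The
sketch below assumes the uploaded content is available as an
InputStream; the class name BinaryCopy is hypothetical. It only
illustrates what a binary-safe copy looks like and does not show where
the conversion described above actually happens.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class BinaryCopy {

    /** Copies all bytes from in to out unchanged; returns the byte count. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        // No Reader/Writer is involved, so no charset or newline translation occurs.
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {'a', 0x0D, 0x0A, 'b'};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(data), out);
        System.out.println(Arrays.equals(data, out.toByteArray())); // prints true
    }
}
```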

Thank you - Richard

-----Original Message-----
From: rnmixon@qwest.net [mailto:rnmixon@qwest.net]
Sent: Monday, August 29, 2005 7:03 PM
To: tomcat-user@jakarta.apache.org
Subject: Problem with file upload corruption.

We have a JSP/servlet combo that uses the O'Reilly MultiPartRequest to
upload a user's HTML template for our application. Invariably they end up
with some unusual characters in their template - sometimes from pasting
in text from MS Word or other similar application.

For some reason a single character (e.g. 0x92, a "right single quotation
mark") is turned into multiple special characters after it is uploaded.
When we download it we use
  response.setContentType("application/octet-stream");
and the mangled file downloads "fine" (i.e. without change).

Here is an example - the right single quotation mark (a backward single
quote) comes right after the paragraph tag (<p>).

BEFORE upload
<html>
...
<p>’Some text.
</html>

AFTER upload
<html>
...
<p>�Some text.
</html>
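
One way a single byte can become several is a decode/re-encode round
trip with mismatched charsets. The sketch below is an illustration
only: the thread does not establish that this is what happens here, and
the choice of ISO-8859-1 and UTF-8 as the two charsets, along with the
class name ByteExpansionDemo, is an assumption.

```java
import java.nio.charset.StandardCharsets;

public class ByteExpansionDemo {

    /** Decodes the bytes as ISO-8859-1, then re-encodes the text as UTF-8. */
    public static byte[] reencode(byte[] original) {
        String decoded = new String(original, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] before = {(byte) 0x92};
        byte[] after = reencode(before);
        // 0x92 decodes to U+0092, which UTF-8 encodes as the two bytes C2 92
        System.out.println(before.length + " byte -> " + after.length + " bytes"); // prints 1 byte -> 2 bytes
    }
}
```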

I have read the file upload RFC 1867 until I'm blue in the face, and
Googled on and off the servlet.com site. There were reported binary
upload problems using the warp connector to connect Tomcat 4.0 and
Apache. But we are using Tomcat 4.1.18.

Any ideas or suggestions are appreciated.

Thank you - Richard

