
List:       lucene-user
Subject:    RE: FileDocument.java  -- Out of scope for Lucene Users list --
From:       "Philippe Laflamme" <plaflamme () konova ! com>
Date:       2003-11-28 22:33:06

A few pointers that might help you out, but this is totally off topic for the
Lucene Users list. This is a Java-related problem; if you're new to Java,
please look to other mailing lists for some help...

...
1 Reader reader = new BufferedReader(new InputStreamReader(is));
2 char [] buf = new char[512];
3 reader.read(buf);
4
5 String a = new String(buf, 0, 510);
...

Line 1: if you know the character set of the file you are reading, you
should provide it to the InputStreamReader constructor; otherwise, it will
use the VM's default character set. This might not be what you want if
you're reading foreign language documents.
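
A minimal sketch of the point about line 1, assuming the file is known to be
UTF-8 (an assumption; the encoding of the files in question is not stated).
The class and method names are made up for illustration. It uses the
two-argument InputStreamReader constructor, which takes a charset name:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper, not part of Lucene or of the original code.
class ExplicitCharsetSketch {

    // Wraps the stream in a reader that decodes with the named charset
    // instead of the VM's default one.
    static Reader openReader(InputStream is, String charsetName) throws IOException {
        return new BufferedReader(new InputStreamReader(is, charsetName));
    }

    public static void main(String[] args) throws IOException {
        // The two bytes C3 A9 are the UTF-8 encoding of U+00E9.
        byte[] bytes = { (byte) 0xC3, (byte) 0xA9 };
        Reader reader = openReader(new ByteArrayInputStream(bytes), "UTF-8");
        int c = reader.read();
        // With the charset given explicitly, the two bytes decode to one character.
        System.out.println(c == 0x00E9);
        reader.close();
    }
}
```

If the charset name is not supported by the VM, the constructor throws an
UnsupportedEncodingException, which is a kind of IOException.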

Lines 3 and 5: read() returns the number of characters it actually read (or
-1 at the end of the stream). You should keep that value from line 3 and
provide it on line 5 when constructing your String. Otherwise, you'll be
using whatever garbage is present in your character buffer to construct
your String object.

For example, if the read() method actually read 10 characters, your string
will contain 500 bogus characters that might look like what you're seeing
now...
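
A rough sketch of the fix for lines 3 and 5, under a few assumptions: the
class name, the method name and the maxChars parameter are invented for
illustration, and the loop that keeps reading until the buffer is full is an
addition of this sketch (a single read() call may return fewer characters
than requested). The String is built only from the characters actually read:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper, not part of Lucene or of the original code.
class SummarySketch {

    // Reads up to maxChars characters and returns exactly what was read.
    static String readSummary(Reader reader, int maxChars) throws IOException {
        char[] buf = new char[maxChars];
        int total = 0;
        while (total < maxChars) {
            // read() returns the count of characters read, or -1 at end of stream.
            int n = reader.read(buf, total, maxChars - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        // Use the real count, not a fixed length such as 510.
        return new String(buf, 0, total);
    }

    public static void main(String[] args) throws IOException {
        // Only 10 characters are available, so the summary has length 10.
        String a = readSummary(new StringReader("0123456789"), 510);
        System.out.println(a.length());
    }
}
```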

Like I said, this is totally out of scope for the Lucene Users list. I think
you should look somewhere else for further assistance with this problem.

Regards,
Phil

> -----Original Message-----
> From: Tun Lin [mailto:chentun@singnet.com.sg]
> Sent: November 28, 2003 10:59
> To: Lucene user list
> Subject: FileDocument.java
>
>
> Hi Lucene experts,
> Can you help on this?
> I have included the following code in FileDocument to print out
> the summary but
> I have funny output like:
> The result after searching, the summary is displayed as below:
> ÐÏࡱá>þÿ
> UWþÿÿÿTÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
> FileInputStream is = new FileInputStream(f);
>     try
> 	{
>     Reader reader = new BufferedReader(new InputStreamReader(is));
>     char [] buf = new char[512];
> 	reader.read(buf);
>
> 	String a = new String(buf, 0, 510);
>     doc.add(Field.Text("contents", reader));
>     doc.add(Field.UnIndexed("summary", a ) );// return the document
>     }catch (IOException e)
> 	{
> 		e.printStackTrace();
> 	}
>
>
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
