List: nutch-developers
Subject: [Nutch-dev] [jira] Commented: (NUTCH-530) Add a combiner to improve
From: "Emmanuel Joke (JIRA)" <jira () apache ! org>
Date: 2007-07-31 11:13:52
Message-ID: 8345559.1185880432976.JavaMail.jira () brutus
[ https://issues.apache.org/jira/browse/NUTCH-530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516675 ]
Emmanuel Joke commented on NUTCH-530:
-------------------------------------
Actually, I don't reuse CrawlDbReducer; I've defined a new class as the Combiner.
This class only aggregates the scores of all CrawlDatum entries with the status
"Linked" into one CrawlDatum, which is just a part of what CrawlDbReducer does.
I've run a few tests in different cases, and it has no impact on the current score.
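
To illustrate the idea, here is a minimal sketch of what such a combine step could look like. It is not the code from NUTCH-530.patch and it deliberately avoids the Hadoop and Nutch APIs: Datum, STATUS_LINKED, STATUS_FETCHED and combine() are hypothetical stand-ins, and summing the scores is an assumption about what "aggregates" means here.

```java
import java.util.ArrayList;
import java.util.List;

public class LinkedScoreCombiner {
    // Hypothetical status codes; the real values live in Nutch's CrawlDatum.
    public static final int STATUS_LINKED = 1;
    public static final int STATUS_FETCHED = 2;

    /** Hypothetical stand-in for CrawlDatum, reduced to status and score. */
    public static final class Datum {
        public final int status;
        public final float score;

        public Datum(int status, float score) {
            this.status = status;
            this.score = score;
        }
    }

    /**
     * Combines the map output values for a single URL key.
     * All "linked" entries are merged into one entry whose score is the
     * sum of their scores (assumed aggregation); every other entry is
     * passed through unchanged so the reducer still sees it.
     */
    public static List<Datum> combine(List<Datum> values) {
        List<Datum> out = new ArrayList<Datum>();
        float linkedScore = 0.0f;
        boolean sawLinked = false;
        for (Datum d : values) {
            if (d.status == STATUS_LINKED) {
                linkedScore += d.score;
                sawLinked = true;
            } else {
                out.add(d);
            }
        }
        if (sawLinked) {
            out.add(new Datum(STATUS_LINKED, linkedScore));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Datum> values = new ArrayList<Datum>();
        values.add(new Datum(STATUS_LINKED, 0.25f));
        values.add(new Datum(STATUS_FETCHED, 1.0f));
        values.add(new Datum(STATUS_LINKED, 0.5f));
        for (Datum d : combine(values)) {
            System.out.println("status=" + d.status + " score=" + d.score);
        }
    }
}
```

Because only the "linked" entries are merged and nothing else is dropped or altered, a reducer that adds up linked scores would see the same total with or without this step, which matches the observation that the final score is not affected.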
> Add a combiner to improve performance on updatedb
> -------------------------------------------------
>
> Key: NUTCH-530
> URL: https://issues.apache.org/jira/browse/NUTCH-530
> Project: Nutch
> Issue Type: Improvement
> Environment: java 1.6
> Reporter: Emmanuel Joke
> Assignee: Emmanuel Joke
> Fix For: 1.0.0
>
> Attachments: NUTCH-530.patch
>
>
> We have a lot of similar links with status "linked" generated at the output of the
> map task when we update the crawldb from a fetched segment. We can use
> a combiner to improve performance.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers