
List:       jakarta-commons-dev
Subject:    [jira] [Comment Edited] (MATH-1153) Sampling from a 'BetaDistribution' is slow
From:       "Thomas Neidhart (JIRA)" <jira () apache ! org>
Date:       2015-04-30 20:53:06
Message-ID: JIRA.12744980.1412092174000.39592.1430427186642 () Atlassian ! JIRA


    [ https://issues.apache.org/jira/browse/MATH-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520038#comment-14520038 ]

Thomas Neidhart edited comment on MATH-1153 at 4/30/15 8:52 PM:
----------------------------------------------------------------

After fixing the KS inference tests, the corresponding test failures
disappeared, as expected.

The remaining test failure in testNextInversionDeviate occurs because the Cheng
sampler uses a form of rejection sampling and therefore consumes more randomness
from the provided RandomGenerator than the inverse transform method does.
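
To illustrate the difference in consumed randomness, here is a minimal,
self-contained sketch. It is not the attached ChengBetaSampler.java; it is my
own transcription of algorithm BB from Cheng (1978), valid only for
min(a, b) > 1. The names CountingRandom, RandomnessDemo, sampleBetaChengBB and
sampleExponentialByInversion are hypothetical, and java.util.Random stands in
for RandomGenerator. The exponential distribution is used for the inversion
case only because its inverse CDF has a closed form.

```java
import java.util.Random;

/** A java.util.Random that counts how many doubles were requested (hypothetical helper). */
class CountingRandom extends Random {
    private long calls;

    CountingRandom(long seed) {
        super(seed);
    }

    @Override
    public double nextDouble() {
        calls++;
        return super.nextDouble();
    }

    long calls() {
        return calls;
    }
}

class RandomnessDemo {

    /** Inverse transform sampling: exactly one uniform deviate per sample. */
    static double sampleExponentialByInversion(Random rng, double rate) {
        double u = rng.nextDouble();          // u in [0, 1), so 1 - u is in (0, 1]
        return -Math.log(1.0 - u) / rate;
    }

    /** Sketch of Cheng's algorithm BB; assumes min(a0, b0) > 1. */
    static double sampleBetaChengBB(Random rng, double a0, double b0) {
        if (Math.min(a0, b0) <= 1.0) {
            throw new IllegalArgumentException("this sketch requires min(a, b) > 1");
        }
        double a = Math.min(a0, b0);
        double b = Math.max(a0, b0);
        double alpha = a + b;
        double beta = Math.sqrt((alpha - 2.0) / (2.0 * a * b - alpha));
        double gamma = a + 1.0 / beta;

        double w;
        while (true) {
            // Every pass through the loop consumes two uniform deviates.
            double u1 = rng.nextDouble();
            double u2 = rng.nextDouble();
            double v = beta * Math.log(u1 / (1.0 - u1));
            w = a * Math.exp(v);
            double z = u1 * u1 * u2;
            double r = gamma * v - 1.3862944;            // constant is log(4)
            double s = a + r - w;
            if (s + 2.609438 >= 5.0 * z) {               // constant is 1 + log(5)
                break;                                   // cheap acceptance test
            }
            double t = Math.log(z);
            if (s >= t) {
                break;                                   // second acceptance test
            }
            if (r + alpha * Math.log(alpha / (b + w)) >= t) {
                break;                                   // exact acceptance test
            }
            // Otherwise the candidate is rejected and two more deviates are drawn.
        }
        w = Math.min(w, Double.MAX_VALUE);               // guard against overflow to infinity
        return a == a0 ? w / (b + w) : b / (b + w);
    }

    public static void main(String[] args) {
        int n = 100_000;
        CountingRandom inversionRng = new CountingRandom(42L);
        CountingRandom chengRng = new CountingRandom(42L);
        for (int i = 0; i < n; i++) {
            sampleExponentialByInversion(inversionRng, 1.0);
            sampleBetaChengBB(chengRng, 2.0, 3.0);
        }
        System.out.println("inversion: " + inversionRng.calls() + " deviates for " + n + " samples");
        System.out.println("Cheng BB:  " + chengRng.calls() + " deviates for " + n + " samples");
    }
}
```

The inversion sampler always draws one deviate per sample, whereas the
rejection loop draws two per attempt and may need several attempts. A test that
expects the sample to equal the inverse CDF applied to the generator's next
value can therefore no longer pass once the sampler is replaced.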

This is a recurring issue, since improved sampling methods that consume more
randomness also exist for other distributions (see MATH-1220 for the Zipf
distribution).

This also relates to MATH-1158, which proposes a different way to create a
sampler for a distribution. That approach would probably also make it possible
to provide different samplers behind a common interface: for example, the
default one could use the inverse transform method, while more optimized ones
could be offered that make different assumptions, e.g. about the
RandomGenerator.
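
A rough sketch of what such a common interface might look like follows. All
names here (ContinuousSampler, InverseTransformSampler, MaxOfUniformsSampler)
are hypothetical and are not taken from MATH-1158 or from Commons Math, and
java.util.Random again stands in for RandomGenerator. The alternative sampler
relies on the fact that the maximum of n independent uniform deviates follows
Beta(n, 1). It is chosen because it is easy to verify and consumes a different
amount of randomness, not because it is faster.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical common interface shared by all samplers. */
interface ContinuousSampler {
    double sample();
}

/** Default strategy: one uniform deviate per sample, mapped through the inverse CDF. */
final class InverseTransformSampler implements ContinuousSampler {
    private final Random rng;
    private final DoubleUnaryOperator inverseCdf;

    InverseTransformSampler(Random rng, DoubleUnaryOperator inverseCdf) {
        this.rng = rng;
        this.inverseCdf = inverseCdf;
    }

    @Override
    public double sample() {
        return inverseCdf.applyAsDouble(rng.nextDouble());
    }
}

/** Alternative strategy for Beta(n, 1): the maximum of n uniforms, n deviates per sample. */
final class MaxOfUniformsSampler implements ContinuousSampler {
    private final Random rng;
    private final int n;

    MaxOfUniformsSampler(Random rng, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        this.rng = rng;
        this.n = n;
    }

    @Override
    public double sample() {
        double max = 0.0;
        for (int i = 0; i < n; i++) {
            max = Math.max(max, rng.nextDouble());
        }
        return max;
    }
}

class SamplerDemo {
    /** Client code depends only on the interface, not on the sampling strategy. */
    static double mean(ContinuousSampler sampler, int count) {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += sampler.sample();
        }
        return sum / count;
    }

    public static void main(String[] args) {
        // Beta(3, 1) has CDF x^3 on [0, 1], so its inverse CDF is u^(1/3).
        ContinuousSampler byInversion =
                new InverseTransformSampler(new Random(7L), u -> Math.pow(u, 1.0 / 3.0));
        ContinuousSampler byMaximum = new MaxOfUniformsSampler(new Random(7L), 3);
        System.out.println("inversion mean: " + mean(byInversion, 200_000));
        System.out.println("maximum mean:   " + mean(byMaximum, 200_000));
    }
}
```

Both samplers target the same distribution, but given the same seed they
produce different sequences. Only the inversion-based one could satisfy a test
such as testNextInversionDeviate, which is why such a guarantee would have to
belong to a specific sampler rather than to the distribution.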


was (Author: tn):
After fixing the KS inference tests the respective test failures disappeared as \
expected.

The remaining test failure in testNextInversionDeviate is because the Cheng sampler \
uses a kind of rejection sampling method and will consume more randomness from the \
provided RandomGenerator.

This is a recurring issue, as also for other distributions there are improved \
sampling methods that consume more randomness (see MATH-1220 for the Zipf \
distribution).

This also relates to MATH-1153 as it proposes a different way to create a sampler for \
a distribution. This would probably also allow to provide different samplers using a \
common interface, e.g. the default one uses the inverse transform method while more \
optimized ones could be available which require different assumptions, e.g. wrt the \
RandomGenerator.

> Sampling from a 'BetaDistribution' is slow
> ------------------------------------------
> 
> Key: MATH-1153
> URL: https://issues.apache.org/jira/browse/MATH-1153
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Sergei Lebedev
> Priority: Minor
> Fix For: 4.0
> 
> Attachments: ChengBetaSampler.java, ChengBetaSampler.java, ChengBetaSamplerTest.java
> 
> Currently, `BetaDistribution#sample` uses the inverse CDF method, which is quite
> slow for sampling-intensive computations. I've implemented a method from the
> R. C. H. Cheng paper and it seems to work much better. Here's a simple
> microbenchmark:
> {code}
> o.j.b.s.SamplingBenchmark.algorithmBCorBB  1e-3  1000  thrpt  5  2592200.015  14391.520  ops/s
> o.j.b.s.SamplingBenchmark.algorithmBCorBB  1000  1000  thrpt  5  3210800.292  33330.791  ops/s
> o.j.b.s.SamplingBenchmark.commonsVersion   1e-3  1000  thrpt  5    31034.225    438.273  ops/s
> o.j.b.s.SamplingBenchmark.commonsVersion   1000  1000  thrpt  5    21834.010    433.324  ops/s
> {code}
> Should I submit a patch?
>
> R. C. H. Cheng (1978). Generating beta variates with nonintegral shape
> parameters. Communications of the ACM, 21, 317–322.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)