'Re: [Dev] Algo to use for Logistic regression'

List:       esb-java-dev
Subject:    Re: [Dev] Algo to use for Logistic regression
From:       Maheshakya Wijewardena <maheshakya () wso2 ! com>
Date:       2015-05-31 18:51:30
Message-ID: CAJqB=jcCh=i619gypHwWGEhSFFesRZSGdHbkM0DiqWUAM6r2dA () mail ! gmail ! com

I agree with Upul about adding both if possible. Mini-batch raises the
question of choosing the right batch size, but finding a good batch size
may greatly improve our results as well as the time to convergence. Still,
this can depend heavily on the dataset.
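
To make the batch-size question concrete, here is a minimal sketch of
mini-batch gradient descent for binary logistic regression in plain Python.
It is only an illustration and not the Spark MLlib or WSO2 ML implementation:
the function names, the step size, the epoch count and the synthetic data
are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_minibatch(X, y, batch_size, epochs=50, step=0.5, seed=0):
    """Mini-batch gradient descent for binary logistic regression.

    batch_size == len(X) gives full-batch gradient descent;
    batch_size == 1 gives plain stochastic gradient descent.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    idx = list(range(n))
    for _ in range(epochs):
        rng.shuffle(idx)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            gw = [0.0] * d
            gb = 0.0
            for i in batch:
                # Prediction error for one example: p(x) - y.
                z = sum(wj * xj for wj, xj in zip(w, X[i])) + b
                err = sigmoid(z) - y[i]
                for j in range(d):
                    gw[j] += err * X[i][j]
                gb += err
            # Step along the negative of the batch-averaged gradient.
            m = len(batch)
            w = [wj - step * gj / m for wj, gj in zip(w, gw)]
            b -= step * gb / m
    return w, b


def log_loss(X, y, w, b):
    # Mean negative log-likelihood, clipped to avoid log(0).
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        p = min(max(p, eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def make_data(n=200, seed=1):
    # Hypothetical synthetic data: the label is 1 when x0 + x1 > 0.
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if x[0] + x[1] > 0 else 0 for x in X]
    return X, y


if __name__ == "__main__":
    X, y = make_data()
    for bs in (1, 16, len(X)):
        w, b = train_minibatch(X, y, batch_size=bs)
        print("batch size", bs, "log loss", round(log_loss(X, y, w, b), 4))
```

With the number of epochs held fixed, a smaller batch size means more weight
updates per pass over the data, which is one reason the best value tends to
vary from dataset to dataset.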

Have you tried different datasets? Different in terms of size as well
as other statistical properties of the features (such as standard deviation,
skewness, etc.)?
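
As a rough way to characterise a dataset along those lines, the following
sketch computes the mean, population standard deviation and skewness of each
feature column in plain Python. The helper names are hypothetical and the
skewness used is the simple moment-based estimate (third central moment
divided by the cube of the standard deviation), which is an assumption about
which definition is wanted.

```python
import math


def feature_stats(column):
    """Return (mean, population standard deviation, skewness) of one feature."""
    n = len(column)
    mean = sum(column) / n
    m2 = sum((v - mean) ** 2 for v in column) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in column) / n  # third central moment
    std = math.sqrt(m2)
    # A constant column has no spread, so report zero skewness for it.
    skew = m3 / std ** 3 if std > 0 else 0.0
    return mean, std, skew


def describe(X):
    # X is a list of rows; zip(*X) regroups the values by column.
    return [feature_stats(list(col)) for col in zip(*X)]


if __name__ == "__main__":
    rows = [[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 10.0]]
    for j, (mean, std, skew) in enumerate(describe(rows)):
        print("feature", j, "mean", mean, "std", std, "skewness", skew)
```

Running both optimisers on datasets whose features differ noticeably in these
statistics would show how much the comparison depends on the data.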

On Sun, May 31, 2015 at 10:28 PM, Nirmal Fernando <nirmal@wso2.com> wrote:

> Yes, but from the simple test I did, L-BFGS felt faster. I will
> confirm anyway.
>
> On Sun, May 31, 2015 at 10:13 PM, Upul Bandara <upul@wso2.com> wrote:
>
>> Actually, I'm thinking in terms of training time. Even for large data
>> sets, the prediction accuracy of L-BFGS will outperform SGD, but its
>> training time would be considerably longer than that of SGD.
>> On the other hand, an SGD model gives decent prediction accuracy in a
>> relatively short training time.
>>
>>
>> On Sun, May 31, 2015 at 9:52 PM, Nirmal Fernando <nirmal@wso2.com> wrote:
>>
>>> Thanks Upul. So, are you thinking along the lines of performance? Sure,
>>> I'll run a test.
>>>
>>> On Sun, May 31, 2015 at 9:50 PM, Upul Bandara <upul@wso2.com> wrote:
>>>
>>>> If it is possible, I would like to have both.
>>>>
>>>> L-BFGS converges in fewer iterations than SGD, but it goes through the
>>>> entire data set before moving from one iteration to the next,
>>>> whereas SGD uses a mini-batch of the training data set for
>>>> calculating its gradient and updating the weights.
>>>> Hence, for large data sets SGD is more practical than L-BFGS.
>>>>
>>>> I think we can test this scenario by running these two algorithms
>>>> against a large data set (~1 GB).
>>>>
>>>> Thanks,
>>>> Upul
>>>>
>>>> On Sun, May 31, 2015 at 8:02 PM, Nirmal Fernando <nirmal@wso2.com>
>>>> wrote:
>>>>
>>>>> Another benefit of switching is that this API supports multi-class
>>>>> classification too. I've tested it with the Iris dataset.
>>>>>
>>>>> On Sun, May 31, 2015 at 7:33 PM, Nirmal Fernando <nirmal@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Currently in ML, we use the mini-batch gradient descent algorithm when
>>>>>> running logistic regression. But Spark MLlib recommends L-BFGS over
>>>>>> mini-batch gradient descent for faster convergence [1].
>>>>>>
>>>>>> I tested both implementations with the same dataset and got
>>>>>> better accuracy with L-BFGS (80% vs 67% for SGD).
>>>>>>
>>>>>> Shall we switch?
>>>>>>
>>>>>> [1]
>>>>>> https://spark.apache.org/docs/latest/mllib-linear-methods.html#logistic-regression
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Thanks & regards,
>>>>>> Nirmal
>>>>>>
>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>> Mobile: +94715779733
>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>>> --
>>>> Upul Bandara,
>>>> Associate Technical Lead, WSO2, Inc.,
>>>> Mob: +94 715 468 345.
>>>>

-- 
Pruthuvi Maheshakya Wijewardena
Software Engineer
WSO2 Lanka (Pvt) Ltd
Email: maheshakya@wso2.com
Mobile: +94711228855

_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
