'[Wekalist] Re: SMO classifier MacOS Intel vs. ARM architecture'


List:       wekalist
Subject:    [Wekalist] Re: SMO classifier MacOS Intel vs. ARM architecture
From:       Eibe Frank <eibe.frank () waikato ! ac ! nz>
Date:       2023-02-12 5:46:54
Message-ID: CADehzLVwW8-WT4wgCfxxUG6KEz_ZwOms1Ko2AWHwFEi2gt6AtA () mail ! gmail ! com



SMO in WEKA is a pure Java implementation, and the Java bytecode for WEKA
3.8.6 on both platforms is identical (in fact, for any given release of
WEKA, the same weka.jar file is used on all platforms). Thus, differences
are due to the Java virtual machine or the actual hardware. (Note that the
two Zulu OpenJDK 17 virtual machines on the two platforms used for WEKA
3.8.6 have the same release numbers, but the machine code they emit will
obviously be different.)
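
One place where a JVM is allowed to differ between platforms can be sketched as follows. This is an illustration only, not a diagnosis of SMO: the Java specification lets Math.exp be off by up to 1 ulp and lets it be backed by platform-specific machine code, whereas StrictMath.exp must return the same bits everywhere. The class name ExpUlpCheck and the helper ulpGap are hypothetical, and whether this is what actually causes the SMO difference is an assumption that has not been checked here.

```java
public class ExpUlpCheck {

    /** Gap between Math.exp(x) and StrictMath.exp(x), measured in ulps of the strict result. */
    static double ulpGap(double x) {
        double fast = Math.exp(x);         // may be a platform-specific intrinsic
        double strict = StrictMath.exp(x); // fdlibm algorithm, identical on all platforms
        return Math.abs(fast - strict) / Math.ulp(strict);
    }

    public static void main(String[] args) {
        double gamma = 0.01; // RBF gamma from the experiment in this thread
        double maxGap = 0.0;
        // Squared distances 0.00, 0.01, ..., 2.00 (the range for two attributes in [0, 1])
        for (int i = 0; i <= 200; i++) {
            double squaredDistance = i / 100.0;
            maxGap = Math.max(maxGap, ulpGap(-gamma * squaredDistance));
        }
        System.out.println("largest Math.exp vs StrictMath.exp gap in ulps: " + maxGap);
    }
}
```

Running this on the Intel and the ARM machine and comparing the printed gap would show whether the two JVMs agree on exp for the arguments an RBF kernel with gamma = 0.01 is likely to see.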

I have reproduced a version of your XOR data (which is attached) and also
get a fairly large difference on this data when running a 10-fold CV in the
Explorer, with SMO using an RBFKernel and C=500 (other settings left at
their default values). I get 67.5% accuracy on a Mac Mini with an M1 ARM
chip and 66% accuracy on a Windows machine with an Intel-based chip.
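
To give a feel for why this particular configuration may be sensitive to tiny numerical differences, here is a minimal sketch of the RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2), evaluated on the corners of the unit square. It assumes two attributes normalized to [0, 1]; the class name RbfKernelRange and the method rbf are hypothetical and are not WEKA's RBFKernel code.

```java
public class RbfKernelRange {

    /** RBF kernel value exp(-gamma * squared Euclidean distance between a and b). */
    static double rbf(double[] a, double[] b, double gamma) {
        double squaredDistance = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            squaredDistance += diff * diff;
        }
        return Math.exp(-gamma * squaredDistance);
    }

    public static void main(String[] args) {
        // Opposite corners of the unit square: the largest possible distance
        // between two points when both attributes lie in [0, 1].
        double[] corner00 = {0.0, 0.0};
        double[] corner11 = {1.0, 1.0};

        // The two gamma values used for the RBF kernel in the original experiment.
        for (double gamma : new double[] {0.01, 10.0}) {
            System.out.println("gamma=" + gamma
                + "  K(same point)=" + rbf(corner00, corner00, gamma)
                + "  K(opposite corners)=" + rbf(corner00, corner11, gamma));
        }
    }
}
```

With gamma = 0.01 every kernel value on such data lies between roughly 0.98 and 1, so all pairs of points look almost equally similar, whereas gamma = 10 spreads the values over many orders of magnitude. A nearly constant kernel plausibly leaves the optimizer separating the classes on very small differences, which is where last-bit discrepancies between platforms could start to matter.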

It turns out that both results are actually slightly lower than they should
be. If you reduce the value of the epsilon parameter in SMO to 10^-14, both
platforms will give an accuracy of 68% in the same experiment.

I could not reproduce the problem in the BoundaryVisualizer. Both platforms
produce very similar (possibly slightly different) plots.

Cheers,
Eibe

PS:

Using LibSVM via the LibSVM package in WEKA, I get 67.5% accuracy on both
platforms, using the configuration

weka.classifiers.functions.LibSVM -S 0 -K 2 -D 3 -G 0.01 -R 0.0 -N 0.5 -M
40.0 -C 500.0 -E 0.001 -P 0.1 -H -Z -model / -seed 1

Using the standard SVM in R from WEKA, through MLRClassifier from the
RPlugin package, trying to make the settings as consistent as possible, I
get 66% accuracy on both platforms using the configuration

weka.classifiers.meta.FilteredClassifier -F
"weka.filters.unsupervised.attribute.Normalize" -S 1 -W
weka.classifiers.mlr.MLRClassifier -- -learner classif.svm -params
gamma=0.01,cost=500,shrinking=FALSE,scale=FALSE




On Sat, 11 Feb 2023 at 09:17, <mjolly@pingry.org> wrote:

> Hello all,
>
> I had my students run an experiment using the SMO classifier on an XOR
> dataset and we are getting strange results.
> Two of my students and I have Windows computers, 2 students have Macs with
> Intel processors, and 2 students have Macs with ARM processors.
> The results on Windows and Mac Intel are identical. The results on Mac ARM
> are different. We ran an experiment with the following classifiers:
> C = 10, Polynomial kernel, exponent = 6
> C = 100, Polynomial kernel, exponent  = 4
> C = 0.1, Polynomial kernel, exponent  = 10
> C = 500, RBF kernel, gamma = 0.01
> C = 50, RBF kernel, gamma = 10
> Then we ran a corrected t-test on the percent_correct measure.
> On Windows and Mac Intel, we get 81.65, 89.75, 64.00, 62.25, 95.85 for the
> 5 classifiers.
> On Mac ARM, we get 81.60, 89.75, 64.05, 63.25, 95.80 for the 5 classifiers.
> I understand that the difference might be due to the different processors.
>
> However, the visualization of the boundaries for the worst classifier is
> very different. I asked the students to use the boundary visualizer for the
> worst classifier.
> I don't know how to add pictures to this post, so I am sharing 2
> screenshots on my google drive.
> Mac Intel screenshot:
> https://drive.google.com/file/d/1_cKuLj3oD4eD_k7fF2ZyxDkBRk66LSqE/view?usp=sharing
> Mac ARM screenshot:
> https://drive.google.com/file/d/1zKxPoUMAVGmVSaxMEqVuwY6Jv_0-k7wT/view?usp=sharing
> As you can see, the Mac ARM screenshot is completely wrong. It shows a
> classification that is much better than 63.25%.
>
> The students experimented a little and compared results on different
> computers, and it seems that it has to do with the low value (around 0.01)
> of gamma for the RBF kernel. Can somebody look into the code and see what
> is going on?
> Thanks in advance!
>
> Marie-Pierre
> _______________________________________________
> Wekalist mailing list -- wekalist@list.waikato.ac.nz
> Send posts to wekalist@list.waikato.ac.nz
> To unsubscribe send an email to wekalist-leave@list.waikato.ac.nz
> To subscribe, unsubscribe, etc., visit
> https://list.waikato.ac.nz/postorius/lists/wekalist.list.waikato.ac.nz
> List etiquette:
> http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
>



["XOR_data.arff" (application/octet-stream)]



