'Re: [Wekalist] Descriptors from Model'


List:       wekalist
Subject:    Re: [Wekalist] Descriptors from Model
From:       Harri Saarikoski <harri.saarikoski () gmail ! com>
Date:       2010-01-28 13:06:36
Message-ID: 15b638f21001280506y1bf3fe9ajb0b9ae99a766497d () mail ! gmail ! com

2010/1/28 Fabien Tillier <f.tillier@cerep.fr>

>     Hi Weka List.
>
> I have not been able to find a way to get the descriptors used in a
> specific model.
>
>
> Not all classifiers (e.g. 1r, i.e. OneR) output their model, including
> the descriptors it uses, in the "Classify" tab
> (and for those that do, you need to check the "Output model" box under
> "More options" in the Explorer).
>
> To get e.g. 1r's model, try either of the following in the "Select
> attributes" tab:
> (1) Attribute evaluator: weka.attributeSelection.FilteredAttributeEval -W
> "weka.attributeSelection.OneRAttributeEval -S 1 -F 10 -B 6" -F
> "weka.filters.supervised.instance.SpreadSubsample -M 0.0 -X 0.0 -S 1"
> Search method: weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N
> -1
> (2) Attribute evaluator: weka.attributeSelection.ClassifierSubsetEval -B
> weka.classifiers.rules.OneR -T -H "Click to set hold out or test instances"
> -- -B 6
> Search method: weka.attributeSelection.GreedyStepwise -T
> -1.7976931348623157E308 -N -1
> (just copy these lines to those fields)
>
> (1) outputs the following type of ranking of attributes as evaluated by 1r:
> Ranked attributes:
> 47.917  1 xx
> 41.667  3 yy
> 35.417  2 zz
> -> 1r then selects the top one of the list ("xx")
>
> (2) outputs:
> ...    Merit of best subset found:    0.479
> ...Selected attributes: 1 : 1
> -> selects "xx"
>
> (There appears to be some difference between (1) and (2), most likely
> owing to the different search methods...)
>
> Use ClassifierSubsetEval (2) to output the descriptor usage information
> for classifiers other than those supported by FilteredAttributeEval (1)
> with the attributeSelection evaluator classes
> (1r happens to be one of the supported schemes in that package ->
> OneRAttributeEval,
> but to get e.g. NaiveBayes output, use ClassifierSubsetEval (2)).
>
>  Does it always rely on the complete list of descriptors found in the
> original dataset?
>
>  That depends on the classifier: which ones it keeps and how it weights
> them.
>
> Harri
>
>  I may not get it, but if some are not used for that model (say a OneR,
> with one attribute), then the model should allow one to provide only the
> used descriptors. Is this the case, and if so, how can one get the
> descriptors needed by the model?
>
>
>
> Thanks a lot in advance
>
> Regards,
>
> Fabien
>
>
>
>
>
> Thanks Harri.
>
> However, this was not the answer I was looking for :)
>
> My question was rather how to get the descriptors used for a specific model
> (independently of the algorithm used; OneR was just an easy example). I mean
> the model has been built, you have the original dataset, but not ALL
> descriptors in the original file have been used (because the algorithm
> selects some). So, how can you get the list of descriptors used and their
> contribution to the model (from the Java point of view, not using the
> interface)?
>

hmm I think this is the question I answered:

use either ClassifierSubsetEval or WrapperSubsetEval,
with the chosen algorithm given to them as the base classifier
-> the output is in a parseable format and shows which descriptors the
algorithm uses and/or how it weights them
(it depends on the algorithm whether it reduces the descriptors or just
up/downweights them)
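
To illustrate the "parseable format" point, here is a minimal sketch in plain Java (no Weka dependency) that reads a "Ranked attributes:" block shaped like the example quoted above. The class name RankerOutputParser is hypothetical, and the assumed line layout (merit, 1-based attribute index, attribute name without spaces) is taken only from that sample; real Weka output contains extra header lines and may differ between versions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Weka: parses the text printed under "Ranked attributes:".
final class RankerOutputParser {
    // Assumed layout of one ranked line, e.g. "47.917  1 xx":
    // merit, 1-based attribute index, attribute name (no embedded spaces).
    private static final Pattern RANKED_LINE =
            Pattern.compile("^\\s*(-?\\d+(?:\\.\\d+)?)\\s+(\\d+)\\s+(\\S+)\\s*$");

    /** Returns attribute name -> merit, in the order the lines appear (best first). */
    static Map<String, Double> parse(String output) {
        Map<String, Double> merits = new LinkedHashMap<>();
        boolean inRanking = false;
        for (String line : output.split("\\R")) {
            if (line.trim().startsWith("Ranked attributes:")) {
                inRanking = true;
                continue;
            }
            if (!inRanking) {
                continue; // skip everything before the ranking header
            }
            Matcher m = RANKED_LINE.matcher(line);
            if (m.matches()) {
                merits.put(m.group(3), Double.parseDouble(m.group(1)));
            } else if (!merits.isEmpty()) {
                break; // first non-matching line after the ranked lines ends the block
            }
        }
        return merits;
    }

    public static void main(String[] args) {
        String sample = "Ranked attributes:\n47.917  1 xx\n41.667  3 yy\n35.417  2 zz\n";
        // The first key of the returned map is the top-ranked attribute.
        System.out.println(parse(sample).keySet().iterator().next());
    }
}
```

The map keeps insertion order, so the first entry corresponds to the attribute that 1r would pick in the example above.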

This Weka method, as specified above, can naturally be called from both the
Java command line and the GUI.
(If you're looking for the exact method to call that just outputs the
kept/weighted descriptors,
I pass the question to others who may know that bit of code; I just use
what's available.)
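
If only the kept subset is needed, the "Selected attributes:" line from output (2) can be read the same way. This is a sketch under assumptions: the class name SelectedAttributesLine is hypothetical, and the format (comma-separated 1-based indices, then " : ", then the count) is inferred from the single line "Selected attributes: 1 : 1" quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: extracts indices from a "Selected attributes:" line.
final class SelectedAttributesLine {
    private static final String PREFIX = "Selected attributes:";

    /** Returns the 1-based attribute indices, or an empty list if no such line is present. */
    static List<Integer> parseIndices(String output) {
        List<Integer> indices = new ArrayList<>();
        for (String line : output.split("\\R")) {
            int start = line.indexOf(PREFIX);
            if (start < 0) {
                continue; // not the line we are looking for
            }
            String rest = line.substring(start + PREFIX.length());
            // Assumed format: "<index list> : <count>"; keep only the part before the colon.
            String indexPart = rest.split(":", 2)[0];
            for (String token : indexPart.split(",")) {
                String t = token.trim();
                if (!t.isEmpty()) {
                    indices.add(Integer.parseInt(t));
                }
            }
            break; // only the first matching line is used
        }
        return indices;
    }
}
```

The indices refer to attribute positions in the dataset that was passed to the attribute selection, so they still have to be mapped back to attribute names there.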

Harri

>  I know I should have been more precise, sorry for wasting your time.
>
> Thanks !
>
> Fabien
>
>
>
>
>
>
>
> _______________________________________________
> Wekalist mailing list
> Send posts to: Wekalist@list.scms.waikato.ac.nz
> List info and subscription status:
> https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist
> List etiquette:
> http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html
>
>

-- 
-----------------
Harri M.T. Saarikoski
M.A, PhD graduate student
Helsinki University
Finland
