
List:       grinder-use
Subject:    Re: [Grinder-use] BeautifulSoup (Re:  fun with re)
From:       jchoot01 <jchoot01 () yahoo ! com>
Date:       2010-06-15 22:44:19
Message-ID: 778556.52891.qm () web53504 ! mail ! re2 ! yahoo ! com


I have plenty of resources; I'm trying to speed up the scripting and avoid writing
and testing regular expressions. The problem seems to be with nested imports, i.e. I
can import re in __init__ but I cannot import BeautifulSoup there. I also tried
importing re there along with BeautifulSoup, to no avail.

Now that I think about it, I can't recall whether I tried initializing it with a
getter outside of my TestRunner... I'll give that another go and see if it's any better.
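
For reference, a minimal sketch of what such a getter could look like, assuming a
module-level helper (the name get_parser_module is hypothetical) that defers the
import until first use and caches the result. The module name is a parameter only
so the sketch can be tried without BeautifulSoup installed; whether this avoids the
ValueError under The Grinder is not verified here.

```python
# Cache of already-imported modules, keyed by module name.
_module_cache = {}


def get_parser_module(name="BeautifulSoup"):
    """Import `name` on first use and return the cached module afterwards.

    Hypothetical helper: the default name assumes BeautifulSoup.py is on the
    script's path, as in the traceback quoted below.
    """
    if name not in _module_cache:
        # __import__ is a builtin, so this also works on Jython 2.5.
        _module_cache[name] = __import__(name)
    return _module_cache[name]
```

A caller would then use get_parser_module().BeautifulSoup(html) instead of a
top-level "from BeautifulSoup import BeautifulSoup".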

     -- John




________________________________
From: Travis Bear <travis_bear@yahoo.com>
To: grinder-use <grinder-use@lists.sourceforge.net>
Sent: Tue, June 15, 2010 4:19:14 PM
Subject: [Grinder-use] BeautifulSoup (Re:  fun with re)


I've hit this issue in the past as well.  Moving the imports got things working for me.
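
A rough sketch of what "moving the imports" can look like, assuming the import is
done inside TestRunner.__init__ (The Grinder creates one TestRunner instance per
worker thread and calls it once per run). The module_name parameter is an
illustrative addition so the sketch can be run without BeautifulSoup installed; a
real script would simply import BeautifulSoup there.

```python
class TestRunner:
    """Grinder-style runner that imports its parser in __init__."""

    def __init__(self, module_name="BeautifulSoup"):
        # Import here rather than at the top of the script, so the import
        # runs in the worker thread that will use the module.
        self.parser = __import__(module_name)

    def __call__(self):
        # A real run would fetch a page and parse it here, e.g.
        #   soup = self.parser.BeautifulSoup(result.getText())
        # This sketch only reports which module was loaded.
        return self.parser.__name__
```

Instantiating TestRunner("json") and calling it returns "json", which is enough
to confirm that the deferred import happened.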

FWIW, for performance reasons I eventually gave up on BeautifulSoup and went to the
HtmlParser library on SourceForge (http://htmlparser.sourceforge.net/).  BS runs
quite fast in CPython, but was a dog-slow CPU hog in Jython.


-Travis






________________________________
From: Philip Aston <philipa@mail.com>
To: grinder-use@lists.sourceforge.net
Sent: Tue, June 15, 2010 11:38:13 AM
Subject: Re: [Grinder-use] fun with re

You've tried moving all the BS imports to TestRunner.__init__?

> I've read the FAQ and tried multiple import locations (__call__, 
> __init__) for re (in both my script runner and the login_add_blog that 
> it calls) but I cannot get past the ValueError.
> I'm trying to use BeautifulSoup 
> <http://www.crummy.com/software/BeautifulSoup/> to handle my HTML 
> heavy lifting, which works fine in both Python and Jython:
> 
> Jython 2.5.1 (Release_2_5_1:6813, Sep 26 2009, 13:47:54)
> [Java HotSpot(TM) 64-Bit Server VM (Sun Microsystems Inc.)] on java1.6.0_18-ea
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from BeautifulSoup import BeautifulSoup
> >>> f = open('results_getText_1.html', 'r')
> >>> contents = f.read()
> >>> f.close()
> >>> soup = BeautifulSoup(contents)
> >>> soup.find(id="_33_addEntryButton")['onclick']
> u"location.href = 'the link I'm looking for...';"
> 
> 6/14/10 5:39:43 PM (thread 0 run 0 test 501): Aborted run due to Jython exception: <type 'exceptions.ValueError'>: ('unsupported operand type', 'subpattern') [calling TestRunner]
> <type 'exceptions.ValueError'>: ('unsupported operand type', 'subpattern')
> raise ValueError, ("unsupported operand type", op)
> File "/opt/install/jython/jython2.5.1/Lib/sre_compile.py", line 182, in _compile
> _compile(code, p.data, flags)
> File "/opt/install/jython/jython2.5.1/Lib/sre_compile.py", line 500, in _code
> code = _code(p, flags)
> File "/opt/install/jython/jython2.5.1/Lib/sre_compile.py", line 516, in compile
> p = sre_compile.compile(pattern, flags)
> File "/opt/install/jython/jython2.5.1/Lib/re.py", line 239, in _compile
> p = sre_compile.compile(pattern, flags)
> File "/opt/install/jython/jython2.5.1/Lib/re.py", line 239, in _compile
> return _compile(pattern, 0).sub(repl, string, count)
> File "/opt/install/jython/jython2.5.1/Lib/re.py", line 150, in sub
> convert = lambda(k, val): (k,
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 544, in <lambda>
> self.attrs = map(convert, self.attrs)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 548, in __init__
> tag = Tag(self, name, attrs, self.currentTag, self.previous)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1342, in unknown_starttag
> self.unknown_starttag(tag, attrs)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 333, in finish_starttag
> self.finish_starttag(tag, attrs)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 291, in parse_starttag
> k = self.parse_starttag(i)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 133, in goahead
> self.goahead(0)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 99, in feed
> SGMLParser.feed(self, markup)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1184, in _feed
> self._feed(isHTML=isHTML)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1142, in __init__
> self._feed(isHTML=isHTML)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1142, in __init__
> BeautifulStoneSoup.__init__(self, *args, **kwargs)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1517, in __init__
> self.soup = BeautifulSoup(result.getText())
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/login_add_blog.py", line 87, in setSoup
> self.setSoup(result)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/login_add_blog.py", line 164, in page5
> self.page5()
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/login_add_blog.py", line 200, in __call__
> self.testRunner()
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/runner.py", line 21, in __call__
> 
> 
> any help is greatly appreciated
> 
> -- John
> 

------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate 
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the 
lucky parental unit.  See the prize list and enter to win: 
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
grinder-use mailing list
grinder-use@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/grinder-use


      




