List: grinder-use
Subject: Re: [Grinder-use] BeautifulSoup (Re: fun with re)
From: jchoot01 <jchoot01 () yahoo ! com>
Date: 2010-06-15 22:44:19
Message-ID: 778556.52891.qm () web53504 ! mail ! re2 ! yahoo ! com
I have plenty of resources; I'm trying to speed up the scripting and avoid writing
and testing regular expressions. The problem seems to be with nested imports, i.e. I
can import re in __init__ but I cannot import BeautifulSoup there. I also tried
importing re there along with BeautifulSoup to no avail.

Now that I think about it I can't recall if I tried initializing it w/ a getter
outside of my TestRunner... I'll give that another go and see if it's any better.
-- John
________________________________
From: Travis Bear <travis_bear@yahoo.com>
To: grinder-use <grinder-use@lists.sourceforge.net>
Sent: Tue, June 15, 2010 4:19:14 PM
Subject: [Grinder-use] BeautifulSoup (Re: fun with re)
I've hit this issue in the past as well. Moving the imports got things working for
me.

FWIW, for performance reasons I eventually gave up on BeautifulSoup and went to the
HtmlParser library on SourceForge (http://htmlparser.sourceforge.net/). BS runs
quite fast in CPython, but was a dog-slow CPU hog in Jython.
-Travis
________________________________
From: Philip Aston <philipa@mail.com>
To: grinder-use@lists.sourceforge.net
Sent: Tue, June 15, 2010 11:38:13 AM
Subject: Re: [Grinder-use] fun with re
You've tried moving all the BS imports to TestRunner.__init__?
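
For anyone following along, here is a minimal sketch of what moving the imports into TestRunner.__init__ might look like. It is not John's script: the Grinder wiring is left out, and the find_onclick method, the sample markup, and the regex fallback are all hypothetical names and data made up for illustration. The BeautifulSoup import line and the find(id=...) call are the ones from the interactive session quoted below.

```python
class TestRunner:
    """Sketch of a Grinder-style TestRunner with imports deferred to __init__."""

    def __init__(self):
        # Imports happen here, when the runner instance is constructed,
        # rather than at module level.
        import re
        self._re = re
        try:
            # BeautifulSoup 3 module name, as used in the quoted session.
            from BeautifulSoup import BeautifulSoup
            self._soup_class = BeautifulSoup
        except ImportError:
            # Not on the path: fall back to the (hypothetical) regex below.
            self._soup_class = None
        # Compiled once per runner instance and reused on every call.
        self._onclick = re.compile(
            r'id="_33_addEntryButton"[^>]*onclick="([^"]*)"')

    def find_onclick(self, html):
        """Return the onclick value of the add-entry button, or None."""
        if self._soup_class is not None:
            tag = self._soup_class(html).find(id="_33_addEntryButton")
            if tag is None:
                return None
            return tag["onclick"]
        match = self._onclick.search(html)
        if match is None:
            return None
        return match.group(1)

    def __call__(self):
        # Made-up markup standing in for result.getText().
        sample = '<input id="_33_addEntryButton" onclick="location.href = \'/add\';">'
        return self.find_onclick(sample)
```

Whether this avoids the ValueError in a given Grinder setup is something to test rather than assume; the sketch only shows where the import statements would sit.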
> I've read the FAQ and tried multiple import locations (__call__,
> __init__) for re (in both my script runner and the login_add_blog that
> it calls) but I cannot get past the ValueError.
> I'm trying to use BeautifulSoup
> <http://www.crummy.com/software/BeautifulSoup/> to handle my html
> heavy lifting, which works fine in both python and jython
>
> Jython 2.5.1 (Release_2_5_1:6813, Sep 26 2009, 13:47:54)
> [Java HotSpot(TM) 64-Bit Server VM (Sun Microsystems Inc.)] on
> java1.6.0_18-ea
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from BeautifulSoup import BeautifulSoup
> >>> f = open('results_getText_1.html', 'r')
> >>> contents = f.read()
> >>> f.close()
> >>> soup = BeautifulSoup(contents)
> >>> soup.find(id="_33_addEntryButton")['onclick']
> u"location.href = 'the link I'm looking for...';"
>
> 6/14/10 5:39:43 PM (thread 0 run 0 test 501): Aborted run due to
> Jython exception: <type 'exceptions.ValueError'>: ('unsupported
> operand type', 'subpattern') [calling TestRunner]
> <type 'exceptions.ValueError'>: ('unsupported operand type', 'subpattern')
> raise ValueError, ("unsupported operand type", op)
> File "/opt/install/jython/jython2.5.1/Lib/sre_compile.py", line 182, in _compile
> _compile(code, p.data, flags)
> File "/opt/install/jython/jython2.5.1/Lib/sre_compile.py", line 500, in _code
> code = _code(p, flags)
> File "/opt/install/jython/jython2.5.1/Lib/sre_compile.py", line 516, in compile
> p = sre_compile.compile(pattern, flags)
> File "/opt/install/jython/jython2.5.1/Lib/re.py", line 239, in _compile
> p = sre_compile.compile(pattern, flags)
> File "/opt/install/jython/jython2.5.1/Lib/re.py", line 239, in _compile
> return _compile(pattern, 0).sub(repl, string, count)
> File "/opt/install/jython/jython2.5.1/Lib/re.py", line 150, in sub
> convert = lambda(k, val): (k,
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 544, in <lambda>
> self.attrs = map(convert, self.attrs)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 548, in __init__
> tag = Tag(self, name, attrs, self.currentTag, self.previous)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1342, in unknown_starttag
> self.unknown_starttag(tag, attrs)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 333, in finish_starttag
> self.finish_starttag(tag, attrs)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 291, in parse_starttag
> k = self.parse_starttag(i)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 133, in goahead
> self.goahead(0)
> File "/opt/install/jython/jython2.5.1/Lib/sgmllib.py", line 99, in feed
> SGMLParser.feed(self, markup)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1184, in _feed
> self._feed(isHTML=isHTML)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1142, in __init__
> self._feed(isHTML=isHTML)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1142, in __init__
> BeautifulStoneSoup.__init__(self, *args, **kwargs)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/BeautifulSoup.py", line 1517, in __init__
> self.soup = BeautifulSoup(result.getText())
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/login_add_blog.py", line 87, in setSoup
> self.setSoup(result)
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/login_add_blog.py", line 164, in page5
> self.page5()
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/login_add_blog.py", line 200, in __call__
> self.testRunner()
> File "/opt/install/grinder/grinder-3.4/test/blog/./fury-file-store/current/runner.py", line 21, in __call__
>
>
> any help is greatly appreciated
>
> -- John
>
------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
lucky parental unit. See the prize list and enter to win:
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
grinder-use mailing list
grinder-use@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/grinder-use