
List:       xom-interest
Subject:    [XOM-interest] Get all urls
From:       "Aaron Green" <subnetrx () gmail ! com>
Date:       2006-08-12 15:57:44
Message-ID: 42af3c940608120857x46bb0871jbb059348df537cc1 () mail ! gmail ! com

I'm running a query on a document to return all of its anchor elements,
which gives me a list of nodes.  What I don't know is how to read
attributes, such as href, from the nodes in that list.  This may not even
be the right approach; all I really want is a list of every href attribute
value in the document.  I'm scraping pages from a company intranet so they
can be loaded into a CMS.  For each page that is actively linked to, I need
to run it through TagSoup, write the content to a file, and move on to the
next URL.
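
To make the goal concrete, here is a rough sketch of the "list every href"
step.  It is not XOM code: it uses the JDK's built-in javax.xml.xpath API so
that it runs without extra jars, and the class name HrefLister and the sample
markup are made up for illustration.  The idea is to select the attributes
directly with the XPath //a/@href instead of selecting the anchors first.  In
XOM the same expression could presumably be passed to Node.query(), which
returns a Nodes list whose items can be cast to Attribute and read with
getValue(); alternatively, each anchor Element could be asked for
getAttributeValue("href").  If the parsed elements are in the XHTML namespace,
as TagSoup output usually is, the path would need a prefix bound to that
namespace (in XOM, through an XPathContext).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XOM or TagSoup.
public class HrefLister {

    // Returns the value of every href attribute on an <a> element,
    // assuming the input is well-formed XML with no namespace on <a>.
    public static List<String> hrefs(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select the attribute nodes themselves rather than the anchors.
        NodeList attrs = (NodeList) xpath.evaluate(
                "//a/@href",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            result.add(attrs.item(i).getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample page; the last anchor has no href and is skipped.
        String page = "<html><body>"
                + "<a href='a.html'>A</a>"
                + "<p><a href='b.html'>B</a></p>"
                + "<a name='x'>no href</a>"
                + "</body></html>";
        for (String href : hrefs(page)) {
            System.out.println(href);
        }
    }
}
```

Selecting the attributes in the query means anchors without an href never
show up in the result, so no null check is needed when walking the list.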

-- 
Aaron
_______________________________________________
XOM-interest mailing list
XOM-interest@lists.ibiblio.org
http://lists.ibiblio.org/mailman/listinfo/xom-interest