List: xom-interest
Subject: [XOM-interest] Get all urls
From: "Aaron Green" <subnetrx () gmail ! com>
Date: 2006-08-12 15:57:44
Message-ID: 42af3c940608120857x46bb0871jbb059348df537cc1 () mail ! gmail ! com
I'm running a query on a document to return all of its anchors, which gives
me a list of nodes. What I don't know is how to get attributes, such as href,
from the nodes in that list. This may not even be the correct way to do it.
I just want a list of all the href attributes in a document. I'm scraping
pages from a company intranet so they can be put into a CMS, so I need to find
the pages that are actively linked to, run each one through TagSoup, write
the content to a file, and move on to the next URL.
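
For illustration, here is a minimal sketch of one way to collect the href values. It is not XOM code: it swaps in the XPath and DOM classes that ship with the JDK so that it runs without extra jars. The class name HrefExtractor, the method extractHrefs, and the sample markup are hypothetical. It assumes the page is already well-formed XML (for example after a TagSoup pass) and that the anchors are not in a namespace. If the parsed elements are in the XHTML namespace, the path would need a namespace-aware form. The idea is to select the attributes directly with //a/@href instead of selecting the anchors and then digging the attribute out of each one. In XOM the same path can be passed to Document.query, and each returned Node gives its text through getValue(). Alternatively, a node returned by //a can be cast to Element and read with getAttributeValue("href").

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: uses the JDK's built-in DOM and XPath, not XOM.
final class HrefExtractor {

    /** Returns the value of every href attribute found on an <a> element. */
    static List<String> extractHrefs(String xml) {
        try {
            // Parse the (already well-formed) markup into a DOM tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Select the attribute nodes themselves rather than the anchors.
            NodeList attrs = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate("//a/@href", doc, XPathConstants.NODESET);

            List<String> hrefs = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                hrefs.add(attrs.item(i).getNodeValue());
            }
            return hrefs;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse or query the document", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample page; the last anchor has no href and is skipped.
        String page = "<html><body>"
                + "<a href=\"/news/index.html\">News</a>"
                + "<p><a href=\"http://intranet.example/hr\">HR</a></p>"
                + "<a name=\"top\">Top</a>"
                + "</body></html>";
        for (String href : extractHrefs(page)) {
            System.out.println(href);
        }
    }
}
```

Each value returned could then be resolved against the page's own URL and queued as the next page to fetch.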
--
Aaron
_______________________________________________
XOM-interest mailing list
XOM-interest@lists.ibiblio.org
http://lists.ibiblio.org/mailman/listinfo/xom-interest