List:       wget
Subject:    Re: How to stop infinite recursion?
From:       Hrvoje Niksic <hniksic () xemacs ! org>
Date:       2006-05-28 0:02:36
Message-ID: 87zmh32ezn.fsf () xemacs ! org

Robert Nicholson <robert@elastica.com> writes:

> When wget is traversing a url what stops it visiting that url again?

It keeps a table of visited URLs.

> and assuming it checks the url is it only checking for the exact
> string?

It is.
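
To illustrate what an exact-string check means in practice, here is a
minimal sketch in Python. It is not Wget's actual code (Wget is written
in C); the function name crawl_order and the dict-based link graph are
made up for the example.

```python
from collections import deque

def crawl_order(start, links):
    """Breadth-first walk over `links`, a dict mapping each URL to the
    URLs found on that page. Returns the URLs in the order fetched.

    Hypothetical helper for illustration only, not Wget's code.
    """
    visited = {start}          # table of visited URLs, keyed by exact string
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)      # stands in for the actual download
        for link in links.get(url, []):
            if link not in visited:   # exact string comparison
                visited.add(link)
                queue.append(link)
    return order

# Hypothetical site: /a links back to the start page and to itself
# with an extra query parameter.
site = {
    "http://example.com/": ["http://example.com/a", "http://example.com/"],
    "http://example.com/a": ["http://example.com/", "http://example.com/a?x=1"],
}
fetched = crawl_order("http://example.com/", site)
```

The link back to the start page is skipped, but "/a?x=1" is a different
string from "/a", so it is queued and fetched as a new URL.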


> ie. different url but same response because the url it's following
> the second time includes additional query parameters.

In such a case Wget can fetch the same resource more than once.  In
the worst case, where new URLs are continually created based on old
requests, Wget can fall into a redirection "black hole" -- but so can
any crawler in the presence of dynamically generated URLs.  Wget's
checks could be smarter, but I don't think there's a general solution
to that problem.
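
As an example of what a "smarter" check might look like, the sketch
below reduces each URL to a canonical form before it is looked up in
the visited table. This is an assumption about one possible heuristic,
not something Wget does: the function name canonical and the
IGNORED_PARAMS set are hypothetical, and which parameters are safe to
drop depends entirely on the site being crawled.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters assumed not to change the response.
IGNORED_PARAMS = {"sid", "sessionid"}

def canonical(url):
    """Return a normalized form of `url` for use as a visited-table key.

    Heuristic sketch: lowercases the host, drops the fragment, removes
    parameters named in IGNORED_PARAMS, and sorts the rest.
    """
    parts = urlsplit(url)
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    return urlunsplit((
        parts.scheme,
        parts.netloc.lower(),
        parts.path or "/",
        urlencode(query),
        "",                    # fragment is never sent to the server
    ))

first = canonical("http://Example.com/a?b=2&a=1&sid=xyz")
second = canonical("http://example.com/a?a=1&b=2")
```

Both calls yield the same key, so the second URL would be treated as
already visited. A site that keeps inventing new parameter names or new
path components still defeats this, which is why a depth limit (Wget's
-l option) remains the practical safeguard.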