
List:       wget
Subject:    Re: wget command help please.
From:       Ian Abbott <abbotti () mev ! co ! uk>
Date:       2002-04-30 18:07:04

On Tue, 30 Apr 2002 12:07:34 -0400 (EDT), you wrote:

>What I really want this to do, is to *only* keep the files that are in such
>a directory (index.ext? , which is always the same).
>
>I've tried:
>using: --accept '.*/index.ext\?/.*' (and a few variations of it) and I get
>the message:

Wget uses simple shell-globbing wildcards rather than full-blown
regular expressions, and it matches against the tail end of filenames,
so you probably want to use --accept '/index.ext\?/*'.
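
To illustrate the difference between the two pattern styles, here is a
minimal Python sketch. It uses Python's fnmatch module as a stand-in for
shell globbing; it is not Wget's own matching code and does not try to
reproduce Wget's tail-of-filename behaviour. The leading "host" directory
is a made-up placeholder, and "[?]" stands for a literal question mark
because Python's fnmatch does not treat a backslash as an escape.

```python
import fnmatch
import re

# Hypothetical saved path; "host" is a placeholder, the rest follows the
# Wget 1.8.1 naming described in this message.
path = "host/index.ext?/foo/stories/somefile.html"

# Regex-style pattern, as in the original attempt.
regex_style = ".*/index.ext[?]/.*"
# Glob-style pattern: a bare "*" means "any string", "." is just a dot.
glob_style = "*/index.ext[?]/*"

# Read as a regular expression, the first pattern matches the path.
as_regex = re.fullmatch(regex_style, path) is not None

# Read as a glob, the same pattern requires a literal "." at the start
# and after the last "/", so it does not match.
regex_style_as_glob = fnmatch.fnmatchcase(path, regex_style)

# The glob-style pattern matches when read as a glob.
glob_style_as_glob = fnmatch.fnmatchcase(path, glob_style)

print(as_regex, regex_style_as_glob, glob_style_as_glob)
```

The point is only that '.*' means "a dot followed by anything" to a
glob matcher, which is why the regex-style --accept pattern finds nothing.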

This doesn't work the same for the current CVS version of Wget (I
take it you're using Wget 1.8.1) as it names the files differently.
What gets saved as 'index.ext?/foo/stories/somefile.html' by Wget
1.8.1, gets saved as 'index.ext?%2Ffoo%2Fstories%2Fsomefile.html' by
the current CVS version. I can't find a --accept pattern that
actually seems to work for the current CVS version. (You gave a clue
in your post as to which website you were actually using, which made
it *much* easier for me to see what Wget actually does!)
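
The effect of the renaming can be sketched in Python as follows. This only
mimics the two filenames quoted above by percent-encoding the part after
the "?"; it is an assumption about the visible result, not a description
of how Wget builds names internally. The glob pattern is likewise just an
illustration, using Python's fnmatch in place of Wget's matcher.

```python
import fnmatch
from urllib.parse import quote

# Name as saved by Wget 1.8.1 (from the message above).
old_name = "index.ext?/foo/stories/somefile.html"

# Mimic the CVS-version name: percent-encode everything after the "?",
# including the slashes (safe="" means "/" is encoded too).
prefix, query = old_name.split("?", 1)
new_name = prefix + "?" + quote(query, safe="")

# Illustrative glob that expects a "/" right after "index.ext?".
pattern = "*index.ext[?]/*"

old_matches = fnmatch.fnmatchcase(old_name, pattern)
new_matches = fnmatch.fnmatchcase(new_name, pattern)

print(new_name)
print(old_matches, new_matches)
```

Because the encoded name no longer contains a "/" after "index.ext?",
a pattern that relies on that directory separator cannot match it.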

You might also want to try the --include-directories (-I) option,
which takes a comma-separated list of directories to follow (wildcards
are allowed), and/or the --no-parent (-np) option.