List:       fedora-list
Subject:    Re: downloading a complete web page without using a browser...
From:       Thomas Stephen Lee <lee.iitb () gmail ! com>
Date:       2021-07-06 5:42:01
Message-ID: CAG7s96WvUkL+nuPGx8CioAjqGTgRfvJNijSKaMC-CS=H+nfJ9w () mail ! gmail ! com

On Mon, Jul 5, 2021 at 12:26 PM Samuel Sieb <samuel@sieb.net> wrote:
>
> On 2021-07-03 8:02 p.m., dwoody5654@gmail.com wrote:
> > The URL I am trying to download does not have an extension, i.e. no '.htm',
> > such as:
> > https://my.acbl.org/club-results/details/338288
> >
> > wget does not download the correct web page.
>
> I tried it and it worked, sort of.  The problem is that you want to
> download everything to view it offline, but the site my.acbl.org has a
> robots.txt that says "no robots allowed".  So wget respects that and
> will not download any required files from that site other than the
> initial page.  curl probably has the same issue.

For wget, see this gist:
https://gist.github.com/u0d7i/87aa962311f2a7c739aa
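
As a rough sketch of the robots.txt point quoted above (and not necessarily what the gist itself does), wget can be told to ignore robots.txt and to fetch the files a page needs. The helper name fetch_page and the output directory are made up for illustration; the wget options themselves are standard ones. Whether ignoring robots.txt is acceptable depends on the site's terms, and content that the page loads with JavaScript would still be missing.

```shell
#!/bin/sh
# Sketch: fetch one page plus the files it needs for offline viewing.
# fetch_page is a hypothetical helper name, not part of wget.
fetch_page() {
    url=$1
    dest=${2:-.}
    # -e robots=off       do not honour robots.txt (check the site's terms first)
    # --page-requisites   also fetch images, CSS and scripts the page references
    # --convert-links     rewrite links so the saved copy works offline
    # --adjust-extension  save HTML with a .html suffix even if the URL has none
    # --directory-prefix  directory to place the downloaded files in
    wget -e robots=off \
         --page-requisites \
         --convert-links \
         --adjust-extension \
         --directory-prefix="$dest" \
         "$url"
}

# Example call (needs network access):
# fetch_page https://my.acbl.org/club-results/details/338288 ./acbl-page
```

The --adjust-extension option is there because the URL in question has no '.htm' suffix; with it, the saved file should still open as HTML in a browser.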
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
