
List:       fedora-list
Subject:    Re: downloading a complete web page without using a browser...
From:       Samuel Sieb <samuel () sieb ! net>
Date:       2021-07-06 6:22:24
Message-ID: b66dfa0b-4ce0-f347-f5b7-8bcc4c689ed4 () sieb ! net

On 2021-07-05 10:30 p.m., Thomas Stephen Lee wrote:
> On Mon, Jul 5, 2021 at 12:26 PM Samuel Sieb <samuel@sieb.net> wrote:
>>
>> On 2021-07-03 8:02 p.m., dwoody5654@gmail.com wrote:
>>> The URL I am trying to download does not have an extension, i.e. no '.htm', such
>>> as:
>>> https://my.acbl.org/club-results/details/338288
>>>
>>> wget does not download the correct web page.
>>
>> I tried it and it worked, sort of.  The problem is that you want to
>> download everything to view it offline, but the site my.acbl.org has a
>> robots.txt that says "no robots allowed".  So wget respects that and
>> will not download any required files from that site other than the
>> initial page.  curl probably has the same issue.
> 
> for wget
> https://gist.github.com/u0d7i/87aa962311f2a7c739aa

OK, that solves it.  I was able to download everything, and opening the
resulting file in Firefox didn't trigger any network access.  I was able to
see the entire page and even interact with it somewhat.  The command I used:

wget -e robots=off -EHkp https://my.acbl.org/club-results/details/338288
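
For anyone reading this in the archive, the bundled short options in the command above can be spelled out with wget's long option names. This is only a sketch: the function name page_cmd is made up for illustration, and it prints the command rather than running it, so the result can be reviewed first.

```shell
#!/bin/sh
# page_cmd is a hypothetical helper, not part of wget.
# It prints the long-option equivalent of:
#   wget -e robots=off -EHkp URL
#
#   -e robots=off          run the wgetrc command "robots=off" (ignore robots.txt)
#   --adjust-extension     (-E) save HTML/CSS files with a matching suffix
#   --span-hosts           (-H) allow fetching requisites from other hosts
#   --convert-links        (-k) rewrite links so the copy works offline
#   --page-requisites      (-p) also fetch images, CSS and scripts the page needs
page_cmd() {
    printf '%s\n' "wget -e robots=off --adjust-extension --span-hosts --convert-links --page-requisites $1"
}

# Show the command for the page discussed in this thread.
page_cmd 'https://my.acbl.org/club-results/details/338288'
```

Copying the printed line into a shell performs the actual download. Ignoring robots.txt overrides the site operator's stated preference, so this seems best kept to one-off fetches of a single page like this one.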
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
