List: mozilla-documentation
Subject: Re: Regarding my FAQ
From: fta () oleane ! net (Fabien Tassin)
Date: 1998-05-04 15:07:47
On 03 May 1998 16:15:02 +0200, Thomas Martin Widmann <viralbus@saratoga.daimi.aau.dk> wrote:
>
> while (<>) {
> s/<a href="(.*?)">(.*?)<\/a>/$2 ($1)/gi;
> print;
> }
>
> This will convert
> You can find Perl <A HREF="wherever">here</A>,
> to
> You can find Perl here (wherever),
>
> After preprocessing, you could then use the usual HTML->text
> converter (or expand it to do all conversion on its own :).
This will not work if a tag spans several lines, or if it carries some extra (useless) attributes.
What about this?
$ perl -pe 'BEGIN { $i = 1; undef $/ }   # number from 1; slurp the whole file
  # Replace one link per pass so that each one gets its own number.
  while (s|<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)</a>|$2 \[$i\]|si) {
    $ref[$i++] = $1
  }
  END {
    print "--\nReferences:\n\n";
    for $n (1 .. $#ref) {
      print "[$n]: $ref[$n]\n";
    }
  }' your_doc.html | your_own_html2text_translator
or in a more condensed/ugly form:
$ perl -pe 'BEGIN{$i=1;undef $/} while (s|<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)</a>|$2 \[$i\]|si) {$ref[$i++]=$1} END {print "--\nReferences:\n\n"; for $n (1..$#ref) {print "[$n]: $ref[$n]\n"}}' your_doc.html
It can easily be extended to avoid duplicate references and to list only full (absolute) URLs.
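A minimal sketch of the duplicate-avoiding part, written as a small standalone script rather than a one-liner. The sub name links_to_refs, the %seen hash and the sample text are illustrative assumptions, not part of the script above, and it still assumes that href values are double-quoted. Restricting the list to full URLs is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace each link by its text plus a [n] marker,
# reusing the same number whenever a URL has already been seen.
sub links_to_refs {
    my ($html) = @_;
    my (@refs, %seen);    # @refs: URLs in order of appearance; %seen: URL -> number
    $html =~ s{<a\s[^>]*?href="([^"]*)"[^>]*>(.*?)</a>}{
        my ($url, $text) = ($1, $2);
        unless ($seen{$url}) {
            push @refs, $url;
            $seen{$url} = scalar @refs;
        }
        sprintf "%s [%d]", $text, $seen{$url};
    }gsie;
    return ($html, @refs);
}

# Sample input (an assumption, for illustration): a repeated URL and a
# tag that is split over two lines.
my $sample = <<'HTML';
You can find Perl <A HREF="wherever">here</A>,
and <a class="x"
   href="wherever">there</a> too.
See also <a href="elsewhere">this</a>.
HTML

my ($text, @refs) = links_to_refs($sample);
print $text, "--\nReferences:\n\n";
printf "[%d]: %s\n", $_, $refs[$_ - 1] for 1 .. scalar @refs;
```

The hash lookup is what keeps the reference list short: a URL gets a new number only the first time it shows up, and later links to it reuse that number.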
--
Fabien Tassin -+- fta@oleane.net