Re: [Hampshire] extracting phrases from a file.

Author: Alan Pope
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] extracting phrases from a file.
On 12 September 2011 10:54, James Courtier-Dutton
<james.dutton@???> wrote:
>> lynx -dump --hiddenlinks=ignore foo.html
>>
>> Will dump it to stdout in plain text form with URLs removed.
>>
>
> Sorry, I was not very clear.
> I wish to keep the "some url" bits, and get rid of all the "some junk" bits.
> I.e. I wish to keep the contents of the href only, and drop everything
> else, e.g. the href text itself.
> I wish to end up with a file listing all the urls.
>


Omit the '--hiddenlinks=ignore' then. Lynx will list all the URLs, numbered, at the end of the dump.
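If you only want the URLs in a file, something along these lines might do as a rough sketch. It assumes the dump ends with a "References" heading followed by numbered lines such as "   1. http://...", which is what lynx normally prints. The function name extract_urls is made up for illustration.

```shell
# extract_urls (hypothetical helper): read `lynx -dump` output on stdin
# and print only the URLs from the numbered list at the end.
# Assumes the list starts after a line reading exactly "References".
extract_urls() {
  awk '
    /^References$/ { in_refs = 1; next }       # start of the link list
    in_refs && $1 ~ /^[0-9]+\.$/ { print $2 }  # "  12. URL" -> URL
  '
}
```

You would then run it as `lynx -dump foo.html | extract_urls > urls.txt`. If your lynx is recent enough to have it, `lynx -dump -listonly foo.html` prints just the link list without the page text, though the numbers are still there.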

Al.