Re: [Hampshire] merging html files into one single file

Author: Hugo Mills
Date:  
To: ed, Hampshire LUG Discussion List
CC: 
Subject: Re: [Hampshire] merging html files into one single file

gpg: Signature made Fri Jan 25 16:52:14 2008 GMT
On Fri, Jan 25, 2008 at 04:39:44PM +0000, ed wrote:
> I run a wiki app (not negotiable!) and want an option to print all, or
> better still, send to file for final editing and presentation.
>
> It will export to html, which results in a zip file full of html docs.
> Extracting this to a convenient folder results in simple viewing and
> following in my web browser (no surprises there), but I want to go the
> next stage and combine all pages. I can insert file loads of times in
> open office, but this needs to be done regularly with over 100 pages
> each time.
>
> Can anyone help? I'd like either a long open office, html or pdf file
> at the end of it.


Well, it's not very pleasant (as in, it breaks the HTML spec), but
have you simply tried using cat to join all the files together first,
and then importing the single file into OpenOffice?
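
A minimal sketch of that plain cat approach. The function name
join_pages and the paths in the usage line are made up for
illustration; the one real assumption is that the output file lives
outside the directory holding the exported pages, so the glob does not
pick it up on a later run:

```shell
# join_pages DIR OUT: concatenate every .html file in DIR into OUT.
# Hypothetical helper. OUT should live outside DIR so that the
# *.html glob never reads a previous result back in.
join_pages() {
    cat "$1"/*.html > "$2"
}

# Example: join_pages ./wiki-export ./combined.html
```

The result has one <html>...</html> wrapper per page, which is the
spec-breaking part; whether OpenOffice copes with that would need
trying on the real export.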

Failing that, something like the following arcana might work,
provided the <body> and </body> tags each sit on a line of their
own:

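# Write to ../import.html: an output file matching *.html in this
# directory would be fed back into cat.
echo "<html><head></head><body>" >../import.html
cat *.html | sed -nE '/<body>/,/<\/body>/{ /<\/?body>/!p; }' >>../import.html
echo "</body></html>" >>../import.html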

Failing that, a bit of perl to do much the same kind of
transformation (keep only what is inside the <body> of each input
file) would do the trick -- you only need to get as sophisticated as
is absolutely necessary for parsing the specific case you have.
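
As a sketch of that per-file transformation, written in plain shell
rather than perl: merge_bodies is a made-up helper name, and it
assumes each <body ...> and </body> tag sits on a line of its own
(anything else on those lines is dropped along with the tag).

```shell
# merge_bodies DIR OUT: write one HTML file holding the body content of
# every .html file in DIR. Hypothetical helper; OUT should live outside
# DIR so the glob does not pick it up.
merge_bodies() {
    dir=$1
    out=$2
    echo "<html><head></head><body>" > "$out"
    for f in "$dir"/*.html; do
        # Running sed once per file keeps a page with a missing </body>
        # from swallowing the start of the next page.
        sed -n '/<body[ >]/,/<\/body>/{
            /<body[ >]/d
            /<\/body>/d
            p
        }' "$f" >> "$out"
    done
    echo "</body></html>" >> "$out"
}

# Example: merge_bodies ./wiki-export ./import.html
```

Unlike the one-liner, this also accepts an opening tag with
attributes, such as <body class="page">.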

Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
    --- Klytus! Are your men on the right pills? Maybe you should ---    
                         execute their trainer!