FAQ: How to De-Word

Microsoft Word can convert a document to HTML: use Save As... and select the HTML format. However, the resulting HTML is filled with junk code from Microsoft. Here's how to quickly strip out the junk, so that you're left with a clean HTML file that has only the basic HTML tags. The cleanup preserves tables, lists, and URLs.

  1. Open the document in Word.
  2. Use Save As..., set Save as Type to Web Page (htm, html), and save the document.
  3. Open the HTML file in HomeSite. It will be cluttered with Word's junk code.
  4. Select Tools | CodeSweeper and select Allaire Default HTML Tidy Settings.
  5. Select Configure CodeSweeper.
  6. In the left pane, find HTML Tidy Settings and expand the item (click the plus mark). Click the sub-item: Allaire Default HTML Tidy Settings.
  7. In the right pane, in the long list of items, turn off every option. Then turn on Clean up Word 2000-generated HTML.
  8. Click OK.
  9. You can make this your default so that whenever you run CodeSweeper, it removes Word junk code. Once again, select Tools | CodeSweeper, select Allaire Default HTML Tidy Settings, select Configure CodeSweeper, and this time set Allaire Default HTML Tidy Settings as the default.
  10. Finally, run CodeSweeper (Tools | Default CodeSweeper).
  11. Ta-da! You get a clean HTML file.

To test this, create a Word document with headings, body content, lists, tables, and URLs, and run it through the steps above. CodeSweeper will preserve all of that.
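If you'd rather check a test case than eyeball it, a small script can count the structural tags before and after cleanup. This is only a sketch using Python's standard library; the function names (count_structure, lost_structure) and the list of tags are my own choices for illustration and are not part of HomeSite or HTML Tidy.

```python
from collections import Counter
from html.parser import HTMLParser

# Tags that should survive cleanup: headings, lists, tables, and links.
# (This set is an assumption for illustration; adjust it to your documents.)
STRUCTURAL_TAGS = {
    "h1", "h2", "h3", "h4", "h5", "h6",
    "ul", "ol", "li", "table", "tr", "td", "th", "a",
}


class TagCounter(HTMLParser):
    """Counts how often each structural start tag appears."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> and <h1> count the same.
        if tag in STRUCTURAL_TAGS:
            self.counts[tag] += 1


def count_structure(html_text):
    """Return a Counter of structural tags found in html_text."""
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def lost_structure(before_html, after_html):
    """Return {tag: number missing} for tags that occur less often after cleanup."""
    before = count_structure(before_html)
    after = count_structure(after_html)
    return {tag: n - after[tag] for tag, n in before.items() if after[tag] < n}
```

Read the Word-generated file and the cleaned file into strings and pass them to lost_structure. An empty result means every counted heading, list, table, and link tag is still there. Word's bookmark anchors may legitimately disappear during cleanup, so a lower count for "a" is not always a problem.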

Benefits

  • This is a great way to make tables. Word's table tool is very good. Create the table, save the Word doc as HTML, and run CodeSweeper.
  • Regrettably, CodeSweeper doesn't preserve illustrations from the Word file. However, that's easy to fix in HomeSite: just drag and drop the illustrations into place.
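Before you run the cleanup, it can help to list which image files Word's HTML refers to, so you know what to drag back into place. A minimal sketch, assuming the illustrations appear as ordinary img tags in the saved HTML; list_images is a made-up helper name.

```python
from html.parser import HTMLParser


class ImageLister(HTMLParser):
    """Collects the src of every <img> tag, in document order, without repeats."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            # Skip images with no src and ones already listed.
            if src and src not in self.sources:
                self.sources.append(src)


def list_images(html_text):
    """Return the image paths referenced by html_text."""
    parser = ImageLister()
    parser.feed(html_text)
    parser.close()
    return parser.sources
```

Run list_images on the contents of the file Word saved (before CodeSweeper touches it) and keep the list next to you while you put the illustrations back.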

Tips

HTML Tidy is actually a separate program bundled with HomeSite. Although you can use it from within HomeSite, it's more useful to run it outside of HomeSite:

  • HomeSite exposes only a limited set of HTML Tidy's options; running Tidy on its own gives you the full set.
  • HomeSite crashes sometimes when running HTML Tidy to clean up a Word file.
  • Learn more about HTML Tidy at www.w3.org/People/Raggett/tidy/
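
Running Tidy outside of HomeSite could look something like the sketch below. It assumes a tidy executable is installed and on your PATH, and it uses the word-2000, bare, and clean options as documented for current HTML Tidy releases; older builds may name or handle them differently. The file names and the helper names (build_tidy_command, run_tidy) are invented for this example.

```python
import subprocess


def build_tidy_command(source_path, output_path, tidy_exe="tidy"):
    """Build the argument list for a Word cleanup run of HTML Tidy."""
    return [
        tidy_exe,
        "-quiet",                  # suppress non-essential output
        "--word-2000", "yes",      # strip Word 2000 surplus markup
        "--bare", "yes",           # strip Microsoft-specific HTML
        "--clean", "yes",          # replace presentational tags with style rules
        "-output", output_path,    # write here instead of stdout
        source_path,
    ]


def run_tidy(source_path, output_path, tidy_exe="tidy"):
    """Run Tidy and return its exit code (0 = clean, 1 = warnings)."""
    result = subprocess.run(
        build_tidy_command(source_path, output_path, tidy_exe),
        capture_output=True,
        text=True,
    )
    # Tidy exits with 2 when it hits errors it could not fix.
    if result.returncode >= 2:
        raise RuntimeError(result.stderr)
    return result.returncode
```

For example, run_tidy("report.htm", "report_clean.htm") would write the cleaned copy next to the original. Warnings (exit code 1) are normal for Word files, so only an exit code of 2 is treated as a failure here.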
© 1994-2008 andreas.com