Working with HTML

This section contains a number of tips and caveats you may find useful in working with HTML-based documents, such as in a JTextPane printed with J2TextPrinter.

View your HTML in a JTextPane on the screen first.  The Java JTextPane component is not as capable as your favorite browser or HTML generation tool in displaying HTML.  You will find that many tags and subtags are either not supported or don't work the way you'd expect.  You don't need to display your JTextPane on the screen in order to print, but if you are having problems printing HTML documents, try displaying your JTextPane in a JFrame on the screen and see how it handles your HTML.  The vast majority of "printing" problems people have with their HTML documents are really because JTextPane can't properly display them.  If JTextPane can't display it, we can't print it.
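A minimal sketch of such an on-screen check follows.  The class name HtmlPreview, the sample HTML, and the window size are illustrative assumptions and not part of J2TextPrinter; only standard Swing calls are used.

```java
import java.awt.GraphicsEnvironment;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTextPane;
import javax.swing.SwingUtilities;

// Hypothetical helper for checking how JTextPane renders a piece of HTML.
final class HtmlPreview {

    // Build a read-only JTextPane holding the given HTML.
    static JTextPane createPane(String html) {
        JTextPane pane = new JTextPane();
        pane.setContentType("text/html");
        pane.setEditable(false);
        pane.setText(html);
        return pane;
    }

    // Show the pane in a scrollable JFrame; call on the event dispatch thread.
    static void show(JTextPane pane) {
        JFrame frame = new JFrame("JTextPane HTML preview");
        frame.setDefaultCloseOperation(JFrame.DISPOSE_ON_CLOSE);
        frame.getContentPane().add(new JScrollPane(pane));
        frame.setSize(640, 480);
        frame.setVisible(true);
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.err.println("No display available; cannot show the preview.");
            return;
        }
        final String html = "<html><body><h1>Sample</h1>"
                + "<p>Some <b>bold</b> and <i>italic</i> text.</p></body></html>";
        SwingUtilities.invokeLater(() -> show(createPane(html)));
    }
}
```

If the HTML looks wrong in this window, it will most likely look wrong on paper as well, so this is a quick way to separate display problems from printing problems.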

Use the latest JDK.  Sun has been steadily fixing bugs, adding features, and improving performance in their HTML support for JTextPane.  The HTML support under JDK 1.2 was practically unusable, whereas under JDK 1.4 it is acceptable and it is better still under subsequent JDKs.

Check out Sun's Java Bug Parade.  Search for the keywords "HTML", "JTextPane", and/or "print" in Sun's Java Bug Parade and you will get some idea of the many problems and missing features developers have encountered.  However, Bug Parade is also a goldmine of workarounds and good advice that can greatly help you overcome your initial difficulties.

Writing your own HTML works best.  JTextPane has enough capabilities that you can construct fairly elaborate documents and complex layouts if you stay carefully within the features and working implementation of JTextPane HTML.  Results are less predictable if you use HTML pages taken from the web, produced by your favorite HTML tool, or generated by your favorite XML and XSLT generators.  In these cases, you may need to modify the HTML code in order to get it to print reasonably close to what you want.

Avoid putting your entire document inside one big HTML table.  Many page layout programs use HTML tables as their primary page layout tool and will place the entire HTML document inside one cell of a single HTML table in order to control width, alignment, spacing, etc.  While J2TextPrinter can print such a document, it gets no help from Java in determining where page breaks go and instead must systematically test every character in order to perform pagination.  This slows down printing considerably.  For a 20-page HTML document inside a single HTML table cell, it can take about half a minute for the print dialog (or print preview) to come up.

Use relative not absolute font sizes.  Prior to JDK 1.4, absolute font sizes, e.g. <font size=1>, displayed and printed one size too big relative to what you see in browsers (see Bug Parade 4285636).  However, relative font sizes, e.g. <font size=+1>, work and are consistent across all JDKs.  You can also use the HTML tags <big> and <small>.  Here is a guide to using relative font sizes:
    <font size=+3> is approximately 36 point = <big><big><big>
    <font size=+2> is approximately 24 point = <big><big>
    <font size=+1> is approximately 18 point = <big>
    <font size=+0> is approximately 14 point
    <font size=-1> is approximately 12 point = <small>
    <font size=-2> is approximately 10 point = <small><small>
    <font size=-3> is approximately  8 point = <small><small><small>
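The guide above can be captured in a small helper when you generate HTML from Java.  This is only a sketch: the class and method names are hypothetical, and the point sizes are the approximate values from the list above, not values queried from Java.

```java
// Hypothetical helper built from the approximate sizes listed above.
final class RelativeFontSizes {

    // Approximate point sizes for <font size=-3> through <font size=+3>.
    private static final int[] APPROX_POINTS = {8, 10, 12, 14, 18, 24, 36};

    // Approximate point size for a relative size in the range -3..+3.
    static int approxPoints(int relativeSize) {
        if (relativeSize < -3 || relativeSize > 3) {
            throw new IllegalArgumentException("relative size must be in -3..+3");
        }
        return APPROX_POINTS[relativeSize + 3];
    }

    // Wrap text in a relative <font size=...> tag, e.g. <font size=+1>.
    static String fontTag(int relativeSize, String text) {
        approxPoints(relativeSize); // range check only
        String sign = relativeSize >= 0 ? "+" : "";
        return "<font size=" + sign + relativeSize + ">" + text + "</font>";
    }
}
```

For example, RelativeFontSizes.fontTag(1, "Title") produces <font size=+1>Title</font>, which the guide above puts at approximately 18 point.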

If layout doesn't wrap right, try setWYSIWYG(true).  When you change the width of your browser window, the browser tries to reflow the text so that it wraps inside the window.  Likewise, J2TextPrinter will reflow (a copy of) your JTextPane in order to make it fit the available printed page width.  However, HTML tables and images which have width tags or <pre> (pre-formatted) text can't be rewrapped, and this can cause the new layout to not look the way you want.  If so, consider using textPrinter.setWYSIWYG(true) and set the width of your JTextPane to exactly the width you want.  This will tell J2TextPrinter not to reflow your document and instead it will shrink-to-fit your JTextPane as necessary to make it fit the printed page.

Check all your width tags.  J2TextPrinter will shrink-to-fit your JTextPane so that it fits the printed page width.  The minimum width of your JTextPane is often determined by the HTML tables, images, and <pre> (pre-formatted) text it contains.  The widths of these may be implicit or may be given explicitly using width subtags, either in pixels or as a percentage of the screen.  Because tables can contain other tables and images, the overall width of your JTextPane may be a complex calculation.  In addition, width specifications can be self-contradictory, e.g., you can specify an overall table with a width that is more or less than the sum of the widths of its parts.  In this case, most browsers will make some reasonable compromise, but JTextPane can sometimes get very confused.  If you are seeing unexpected layout results, it is a good idea to check all your width tags and make sure the sizes make sense and that they add up properly.

Beware style runs that shift left or right.  This was one of the worst problems with JTextPane, making it difficult to create and print nicely formatted documents with interspersed bold and italic words.  See Bug Parade 4724061 and Bug Parade 4352983.  Note that we have an elaborate but serviceable workaround for this problem; see J2TextPrinter Known Problems.  This bug has finally been fixed in JDK 1.5.

Beware text clipped at right edge.  Sun determined that this and the style run shift problem are the same underlying bug.  See Bug Parade 4352983.  This bug has now been fixed in JDK 1.5.

Only use border=0 or border=2.  Java used to display and print HTML tables either with no border (same as border=0, which is the default), or with border=2 if you specified border=n for any n>0 (see Bug Parade 4174871).  This bug is now fixed in JDK 1.5.

Use the standard Java fonts.  You can specify HTML fonts using the Java core Font names Serif, SansSerif, Monospaced, etc. or using the exact same names as the core fonts used by Java, which on Windows are: Times New Roman, Arial, Courier, Symbol, and WingDings.  For all other designations, Java will substitute SansSerif and as a result, the "Variable Width" font setting common in HTML documents winds up as SansSerif instead of Serif as in the standard browsers.
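One way to stay within these names when generating HTML is to check each font face against Java's logical font names before writing the tag.  The sketch below is an assumption-laden illustration: the class name is hypothetical, it accepts only the five logical names (so it is stricter than necessary, since the Windows core fonts listed above also work), and falling back to Serif is our choice to match what browsers do for "Variable Width".

```java
import java.awt.Font;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that keeps <font face=...> within Java's logical fonts.
final class LogicalFonts {

    // The logical font names defined by java.awt.Font.
    static final List<String> LOGICAL = Arrays.asList(
            Font.SERIF, Font.SANS_SERIF, Font.MONOSPACED,
            Font.DIALOG, Font.DIALOG_INPUT);

    // True if the face is exactly one of the logical font names.
    static boolean isLogical(String face) {
        return LOGICAL.contains(face);
    }

    // Emit a font tag, replacing any other face with Serif (an assumption).
    static String fontTag(String face, String text) {
        String safeFace = isLogical(face) ? face : Font.SERIF;
        return "<font face=\"" + safeFace + "\">" + text + "</font>";
    }
}
```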

Don't use insertIcon to insert an image in HTML.  However, you can easily use an HTML <img src=xxx> tag to accomplish the same thing.  See Bug Parade 4671653.
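As a sketch of the <img> approach, the helper below builds the tag from an image file on disk using its absolute file: URL.  The class name, method names, and the file name in the usage note are hypothetical.

```java
import java.io.File;
import java.net.MalformedURLException;

// Hypothetical helper that embeds an image with an <img> tag
// instead of calling insertIcon on the JTextPane.
final class HtmlImages {

    // Build an <img> tag whose src is the absolute file: URL of the image.
    static String imgTag(File imageFile) {
        try {
            return "<img src=\"" + imageFile.toURI().toURL() + "\">";
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot build a URL for " + imageFile, e);
        }
    }

    // Build a small HTML page containing a caption followed by the image.
    static String page(File imageFile, String caption) {
        return "<html><body><p>" + caption + "</p>"
                + imgTag(imageFile) + "</body></html>";
    }
}
```

The resulting string can be given to a JTextPane whose content type is "text/html", for example with pane.setText(HtmlImages.page(new File("logo.png"), "Company logo")).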

Read HTML file correctly so it can find accompanying images.  See Bug Parade 4294902.
Don't use:
    java.net.URL url = new java.net.URL("file",null,fileName);

Instead use:
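    // File.toURL() is deprecated since Java 6; toURI().toURL() also escapes special characters
    java.net.URL url = new java.io.File(fileName).toURI().toURL();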

Make sure HTML file is completely loaded before printing.  See sample code for J2TextPrinterTestApplication which shows how a custom HTMLEditorKit can be defined for addressing this problem.  This technique is also described in the J2TextPrinter section of this documentation under "Reading HTML from a file".
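For orientation, here is one common way to make JTextPane load a page synchronously with a custom HTMLEditorKit.  It is a sketch and not necessarily the same code as the J2TextPrinterTestApplication sample: the class name and the load method are hypothetical, and it relies on the standard Swing behavior that a negative asynchronous load priority makes setPage() read the document before returning.  Images referenced by the page may still be loaded separately.

```java
import java.io.File;
import java.io.IOException;
import javax.swing.JTextPane;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical editor kit whose documents are loaded synchronously.
final class SynchronousHtmlKit extends HTMLEditorKit {

    @Override
    public Document createDefaultDocument() {
        Document doc = super.createDefaultDocument();
        if (doc instanceof HTMLDocument) {
            // A negative priority tells JEditorPane.setPage() not to
            // load the document on a background thread.
            ((HTMLDocument) doc).setAsynchronousLoadPriority(-1);
        }
        return doc;
    }

    // Load an HTML file into a new JTextPane using the synchronous kit.
    static JTextPane load(File htmlFile) throws IOException {
        JTextPane pane = new JTextPane();
        pane.setEditorKit(new SynchronousHtmlKit());
        pane.setPage(htmlFile.toURI().toURL());
        return pane;
    }
}
```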

If an HTML table border displays but won't print, try setCloningUsed(false).  The cause is that the HTML border tag is dropped by standard Java serialization.  See Bug Parade 4691546.  This is reported as fixed in JDK 1.5.

Don't use <center> if using HTMLEditorKit.  Instead, use either <div align=center> or <p align=center>, which both work OK.  The <center> tag does work if you have the JTextPane read the same HTML from a file using setPage().  See Bug Parade 4671625.
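If the HTML comes from a source you don't control, the substitution can be done in code before the text is handed to JTextPane.  The helper below is a simple sketch with a hypothetical class name; it only handles plain <center> and </center> tags without attributes.

```java
// Hypothetical helper that rewrites <center> as <div align=center>.
final class CenterTagFix {

    // Replace opening and closing center tags, ignoring case.
    static String replaceCenter(String html) {
        return html
                .replaceAll("(?i)<center>", "<div align=center>")
                .replaceAll("(?i)</center>", "</div>");
    }
}
```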

Don't use <META content="text/html"> in <head> section if using setText.  In this case the body of the JTextPane is not displayed.  See Bug Parade 4695909.  This is reported as fixed in JDK 1.5.
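On JDKs where this bug is present, one option is to strip the tag before calling setText.  The sketch below uses a hypothetical class name and, as a simplifying assumption, removes every <meta> tag rather than only the content-type one.

```java
// Hypothetical helper that removes <meta ...> tags before setText is called.
final class MetaTagStripper {

    // Remove all meta tags, ignoring case; other markup is left untouched.
    static String stripMeta(String html) {
        return html.replaceAll("(?i)<meta\\b[^>]*>", "");
    }
}
```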

Eliminate non-standard HTML tags.  Java JTextPane has problems with certain non-standard tags introduced by some HTML editors.  For example, Netscape Composer likes to insert <TBODY>....</TBODY> tags in HTML tables.  JTextPane will display these unknown tags with a special tag indicator graphic in the display (which is enough reason to eliminate them in any event).  When printing, JTextPane is sometimes OK with these tags (and doesn't render the tag indicator graphics in the printed output), but other times these tags cause JTextPane rendering to fail with NullPointerExceptions or other anomalous behavior.  As a general rule we recommend eliminating all HTML tags that JTextPane is unable to display.
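A rough way to find candidate tags is to list every tag name in the HTML for which Swing defines no HTML.Tag constant.  This is only a heuristic sketch with a hypothetical class name: it scans tag names with a regular expression rather than parsing the document, and a tag that Swing knows about may still not render the way you expect.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.text.html.HTML;

// Hypothetical helper that reports tag names Swing's HTML support does not define.
final class UnknownTagFinder {

    // Matches the name of an opening or closing tag, e.g. "<td" or "</TBODY".
    private static final Pattern TAG_NAME =
            Pattern.compile("</?([A-Za-z][A-Za-z0-9]*)");

    // Return the lower-cased names of tags for which HTML.getTag() has no match.
    static Set<String> unknownTags(String html) {
        Set<String> unknown = new TreeSet<>();
        Matcher m = TAG_NAME.matcher(html);
        while (m.find()) {
            String name = m.group(1).toLowerCase(Locale.ROOT);
            if (HTML.getTag(name) == null) {
                unknown.add(name);
            }
        }
        return unknown;
    }
}
```

For a table written as <table><TBODY><tr><td>x</td></tr></TBODY></table>, this reports tbody as the only unknown tag.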

Copyright 2009, Wildcrest Associates (http://www.wildcrest.com )