_____________________________
WRITING HTML
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
A "quick and dirty" guide to creating an HTML document
by: Mårten Lindström

HTML really IS very simple, which may not be immediately evident when
looking at the HTML specification documents. In this presentation I
will skip over most of the theory and limit myself to (most of) the
HTML 2.0 features. If there is interest, I could perhaps write a
further, more in-depth article with full coverage of both HTML 2.0
and 3.2.


HOW TO TURN PLAIN TEXT INTO HTML
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Just take an existing plain ("ascii") text and do the following:

1) Replace all occurrences of
        &   with   &amp;
   including the semicolon.

2) Now, in the same way, replace every
        <   with   &lt;
   and perhaps, to be sure, also
        >   with   &gt;
   (the second replacement is not strictly necessary, so it can be
   left out).

3) Insert a "TITLE" (to be used for window caption by browser) at the
   start of the text:
        <TITLE>Some text for caption of document window</TITLE>
   This is also what a browser will use to determine that the text IS
   an HTML document at all.

4) Insert
        <P>
   before each and every PARAGRAPH of your text. Remember that the
   browser will IGNORE ALL NEWLINES in your source (instead formatting
   the text according to the current window width) and will split your
   text into paragraphs based solely on these <P> tags.

5) If you have used any Atari-specific characters (the ones in the
   second half of the character set - including the British pound sign
   and "non-English" letters) then you must also convert these into
   the "ANSI" (aka ISO 8859-1 aka Latin-1) character set, for instance
   using my ANSIFIER program in Ictari 39.

Done! (Now inspect your text with CAB or HTML-Browser!)

Any newlines and extra spaces (above one between each word) will be
ignored by the browser, so you are free to insert as many as you like
to improve the readability of the plain source text.

Note on start tags and end tags:
Most types of elements, like the title, need BOTH a start and an end
tag (<TITLE> and </TITLE>) while a few, like paragraphs, don't. (It is
enough to start each paragraph with <P> though you optionally also
COULD end it with </P>.) There are even some elements that NEVER have
an end tag, simply because they don't contain any document text - see
<HR> and <IMG> below.

Furthermore, in clean HTML, elements can be contained within each
other but should never overlap. For instance, in order to use both
bold AND italics style on some text you could write:
        <B><I>some text</I></B>      An I element cleanly within B
or
        <I><B>some text</B></I>      A B element cleanly within I
but the following versions, on the other hand
        <B><I>some text</B></I>  and  <I><B>some text</I></B>
are not clean HTML, although most browsers might understand them
anyway.


REFINEMENTS
¯¯¯¯¯¯¯¯¯¯¯

Pre-formatted text
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Instead of the <P> tags preceding every paragraph, you could merely
have preceded the whole text - after the title element - with a <PRE>
tag and followed it with </PRE>. This would suppress automatic
word-wrapping, causing the browser to preserve all spaces and newlines
literally and to use a monospaced font for the text, i.e. to behave
essentially like the familiar old ascii text viewer.

More typically, you would use <PRE> and </PRE> tags only around
selected parts of the text, such as program listings.

Headings
¯¯¯¯¯¯¯¯
To turn a paragraph into a heading, just remove the <P> before it and
instead enclose it in <H1> and </H1>, thus:

        <H1>Some heading in your document</H1>
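
As a sketch only, the steps and refinements so far could combine into
a small complete document such as the following. All the texts in it
are invented for illustration; only the tags come from the
descriptions above.

```html
<TITLE>My first page</TITLE>

<H1>My first page</H1>

<P>This is the first paragraph. Newlines and   extra   spaces
in the source make no difference to the browser.

<P>Special characters are written as entities: AT&amp;T,
and 2 &lt; 3.

<PRE>
Inside PRE,   spaces and
newlines are kept as typed.
</PRE>
```

The two paragraphs carry no end tags, while TITLE, H1 and PRE each
have both a start and an end tag.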

With H2 instead of H1 you will get a smaller heading, H3 gives an even
smaller one, and so on down to H6 for the smallest possible heading.
(A recommendation is not to skip heading levels, i.e. after an H1
don't go down to H3 before you have used H2.)

Lists
¯¯¯¯¯
A list, bulleted or numbered, can be written thus:

        <UL>                                <OL>
          <LI>Text for first item             <LI>Text for first item
          <LI>second item           or        <LI>second item
          <LI>third ... etc.                  <LI>third ... etc.
        </UL>                               </OL>

In the browser this will appear as

          * Text for first item               1. Text for first item
          * second item                       2. second item
          * third ... etc.                    3. third ... etc.
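
Lists may also be nested inside each other. A minimal sketch, with
invented item texts, of a numbered list placed inside one item of a
bulleted list:

```html
<UL>
  <LI>Fruit
    <OL>
      <LI>apples
      <LI>pears
    </OL>
  <LI>Vegetables
</UL>
```

The inner OL sits wholly inside the first LI element, so the elements
are contained within each other without overlapping.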

UL stands for Unordered (i.e. bulleted) List, OL stands for Ordered
(i.e. numbered) List. Each LI (List Item) element could also contain
multiple paragraphs or even sub-lists (but not headings).

The indentation I have used on the <LI> elements is of course purely
for readability of the source text and won't affect how the browser
displays them (they will typically be displayed indented anyway).

Horizontal Rules
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Just insert
        <HR>
where you want a horizontal division line in the text. In monochrome
it will simply be a thin black line, while in colour most browsers
make it appear as a three-dimensional groove (achieved by using two
colours for it: top = dark gray or black, bottom = white; the text
background being not white but light gray).

Images
¯¯¯¯¯¯
An image can be inserted anywhere in the text flow with:
        <IMG SRC="PICTURE.GIF">
(substituting the path and name of your own picture file).

For really GOOD HTML, you should also add, within the IMG tag, an
extra attribute: ALT="Text that is displayed if image not shown".
Note: ALT="" is entirely appropriate to use with pure adornment
images. The ALT text should _NOT_ be a picture DESCRIPTION but an
ALTERNATIVE.

GIF is the most widely recognized picture file format, while JPEG is
understood by most newer browsers (this SHOULD include CAB (?)). Only
now, in October this year (1996), was PNG formally adopted by W3C (the
World Wide Web Consortium), but it will probably replace GIF
eventually.

Hyperlinks
¯¯¯¯¯¯¯¯¯¯
Any image or piece of text could also be made into a hyperlink by
enclosing it in <A HREF="..."> and </A>, with the path of the target
file between the quotes. For instance:
        <A HREF="OTHER.HTM">This is a clickable link</A>
(OTHER.HTM being just an example name).

A link doesn't necessarily have to lead to another HTML file. You
could make links to ANY kind of file, though the browser may not be
able to display it, of course. Plain ("ascii") text files as well as
(GIF) images normally ARE displayed directly by the browser; others
may be passed by the browser to some other program (if a protocol for
this has been established).

Note: When CAB displays a plain text file it treats characters 160-255
as ANSI (like in an HTML file) rather than Atari. Not the ideal
behaviour for an Atari browser, I would say.

-----

More generally, an A element ("Anchor") can be jumped both TO and
FROM, making it possible to jump not only between different files but
also within an HTML file. For an anchor to serve as a starting point,
an HREF attribute must be present, as above. An anchor serving as a
destination must have a NAME attribute, for instance:

        <A NAME="CONCL">Conclusions</A>

To enable a link to this anchor, some other anchor, in the same
document, could be written as:
        <A HREF="#CONCL">See conclusions</A>
Note the '#' character. Links could also be made from other documents
by preceding the '#' with a relative pathname, e.g.:
        <A HREF="OTHER.HTM#CONCL">See conclusions</A>
(CONCL and OTHER.HTM are again just example names.)

Text Styles in HTML
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Enclosing text with <B> and </B> will render it in BOLD type;
similarly <I> and </I> for italics and <TT> and </TT> for a monospaced
(TeleType) font.

However, this TYPOGRAPHIC markup is slightly out of line with the rest
of HTML (at least until anomalies like the FONT element of HTML 3
appeared - that should become obsolete with the expected addition of
STYLE SHEETS). HTML mainly tries to concentrate on the LOGICAL purpose
of the text. And so there is an alternative logical or "idiomatic"
markup system:

  <EM> ... </EM>          for emphasized                          italics
  <STRONG> ... </STRONG>  for strongly emphasized                 bold
  <CITE> ... </CITE>      for book titles etc.                    italics
  <VAR> ... </VAR>        for variables (in syntax descriptions)  italics
  <CODE> ... </CODE>      for some code element                   monospaced
  <KBD> ... </KBD>        for text typed by user (in eg manuals)  monospaced
  <SAMP> ... </SAMP>      for some sample of literal characters   monospaced

In the last column I have listed how browsers typically display these
elements, and, of course, the styles overlap both with each other and
with the typographic markup. So what's the point of all this? Answers:

1) Logical markup allows the browsing software (and the human, if the
   software allows it) to CHOOSE how each element type is to be
   displayed - for instance using different colours.

2) It may simplify automatic processing of the text, e.g. by indexers
   or text analysers.

Still, it should be said that many or most people will probably never
use anything but the typographic markup, because it reminds them of
the secure and old-fashioned word-processor they are used to (and
because the typographic tags admittedly are a little shorter than the
idiomatic ones). Typographic markup is of course also what programs
automatically converting from word-processor file formats will always
have to use.
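
A short sketch of the logical elements in use. The sentences are
invented for illustration, and the exact appearance depends on the
browser:

```html
<!-- Invented text, for illustration only -->
<P>To copy a file, type <KBD>copy</KBD> followed by the
<VAR>source</VAR> and <VAR>destination</VAR> names.
<STRONG>Warning:</STRONG> an existing destination file is
<EM>not</EM> kept. The program replies with
<SAMP>1 file copied</SAMP>.
```

Going by the typical renderings listed above, a browser would show
"copy" and "1 file copied" monospaced, "source", "destination" and
"not" in italics, and "Warning:" in bold.
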
Comments
¯¯¯¯¯¯¯¯
COMMENTS (ignored by the browser) can be inserted anywhere in an HTML
document, enclosed in <!-- and -->. For example:
        <!-- This is a comment -->


PATHS FOR IMAGES AND LINKS
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Just about any familiar old relative DOS path and filename is
acceptable in the <IMG SRC="..."> and <A HREF="..."> tags, EXCEPT that
FORWARD slash characters (/) should be used instead of the DOS
backslashes (\) as separators. Browsers on Atari and PC will probably
understand backslashes too, but e.g. a Unix browser may be more
pleased to see forward slashes.

You should probably also try to use UPPERCASE letters only for your
path and file names, since this is how names of files and folders are
stored by (GEM)DOS. Even though TOS/DOS/Windows are case insensitive,
Unix isn't.

The above remarks are for the event that you transfer your files to
Unix or something similar; besides, you might as well learn proper
HTML (actually URLs = "Web paths") from the start.

It is quite OK to use even the familiar old DOS double-dot ".." for
moving up one folder level. For instance
        ../INDEX.HTML
Paths are counted from where the current HTML document is located.


PROPOSITION FOR ICTARI ARTICLES
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
May I here make the suggestion that pictures and sub-documents of any
Ictari article in HTML format normally be placed in a folder with the
name of the HTML article but with the extension .SUB (or .PIX) instead
of .HTM

For instance a document
        ARTICLE.HTM
would have its sub-documents (and pictures (?)) in the folder
        ARTICLE.SUB

This would tidy the disk directories so that the main .HTM file would
always be easy to find. And, regardless of how many pictures and
sub-documents it refers to, there would in most cases be only two
items - the HTM file and a SUB folder - to deal with during disk
operations such as move or copy.

In order to convert an existing HTML document into this format you
will need to search it for occurrences of the SRC attribute (in IMG
tags) and the HREF attribute (in A tags) and change the given paths
appropriately.
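
As a rough illustration of the proposed layout, assume an article
ARTICLE.HTM with one picture and one sub-document in the folder
ARTICLE.SUB. The names DIAGRAM.GIF and NOTES.HTM, and the texts, are
hypothetical and only serve the sketch. After conversion the tags
might look like this:

```html
<!-- In ARTICLE.HTM (all file names here are hypothetical) -->
<P><IMG SRC="ARTICLE.SUB/DIAGRAM.GIF" ALT="[Diagram]">
<P>See also the <A HREF="ARTICLE.SUB/NOTES.HTM">additional notes</A>.

<!-- In ARTICLE.SUB/NOTES.HTM, a link back to the main document -->
<P><A HREF="../ARTICLE.HTM">Back to the article</A>
```

The paths use forward slashes and uppercase names, and the link back
uses ".." because paths are counted from the folder of the document
containing the link.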