Index: head/en_US.ISO8859-1/books/fdp-primer/doc-build/chapter.xml
===================================================================
--- head/en_US.ISO8859-1/books/fdp-primer/doc-build/chapter.xml (revision 51562)
+++ head/en_US.ISO8859-1/books/fdp-primer/doc-build/chapter.xml (revision 51563)
@@ -1,528 +1,528 @@
The Documentation Build Process

This chapter covers organization of the documentation build
process and how &man.make.1; is used to control it.

Rendering DocBook into Output

Different types of output can be produced from a single
DocBook source file. The type of output desired is set with the
FORMATS variable. A list of known formats is
stored in KNOWN_FORMATS:

&prompt.user; cd ~/doc/en_US.ISO8859-1/books/handbook
&prompt.user; make -V KNOWN_FORMATS

Common Output Formats

FORMATS Value | File Type            | Description
html          | HTML, one file       | A single book.html or article.html.
html-split    | HTML, multiple files | Multiple HTML files, one for each chapter or section, for use on a typical web site.
pdf           | PDF                  | Portable Document Format

The default output format can vary by document, but is
usually html-split. Other formats are chosen
by setting FORMATS to a specific value.
Multiple output formats can be created at a single time by
setting FORMATS to a list of formats.

Build a Single HTML Output File

&prompt.user; cd ~/doc/en_US.ISO8859-1/books/handbook
&prompt.user; make FORMATS=html

Build HTML-Split and PDF Output Files

&prompt.user; cd ~/doc/en_US.ISO8859-1/books/handbook
&prompt.user; make FORMATS="html-split pdf"

The &os; Documentation Build Toolset

These are the tools used to build and install the
FDP documentation.The primary build tool is &man.make.1;, specifically
Berkeley Make.Package building is handled by &os;'s
&man.pkg-create.8;.&man.gzip.1; is used to create compressed versions of
the document. &man.bzip2.1; archives are also supported.
&man.tar.1; is used for package building.&man.install.1; is used to install the
documentation.Understanding Makefiles in the
Documentation TreeThere are three main types of Makefiles
in the &os; Documentation Project tree.Subdirectory
Makefiles simply pass
commands to those directories below them.Documentation
Makefiles describe the
documents that are produced from this
directory.Make
includes are the glue that perform the document
production, and are usually of the form
doc.xxx.mk.

Subdirectory Makefiles

These Makefiles usually take the form
of:

SUBDIR =articles
SUBDIR+=books

COMPAT_SYMLINK = en

DOC_PREFIX?= ${.CURDIR}/..
.include "${DOC_PREFIX}/share/mk/doc.project.mk"

The first four non-empty lines define the &man.make.1;
variables SUBDIR,
COMPAT_SYMLINK, and
DOC_PREFIX.

The first SUBDIR statement and the
COMPAT_SYMLINK statement show how to
assign a value to a variable, overriding any previous
value.The second SUBDIR statement shows how a
value is appended to the current value of a variable. The
SUBDIR variable is now articles
books.The DOC_PREFIX assignment shows how a
value is assigned to the variable, but only if it is not
already defined. This is useful when
DOC_PREFIX is not where this
Makefile expects it to be: the user can
override it and provide the correct value.

What does it all mean? SUBDIR
lists the subdirectories below this one to which the build process
passes work.

COMPAT_SYMLINK is specific to
compatibility symlinks from a language to
its official encoding (doc/en would
point to en_US.ISO8859-1).

DOC_PREFIX is the path to the root of
the &os; Documentation Project tree. This is not always easy
to find, so it can be overridden to allow for
flexibility. .CURDIR is a &man.make.1;
builtin variable with the path to the current
directory.The final line includes the &os; Documentation Project's
project-wide &man.make.1; system file
doc.project.mk which is the glue which
converts these variables into build instructions.Documentation MakefilesThese Makefiles set &man.make.1;
variables that describe how to build the documentation
contained in that directory.

Here is an example:

MAINTAINER=nik@FreeBSD.org

DOC?= book

FORMATS?= html-split html

INSTALL_COMPRESSED?= gz
INSTALL_ONLY_COMPRESSED?=

# SGML content
SRCS= book.xml

DOC_PREFIX?= ${.CURDIR}/../../..

.include "$(DOC_PREFIX)/share/mk/docproj.docbook.mk"

The MAINTAINER variable allows
committers to claim ownership of a document in the &os;
Documentation Project, and take responsibility for maintaining
it.DOC is the name (sans the
.xml extension) of the main document
created by this directory. SRCS lists all
the individual files that make up the document. This should
also include important files in which a change should result
in a rebuild.FORMATS indicates the default formats
that should be built for this document.
INSTALL_COMPRESSED is the default list of
compression techniques that should be used in the document
build. INSTALL_ONLY_COMPRESSED, empty by
default, should be non-empty if only compressed documents are
desired in the build.The DOC_PREFIX and include statements
should be familiar already.&os; Documentation Project
Make Includes&man.make.1; includes are best explained by inspection of
the code. Here are the system include files:doc.project.mk is the main project
include file, which includes all the following include
files, as necessary.doc.subdir.mk handles traversing of
the document tree during the build and install
processes.doc.install.mk provides variables
that affect ownership and installation of documents.doc.docbook.mk is included if
DOCFORMAT is docbook
and DOC is set.
-
+ doc.project.mkBy inspection:DOCFORMAT?= docbook
MAINTAINER?= doc@FreeBSD.org
PREFIX?= /usr/local
PRI_LANG?= en_US.ISO8859-1
.if defined(DOC)
.if ${DOCFORMAT} == "docbook"
.include "doc.docbook.mk"
.endif
.endif
.include "doc.subdir.mk"
.include "doc.install.mk"
-
+ VariablesDOCFORMAT and
MAINTAINER are assigned default values,
if these are not set by the document make file.PREFIX is the prefix under which the
documentation building tools
are installed. For normal package and port installation,
this is /usr/local.PRI_LANG should be set to whatever
language and encoding is natural amongst users these
documents are being built for. US English is the
default.PRI_LANG does not affect which
documents can, or even will, be built. Its main use is
creating links to commonly referenced documents into the
&os; documentation install root.
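
The default-assignment behavior described above can be tried in isolation. The following is a minimal sketch for BSD &man.make.1;, not taken from the FDP tree: the file name defaults.mk and the show target are assumptions made only for illustration.

```make
# defaults.mk: hypothetical stand-alone file, not part of the FDP tree.
# "?=" assigns only when the variable is not already defined, so a value
# from the command line or from an earlier assignment is kept.
DOCFORMAT?=	docbook
MAINTAINER?=	doc@FreeBSD.org

# An earlier plain assignment wins over a later "?=".
PRI_LANG=	de_DE.ISO8859-1
PRI_LANG?=	en_US.ISO8859-1

# Hypothetical helper target that prints the resulting values.
show:
	@echo "DOCFORMAT=${DOCFORMAT}"
	@echo "MAINTAINER=${MAINTAINER}"
	@echo "PRI_LANG=${PRI_LANG}"
```

Running make -f defaults.mk show prints the defaults for DOCFORMAT and MAINTAINER, while make -f defaults.mk MAINTAINER=nik@FreeBSD.org show keeps the value given on the command line. PRI_LANG stays at its earlier value in both cases.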
-
+ ConditionalsThe .if defined(DOC) line is an
example of a &man.make.1; conditional which, as in other
languages, selects behavior depending on whether a condition is true or
false. defined is a function which
returns whether the given variable is defined.

.if ${DOCFORMAT} == "docbook", next,
tests whether the DOCFORMAT variable is
"docbook", and in this case, includes
doc.docbook.mk.The two .endifs close the two above
conditionals, marking the end of their application.
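
The nested conditionals can be exercised outside the documentation tree. This is a sketch under stated assumptions: cond.mk, MESSAGE, and the show target are hypothetical names, and the .else branches are added only to make the outcome visible.

```make
# cond.mk: hypothetical stand-alone file for BSD make.
DOCFORMAT?=	docbook

# Outer test: is DOC defined at all?
.if defined(DOC)
# Inner test: string comparison against the expanded variable.
.if ${DOCFORMAT} == "docbook"
MESSAGE=	${DOC} would be built with the DocBook rules
.else
MESSAGE=	${DOC} uses the unhandled format ${DOCFORMAT}
.endif
.else
MESSAGE=	DOC is not set, so no document rules are included
.endif

# Hypothetical helper target that reports which branch was taken.
show:
	@echo "${MESSAGE}"
```

Compare make -f cond.mk show with make -f cond.mk DOC=book show to see each .endif close its own conditional.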
-
+ doc.subdir.mkThis file is too long to explain in detail. These notes
describe the most important features.
-
+ VariablesSUBDIR is a list of
subdirectories that the build process should go further
down into.ROOT_SYMLINKS is the name of
directories that should be linked to the document
install root from their actual locations, if the current
language is the primary language (specified by
PRI_LANG).COMPAT_SYMLINK is described in
the
Subdirectory Makefile
section.
-
+ Targets and MacrosDependencies are described by
target:
dependency1 dependency2
... tuples, where to build
target, the given
dependencies must be built first.

After that descriptive tuple, instructions on how to
build the target may be given, if the conversion process
between the target and its dependencies is not already
defined, or if this particular conversion differs from
the default conversion method.

A special dependency .USE defines
the equivalent of a macro._SUBDIRUSE: .USE
.for entry in ${SUBDIR}
@${ECHO} "===> ${DIRPRFX}${entry}"
@(cd ${.CURDIR}/${entry} && \
${MAKE} ${.TARGET:S/realpackage/package/:S/realinstall/install/} DIRPRFX=${DIRPRFX}${entry}/ )
.endforIn the above, _SUBDIRUSE is now
a macro which will execute the given commands when it is
listed as a dependency.

What sets this macro apart from other targets?
It is executed after the
instructions given in the build procedure of the target that lists it as a
dependency, and it does not adjust
.TARGET, the variable that
contains the name of the target currently being
built.clean: _SUBDIRUSE
rm -f ${CLEANFILES}In the above, clean will use
the _SUBDIRUSE macro after it has
executed the instruction
rm -f ${CLEANFILES}. In effect, this
causes clean to go further and
further down the directory tree, deleting built files as it
goes down, not on the way back
up.
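
The ordering described above can be observed with a much smaller macro. This is a sketch only: use.mk and _ANNOUNCE are made-up names, and the echo commands stand in for real build steps.

```make
# use.mk: hypothetical stand-alone file for BSD make.
# _ANNOUNCE is a made-up macro name. .USE turns it into a macro whose
# commands are appended to any target that lists it as a dependency.
_ANNOUNCE: .USE
	@echo "finished ${.TARGET}"

build: _ANNOUNCE
	@echo "building"

clean: _ANNOUNCE
	@echo "cleaning"
```

With make -f use.mk clean, the target's own command runs first and the macro's command runs afterwards, with .TARGET still naming clean instead of _ANNOUNCE.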
-
+ Provided Targetsinstall and
package both go down the
directory tree calling the real versions of themselves
in the subdirectories
(realinstall and
realpackage
respectively).clean removes files
created by the build process (and goes down the
directory tree too).
cleandir does the same, and
also removes the object directory, if any.
-
+ More on Conditionalsexists is another condition
function which returns true if the given file
exists.empty returns true if the given
variable is empty.

target returns true if the given
target has already been defined.
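
The three condition functions can be combined in one small file. This is a sketch with hypothetical names (funcs.mk, HAVE_BOOK, EXTRA_SRCS, EXTRA_MSG, show). It follows &man.make.1;, where target() evaluates to true once the named target has been defined, so the fallback rule is wrapped in !target(clean).

```make
# funcs.mk: hypothetical stand-alone file for BSD make.
EXTRA_SRCS?=

# Hypothetical helper target, listed first so it is the default.
show:
	@echo "book.xml present: ${HAVE_BOOK}"
	@echo "${EXTRA_MSG}"

# exists() tests for a file on disk.
.if exists(${.CURDIR}/book.xml)
HAVE_BOOK=	yes
.else
HAVE_BOOK=	no
.endif

# empty() takes the variable name without "${...}".
.if empty(EXTRA_SRCS)
EXTRA_MSG=	no extra sources
.else
EXTRA_MSG=	extra sources: ${EXTRA_SRCS}
.endif

# Provide a fallback "clean" only if nothing has defined one yet.
.if !target(clean)
clean:
	@echo "default clean"
.endif
```

Running make -f funcs.mk EXTRA_SRCS=appendix.xml show switches the empty() branch without editing the file.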
-
+ Looping Constructs in make
(.for).for provides a way to repeat a set
of instructions for each space-separated element in a
variable. It does this by assigning a variable to contain
the current element in the list being examined._SUBDIRUSE: .USE
.for entry in ${SUBDIR}
@${ECHO} "===> ${DIRPRFX}${entry}"
@(cd ${.CURDIR}/${entry} && \
${MAKE} ${.TARGET:S/realpackage/package/:S/realinstall/install/} DIRPRFX=${DIRPRFX}${entry}/ )
.endforIn the above, if SUBDIR is empty, no
action is taken; if it has one or more elements, the
instructions between .for and
.endfor would repeat for every element,
with entry being replaced with the value
of the current element.
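
A reduced loop shows the same expansion without the recursive &man.make.1; call. This is a sketch: loop.mk and the list target are hypothetical, and the SUBDIR values are borrowed from the subdirectory Makefile shown earlier.

```make
# loop.mk: hypothetical stand-alone file for BSD make.
# "=" sets the variable and "+=" appends to it.
SUBDIR=		articles
SUBDIR+=	books

# The loop is expanded when the file is parsed, producing one echo
# command per space-separated element of SUBDIR.
list:
.for entry in ${SUBDIR}
	@echo "===> ${entry}"
.endfor
```

make -f loop.mk list prints one line per subdirectory. If SUBDIR is empty, the target has no commands.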
Index: head/en_US.ISO8859-1/books/fdp-primer/structure/chapter.xml
===================================================================
--- head/en_US.ISO8859-1/books/fdp-primer/structure/chapter.xml (revision 51562)
+++ head/en_US.ISO8859-1/books/fdp-primer/structure/chapter.xml (revision 51563)
@@ -1,305 +1,306 @@
Documentation Directory StructureFiles and directories in the
doc/ tree follow a
structure meant to:Make it easy to automate converting the document to other
formats.Promote consistency between the different documentation
organizations, to make it easier to switch between working on
different documents.Make it easy to decide where in the tree new documentation
should be placed.In addition, the documentation tree must accommodate
documents in many different languages and encodings. It is
important that the documentation tree structure does not enforce
any particular defaults or cultural preferences.The Top Level,
doc/There are two types of directory under
doc/, each with very
specific directory names and meanings.DirectoryUsageshareContains files that are not specific to the various
translations and encodings of the documentation.
Contains subdirectories to further categorize the
information. For example, the files that comprise the
&man.make.1; infrastructure are in
share/mk, while
the additional XML support files
(such as the &os; extended DocBook
DTD) are in share/xml.lang.encodingOne directory exists for each available translation
and encoding of the documentation, for example
en_US.ISO8859-1/
and zh_TW.UTF-8/.
The names are long, but by fully specifying the language
and encoding we prevent any future headaches when a
translation team wants to provide documentation in the
same language but in more than one encoding. This also
avoids problems that might be caused by a future switch
to Unicode.The
lang.encoding/
DirectoriesThese directories contain the documents themselves. The
documentation is split into up to three more categories at
this level, indicated by the different directory names.DirectoryUsagearticlesDocumentation marked up as a DocBook
article (or equivalent). Reasonably
short, and broken up into sections. Normally only
available as one XHTML file.booksDocumentation marked up as a DocBook
book (or equivalent). Book length,
and broken up into chapters. Normally available as both
one large XHTML file (for people with
fast connections, or who want to print it easily from a
browser) and as a collection of linked, smaller
files.manFor translations of the system manual pages. This
directory will contain one or more
manN
directories, corresponding to the sections that have
been translated.Not every lang.encoding
directory will have all of these subdirectories. It depends
on how much translation has been accomplished by that
translation team.Document-Specific InformationThis section contains specific notes about particular
documents managed by the FDP.
-
+ The Handbookbooks/handbook/The Handbook is written in DocBook XML
using the &os; DocBook extended DTD.The Handbook is organized as a DocBook
book. The book is divided into
parts, each of which contains several
chapters. chapters are
further subdivided into sections (sect1)
and subsections (sect2,
sect3) and so on.
-
+ Physical OrganizationThere are a number of files and directories within the
handbook directory.The Handbook's organization may change over time, and
this document may lag in detailing the organizational
changes. Post questions about Handbook organization to the
&a.doc;.
-
+ MakefileThe Makefile defines some
variables that affect how the XML
source is converted to other formats, and lists the
various source files that make up the Handbook. It then
includes the standard doc.project.mk,
to bring in the rest of the code that handles converting
documents from one format to another.
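
As a rough illustration of that description, a Handbook-style Makefile could look like the following. This is a shortened, hypothetical sketch and not the real Handbook Makefile: the chapter list is illustrative only, the variable values are assumptions modeled on the documentation Makefile example in the build chapter, and the fragment only works inside a checked-out doc/ tree.

```make
# Hypothetical, shortened sketch of a Handbook-style Makefile.
# The values and the chapter list are illustrative, not the real ones.
MAINTAINER=	doc@FreeBSD.org

DOC?=		book
FORMATS?=	html-split

# Top-level document first, then one chapter.xml per chapter directory.
SRCS=		book.xml
SRCS+=		introduction/chapter.xml
SRCS+=		basics/chapter.xml
SRCS+=		printing/chapter.xml

DOC_PREFIX?=	${.CURDIR}/../../..

# Pull in the project-wide rules that perform the conversion.
.include "${DOC_PREFIX}/share/mk/doc.project.mk"
```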
-
+ book.xmlThis is the top level document in the Handbook. It
contains the Handbook's DOCTYPE
declaration, as well as the elements that
describe the Handbook's structure.book.xml uses parameter
entities to load in the files with the
.ent extension. These files
(described later) then define general
entities that are used throughout the rest of the
Handbook.
-
+ directory/chapter.xmlEach chapter in the Handbook is stored in a file
called chapter.xml in a separate
directory from the other chapters. Each directory is
named after the value of the id
attribute on the chapter
element.For example, if one of the chapter files
contains:

<chapter id="kernelconfig">
...
</chapter>

Then it will be called
chapter.xml in the
kernelconfig directory. In general,
the entire contents of the chapter are in this one
file.When the XHTML version of the
Handbook is produced, this will yield
kernelconfig.html. This is because
of the id value, and is not related to
the name of the directory.In earlier versions of the Handbook, the files were
stored in the same directory as
book.xml, and named after the value
of the id attribute on the file's
chapter element. Now, it is possible
to include images in each chapter. Images for each
Handbook chapter are stored within share/images/books/handbook.
The localized version of these images should be
placed in the same directory as the XML
sources for each chapter. Namespace collisions are
inevitable, and it is easier to work with several
directories with a few files in them than it is to work
with one directory that has many files in it.A brief look will show that there are many directories
with individual chapter.xml files,
including basics/chapter.xml,
introduction/chapter.xml, and
printing/chapter.xml.Do not name chapters or directories after
their ordering within the Handbook. This ordering can
change as the content within the Handbook is
reorganized. Reorganization should be possible without
renaming files, unless entire chapters are being
promoted or demoted within the hierarchy.The chapter.xml files are not
complete XML documents that can be
built individually. They can only be built
as parts of the whole Handbook.
Index: head/en_US.ISO8859-1/books/fdp-primer/stylesheets/chapter.xml
===================================================================
--- head/en_US.ISO8859-1/books/fdp-primer/stylesheets/chapter.xml (revision 51562)
+++ head/en_US.ISO8859-1/books/fdp-primer/stylesheets/chapter.xml (revision 51563)
@@ -1,74 +1,74 @@
Style SheetsXML is concerned with content, and says
nothing about how that content should be presented to the reader
or rendered on paper. Multiple style sheet
languages have been developed to describe visual layout, including
Extensible Stylesheet Language Transformations
(XSLT), Document Style Semantics and
Specification Language (DSSSL), and Cascading
Style Sheets (CSS).The FDP documents use
XSLT stylesheets to transform DocBook into
XHTML, and then CSS
formatting is applied to the XHTML pages.
Printable output is currently rendered with legacy
DSSSL stylesheets, but this will probably
change in the future.CSSCascading Style Sheets (CSS) are a
mechanism for attaching style information (font, weight, size,
color, and so forth) to elements in an XHTML
document without abusing XHTML to do
so.
-
+ The DocBook DocumentsThe &os; XSLT and
DSSSL stylesheets refer to
docbook.css, which is expected to be
present in the same directory as the XHTML
files. The project-wide CSS file is copied
from doc/share/misc/docbook.css when
documents are converted to XHTML, and is
installed automatically.
Index: head/en_US.ISO8859-1/books/fdp-primer/tools/chapter.xml
===================================================================
--- head/en_US.ISO8859-1/books/fdp-primer/tools/chapter.xml (revision 51562)
+++ head/en_US.ISO8859-1/books/fdp-primer/tools/chapter.xml (revision 51563)
@@ -1,140 +1,140 @@
ToolsSeveral software tools are used to manage the FreeBSD
documentation and render it to different output formats. Some of
these tools are required and must be installed before working
through the examples in the following chapters. Some are
optional, adding capabilities or making the job of creating
documentation less demanding.Required ToolsInstall
textproc/docproj from the
Ports Collection. This meta-port installs
all the applications required to do useful work with the &os;
documentation. Some further notes on particular components are
given below.
-
+ DTDs and
Entities&os; documentation uses several Document Type Definitions
(DTDs) and sets of XML
entities. These are all installed by the
textproc/docproj
port.XHTML DTD
(textproc/xhtml)XHTML is the markup language of
choice for the World Wide Web, and is used throughout
the &os; web site.DocBook DTD (textproc/docbook-xml)DocBook is designed for marking up technical
documentation. Most of the &os; documentation is
written in DocBook.ISO 8879 entities
(textproc/iso8879)Character entities from the ISO 8879:1986 standard
used by many DTDs. Includes named
mathematical symbols, additional characters in the Latin
character set (accents, diacriticals, and so on), and
Greek symbols.Optional ToolsThese applications are not required, but can make working on
the documentation easier or add capabilities.
-
+ SoftwareVim
(editors/vim)A popular editor for working with
XML and derived documents, like
DocBook XML.Emacs or
XEmacs
(editors/emacs or
editors/xemacs)Both of these editors include a special mode for
editing documents marked up according to an
XML DTD. This
mode includes commands to reduce the amount of typing
needed, and help reduce the possibility of
errors.
Index: head/en_US.ISO8859-1/books/fdp-primer/writing-style/chapter.xml
===================================================================
--- head/en_US.ISO8859-1/books/fdp-primer/writing-style/chapter.xml (revision 51562)
+++ head/en_US.ISO8859-1/books/fdp-primer/writing-style/chapter.xml (revision 51563)
@@ -1,598 +1,598 @@
Writing StyleTipsTechnical documentation can be improved by consistent use of
several principles. Most of these can be classified into three
goals: be clear,
be complete, and
be concise. These goals can conflict with
each other. Good writing consists of a balance between
them.Be ClearClarity is extremely important. The reader may be a
novice, or reading the document in a second language. Strive
for simple, uncomplicated text that clearly explains the
concepts.Avoid flowery or embellished speech, jokes, or colloquial
expressions. Write as simply and clearly as possible. Simple
text is easier to understand and translate.Keep explanations as short, simple, and clear as possible.
Avoid empty phrases like in order to, which
usually just means to. Avoid potentially
patronizing words like basically. Avoid Latin
terms like i.e. or cf., which
may be unknown outside of academic or scientific
groups.Write in a formal style. Avoid addressing the reader
as you. For example, say
copy the file to /tmp
rather than you can copy the file to
/tmp.Give clear, correct, tested examples.
A trivial example is better than no example. A good example
is better yet. Do not give bad examples, identifiable by
apologies or sentences like but really it should never
be done that way. Bad examples are worse than no
examples. Give good examples, because even when
warned not to use the example as shown, the
reader will usually just use the example as shown.Avoid weasel words like
should, might,
try, or could. These words
imply that the speaker is unsure of the facts, and
create doubt in the reader.Similarly, give instructions as imperative commands: not
you should do this, but merely
do this.Be CompleteDo not make assumptions about the reader's abilities or
skill level. Tell them what they need to know. Give links to
other documents to provide background information without
having to recreate it. Put yourself in the reader's place,
anticipate the questions they will ask, and answer
them.Be ConciseWhile features should be documented completely, sometimes
there is so much information that the reader cannot easily
find the specific detail needed. The balance between being
complete and being concise is a challenge. One approach is to
have an introduction, then a quick start
section that describes the most common situation, followed by
an in-depth reference section.GuidelinesTo promote consistency between the myriad authors of the
&os; documentation, some guidelines have been drawn up for
authors to follow.Use American English SpellingThere are several variants of English, with different
spellings for the same word. Where spellings differ, use
the American English variant. color, not
colour, rationalize, not
rationalise, and so on.

The use of British English may be accepted in the
case of a contributed article; however, the spelling must
be consistent within the whole document. Other
documents, such as books, the web site, and manual pages,
will have to use American English.Do not use contractionsDo not use contractions. Always spell the phrase out
in full. Don't use contractions is
wrong.Avoiding contractions makes for a more formal tone, is
more precise, and is slightly easier for
translators.Use the serial commaIn a list of items within a paragraph, separate each
item from the others with a comma. Separate the last item
from the others with a comma and the word
and.For example:
This is a list of one, two and three items.
Is this a list of three items, one,
two, and three, or a list of
two items, one and two and
three?It is better to be explicit and include a serial
comma:
This is a list of one, two, and three items.
Avoid redundant phrasesDo not use redundant phrases. In particular,
the command, the file, and
man command are often redundant.For example, commands:Wrong: Use the svn command to
update sources.Right: Use svn to update
sources.Filenames:Wrong: … in the filename
/etc/rc.local…Right: … in
/etc/rc.local…Manual page references (the second example uses
citerefentry with the
&man.csh.1; entity):
Wrong: See man csh for more
information.Right: See &man.csh.1;.Two spaces between sentencesAlways use two spaces between sentences, as it
improves readability and eases use of tools such as
Emacs.A period and spaces followed by a capital letter
does not always mark a new sentence, especially in names.
Jordan K. Hubbard is a good example. It
has a capital H following a period and
a space, and is certainly not a new sentence.For more information about writing style, see Elements of
Style, by William Strunk.Style GuideTo keep the source for the documentation consistent when
many different people are editing it, please follow these style
conventions.
-
+ Letter CaseTags are entered in lower case, para,
not PARA.

Text that appears in SGML contexts is generally written in
upper case, <!ENTITY…>, and
<!DOCTYPE…>,
not <!entity…> and
<!doctype…>.AcronymsAcronyms should be defined the first time they appear in a
document, as in:
Network Time Protocol (NTP).
After the acronym has been defined, use the acronym alone
unless it makes more sense contextually to use the whole term.
Acronyms are usually defined only once per chapter or per
document.All acronyms should be enclosed in
acronym tags.IndentationThe first line in each file starts with no indentation,
regardless of the indentation level of
the file which might contain the current file.Opening tags increase the indentation level by two spaces.
Closing tags decrease the indentation level by two spaces.
Blocks of eight spaces at the start of a line should be
replaced with a tab. Do not use spaces in front of tabs, and
do not add extraneous whitespace at the end of a line.
Content within elements should be indented by two spaces if
the content runs over more than one line.

For example, the source for this section looks like
this:

<chapter>
  <title>...</title>

  <sect1>
    <title>...</title>

    <sect2>
      <title>Indentation</title>

      <para>The first line in each file starts with no indentation,
	<emphasis>regardless</emphasis> of the indentation level of
	the file which might contain the current file.</para>

      ...
    </sect2>
  </sect1>
</chapter>

Tags containing long attributes follow the same
rules. Following the indentation rules in this case helps
editors and writers see which content is inside the
tags:

<para>See the <link
    linkend="gmirror-troubleshooting">Troubleshooting</link>
  section if there are problems booting. Powering down and
  disconnecting the original <filename>ada0</filename> disk
  will allow it to be kept as an offline backup.</para>

<para>It is also possible to journal the boot disk of a &os;
  system. Refer to the article <link
    xlink:href="&url.articles.gjournal-desktop;">Implementing UFS
    Journaling on a Desktop PC</link> for detailed
  instructions.</para>

When an element is too long to fit on the remainder of a
line without wrapping, moving the start tag to the next line
can make the source easier to read. In this example, the
systemitem element has been moved to the
next line to avoid wrapping and indenting:

<para>With file flags, even
  <systemitem class="username">root</systemitem> can be
  prevented from removing or altering files.</para>

Configurations to help various text editors conform to
these guidelines can be found in
.Tag StyleTag SpacingTags that start at the same indent as a previous tag
should be separated by a blank line, and those that are not
at the same indent as a previous tag should not:

<article lang='en'>
  <articleinfo>
    <title>NIS</title>

    <pubdate>October 1999</pubdate>

    <abstract>
      <para>...
	...
	...</para>
    </abstract>
  </articleinfo>

  <sect1>
    <title>...</title>

    <para>...</para>
  </sect1>

  <sect1>
    <title>...</title>

    <para>...</para>
  </sect1>
</article>

Separating Tags

Tags like itemizedlist which will
always have further tags inside them, and in fact do not
take character data themselves, are always on a line by
themselves.Tags like para and
term do not need other tags to contain
normal character data, and their contents begin immediately
after the tag, on the same line.The same applies to when these two types of tags
close.This leads to an obvious problem when mixing these
tags.When a starting tag which cannot contain character data
directly follows a tag of the type that requires other tags
within it to use character data, they are on separate lines.
The second tag should be properly indented.When a tag which can contain character data closes
directly after a tag which cannot contain character data
closes, they co-exist on the same line.Whitespace ChangesDo not commit changes
to content at the same time as changes to
formatting.When content and whitespace changes are kept separate,
translation teams can easily see whether a change was content
that must be translated or only whitespace.For example, if two sentences have been added to a
paragraph so that the line lengths now go
over 80 columns, first commit the change with the too-long
lines. Then fix the line wrapping, and commit this
second change. In the commit message for the second change,
indicate that this is a whitespace-only change that can be
ignored by translators.Non-Breaking SpaceAvoid line breaks in places where they look ugly or make
it difficult to follow a sentence. Line breaks depend on the
width of the chosen output medium. In particular, viewing the
HTML documentation with a text browser can lead to badly
formatted paragraphs like the next one:Data capacity ranges from 40 MB to 15
GB. Hardware compression …

The general entity &nbsp; prohibits
line breaks between parts belonging together. Use
non-breaking spaces in the following places:between numbers and units:57600 bpsbetween program names and version numbers:&os; 9.2between multiword names (use with caution when
applying this to more than 3-4 word names like
The &os; Brazilian Portuguese Documentation
Project):Word ListThis list of words shows the correct spelling and
capitalization when used in &os; documentation. If a word is
not on this list, ask about it on the &a.doc;.

Word | XML Code | Notes
CD-ROM | <acronym>CD-ROM</acronym> |
DoS (Denial of Service) | <acronym>DoS</acronym> |
email | |
file system | |
IPsec | |
Internet | |
manual page | |
mail server | |
name server | |
Ports Collection | |
read-only | |
Soft Updates | |
stdin | <varname>stdin</varname> |
stdout | <varname>stdout</varname> |
stderr | <varname>stderr</varname> |
Subversion | <application>Subversion</application> | Do not refer to the Subversion application as SVN in upper case. To refer to the command, use <command>svn</command>.
&unix; | &unix; |
userland | | things that apply to user space, not the kernel
web server | |
Index: head/en_US.ISO8859-1/books/fdp-primer/xml-primer/chapter.xml
===================================================================
--- head/en_US.ISO8859-1/books/fdp-primer/xml-primer/chapter.xml (revision 51562)
+++ head/en_US.ISO8859-1/books/fdp-primer/xml-primer/chapter.xml (revision 51563)
@@ -1,1415 +1,1415 @@
XML PrimerMost FDP documentation is written with
markup languages based on XML. This chapter
explains what that means, how to read and understand the
documentation source, and the XML techniques
used.Portions of this section were inspired by Mark Galassi's
Get
Going With DocBook.

Overview

In the early days of computing, electronic text was
simple. There were a few character sets like
ASCII or EBCDIC, but that
was about it. Text was text, and what you saw really was what
you got. No frills, no formatting, no intelligence.Inevitably, this was not enough. When text is in a
machine-usable format, machines are expected to be able to use
and manipulate it intelligently. Authors want to indicate that
certain phrases should be emphasized, or added to a glossary, or
made into hyperlinks. Filenames could be shown in a
typewriter style font for viewing on screen, but
as italics when printed, or any of a myriad of
other options for presentation.It was once hoped that Artificial Intelligence (AI) would
make this easy. The computer would read the document and
automatically identify key phrases, filenames, text that the
reader should type in, examples, and more. Unfortunately, real
life has not happened quite like that, and computers still
require assistance before they can meaningfully process
text.More precisely, they need help identifying what is what.
Consider this text:
To remove /tmp/foo, use
&man.rm.1;.&prompt.user; rm /tmp/foo
It is easy to see which parts are filenames, which are
commands to be typed in, which parts are references to manual
pages, and so on. But the computer processing the document
cannot. For this we need markup.Markup is commonly used to describe
adding value or increasing cost.
The term takes on both these meanings when applied to text.
Markup is additional text included in the document,
distinguished from the document's content in some way, so that
programs that process the document can read the markup and use
it when making decisions about the document. Editors can hide
the markup from the user, so the user is not distracted by
it.The extra information stored in the markup
adds value to the document. Adding the
markup to the document must typically be done by a
person—after all, if computers could recognize the text
sufficiently well to add the markup then there would be no need
to add it in the first place. This
increases the cost (the effort required) to
create the document.The previous example is actually represented in this
document like this:
<para>To remove <filename>/tmp/foo</filename>, use &man.rm.1;.</para>
<screen>&prompt.user; <userinput>rm /tmp/foo</userinput></screen>
The markup is clearly separate from the content.Markup languages define what the markup means and how it
should be interpreted.Of course, one markup language might not be enough. A
markup language for technical documentation has very different
requirements than a markup language that is intended for cookery
recipes. This, in turn, would be very different from a markup
language used to describe poetry. What is really needed is a
first language used to write these other markup languages. A
meta markup language.This is exactly what the eXtensible Markup
Language (XML) is. Many markup languages
have been written in XML, including the two
most used by the FDP,
XHTML and DocBook.Each language definition is more properly called a grammar,
vocabulary, schema or Document Type Definition
(DTD). There are various languages to
specify an XML grammar, or
schema.A schema is a
complete specification of all the elements
that are allowed to appear, the order in which they should
appear, which elements are mandatory, which are optional, and so
forth. This makes it possible to write an
XML parser which reads
in both the schema and a document which claims to conform to the
schema. The parser can then confirm whether or not all the
elements required by the vocabulary are in the document in the
right order, and whether there are any errors in the markup.
This is normally referred to as
validating the document.Validation confirms that the choice of
elements, their ordering, and so on, conforms to that listed
in the grammar. It does not check
whether appropriate markup has been used
for the content. If all the filenames in a document were
marked up as function names, the parser would not flag this as
an error (assuming, of course, that the schema defines
elements for filenames and functions, and that they are
allowed to appear in the same place).Most contributions to the Documentation
Project will be content marked up in either
XHTML or DocBook, rather than alterations to
the schemas. For this reason, this book will not touch on how
to write a vocabulary.Elements, Tags, and AttributesAll the vocabularies written in XML share
certain characteristics. This is hardly surprising, as the
philosophy behind XML will inevitably show
through. One of the most obvious manifestations of this
philosophy is that of content and
elements.Documentation, whether it is a single web page, or a lengthy
book, is considered to consist of content. This content is then
divided and further subdivided into elements. The purpose of
adding markup is to name and identify the boundaries of these
elements for further processing.For example, consider a typical book. At the very top
level, the book is itself an element. This book
element obviously contains chapters, which can be considered to
be elements in their own right. Each chapter will contain more
elements, such as paragraphs, quotations, and footnotes. Each
paragraph might contain further elements, identifying content
that was direct speech, or the name of a character in the
story.It may be helpful to think of this as
chunking content. At the very top level is one
chunk, the book. Look a little deeper, and there are more
chunks, the individual chapters. These are chunked further into
paragraphs, footnotes, character names, and so on.Notice how this differentiation between different elements
of the content can be made without resorting to any
XML terms. It really is surprisingly
straightforward. This could be done with a highlighter pen and
a printout of the book, using different colors to indicate
different chunks of content.Of course, we do not have an electronic highlighter pen, so
we need some other way of indicating which element each piece of
content belongs to. In languages written in
XML (XHTML, DocBook, et
al) this is done by means of tags.A tag is used to identify where a particular element starts,
and where the element ends. The tag is not part of
the element itself. Because each grammar was
normally written to mark up specific types of information, each
one will recognize different elements, and will therefore have
different names for the tags.For an element called
element-name the start tag will
normally look like <element-name>.
The corresponding closing tag for this element is </element-name>.Using an Element (Start and End Tags)XHTML has an element for indicating
that the content enclosed by the element is a paragraph,
called p.
<p>This is a paragraph. It starts with the start tag for
the 'p' element, and it will end with the end tag for the 'p'
element.</p>
<p>This is another paragraph. But this one is much shorter.</p>
Some elements have no content. For example, in
XHTML, a horizontal line can be included in
the document. For these empty elements,
XML introduced a shorthand form that is
completely equivalent to the two-tag version:Using an Element Without ContentXHTML has an element for indicating a
horizontal rule, called hr. This element
does not wrap content, so it looks like this:
<p>One paragraph.</p>
<hr></hr>
<p>This is another paragraph. A horizontal rule separates this
from the previous paragraph.</p>
The shorthand version consists of a single tag:
<p>One paragraph.</p>
<hr/>
<p>This is another paragraph. A horizontal rule separates this
from the previous paragraph.</p>
As shown above, elements can contain other elements. In the
book example earlier, the book element contained all the chapter
elements, which in turn contained all the paragraph elements,
and so on.Elements Within Elements; em
<p>This is a simple <em>paragraph</em> where some
of the <em>words</em> have been <em>emphasized</em>.</p>
The grammar consists of rules that describe which elements
can contain other elements, and exactly what they can
contain.People often confuse the terms tags and elements, and use
the terms as if they were interchangeable. They are
not.An element is a conceptual part of your document. An
element has a defined start and end. The tags mark where the
element starts and ends.When this document (or anyone else knowledgeable about
XML) refers to
the p tag
they mean the literal text consisting of the three characters
<, p, and
>. But the phrase
the p element refers to the
whole element.This distinction is very subtle. But
keep it in mind.Elements can have attributes. An attribute has a name and a
value, and is used for adding extra information to the element.
This might be information that indicates how the content should
be rendered, or might be something that uniquely identifies that
occurrence of the element, or it might be something else.An element's attributes are written
inside the start tag for that element, and
take the form
attribute-name="attribute-value".In XHTML, the p
element has an attribute called
align, which suggests an
alignment (justification) for the paragraph to the program
displaying the XHTML.The align attribute can
take one of four defined values, left,
center, right and
justify. If the attribute is not specified
then the default is left.Using an Element with an Attribute
<p align="left">The inclusion of the align attribute
on this paragraph was superfluous, since the default is left.</p>
<p align="center">This may appear in the center.</p>
Some attributes only take specific values, such as
left or justify. Others
allow any value.Single Quotes Around Attributes
<p align='right'>I am on the right!</p>
Attribute values in XML must be enclosed
in either single or double quotes. Double quotes are
traditional. Single quotes are useful when the attribute value
contains double quotes.Information about attributes, elements, and tags is stored
in catalog files. The Documentation Project uses standard
DocBook catalogs and includes additional catalogs for
&os;-specific features. Paths to the catalog files are defined
in an environment variable so they can be found by the document
build tools.
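The relationship between elements, tags, and attributes described in this section can be observed with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is not part of the documentation toolchain and is used here only for illustration. The SAMPLE fragment and the describe helper are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the paragraphs shown in this section.
SAMPLE = (
    '<body>'
    '<p align="center">This may appear in the <em>center</em>.</p>'
    '<hr/>'
    "<p align='right'>I am on the right!</p>"
    '</body>'
)


def describe(element, depth=0):
    """Return one (depth, name, attributes, text) tuple per element."""
    rows = [(depth, element.tag, dict(element.attrib), element.text or "")]
    for child in element:
        rows.extend(describe(child, depth + 1))
    return rows


rows = describe(ET.fromstring(SAMPLE))
for depth, name, attributes, text in rows:
    # Indentation mirrors how deeply each element is nested.
    print("  " * depth + name, attributes, repr(text))
```

The parser reports elements, not tags: the empty hr element appears once whether it was written with two tags or with the shorthand form, and the quoting style of an attribute value is not preserved.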
-
+ To Do…Before running the examples in this document, install
textproc/docproj from
the &os; Ports Collection. This is a
meta-port that downloads and installs
the standard programs and supporting files needed by the
Documentation Project. &man.csh.1; users must use
rehash for the shell to recognize new
programs after they have been installed, or log out
and then log back in again.Create example.xml, and enter
this text:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An Example XHTML File</title>
  </head>

  <body>
    <p>This is a paragraph containing some text.</p>

    <p>This paragraph contains some more text.</p>

    <p align="right">This paragraph might be right-justified.</p>
  </body>
</html>
Try to validate this file using an
XML parser.textproc/docproj
includes the xmllint
validating
parser.Use xmllint to validate the
document:&prompt.user; xmllint --valid --noout example.xmlxmllint returns without displaying
any output, showing that the document validated
successfully.See what happens when required elements are omitted.
Delete the line with the
<title> and
</title> tags, and re-run
the validation.&prompt.user; xmllint --valid --noout example.xml
example.xml:5: element head: validity error : Element head content does not follow the DTD, expecting ((script | style | meta | link | object | isindex)* , ((title , (script | style | meta | link | object | isindex)* , (base , (script | style | meta | link | object | isindex)*)?) | (base , (script | style | meta | link | object | isindex)* , title , (script | style | meta | link | object | isindex)*))), got ()This shows that the validation error comes from the
fifth line of the
example.xml file and that the
content of the head is
the part which does not follow the rules of the
XHTML grammar.Then xmllint shows the line where
the error was found and marks the exact character position
with a ^ sign.Replace the title element.The DOCTYPE DeclarationThe beginning of each document can specify the name of the
DTD to which the document conforms. This
DOCTYPE declaration is used by XML parsers to
identify the DTD and ensure that the document
does conform to it.A typical declaration for a document written to conform with
version 1.0 of the XHTML
DTD looks like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
That line contains a number of different components.
<!
The indicator shows this is an XML declaration.
DOCTYPE
Shows that this is an XML declaration of the document type.
html
Names the first element that will appear in the document.
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
Lists the Formal Public Identifier (FPI) for the DTD to which this document conforms. The XML parser uses this to find the correct DTD when processing this document.
PUBLIC is not a part of the FPI, but indicates to the XML processor how to find the DTD referenced in the FPI. Other ways of telling the XML parser how to find the DTD are shown later.
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
A local filename or a URL to find the DTD.
>
Ends the declaration and returns to the document.
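The components of the DOCTYPE declaration listed above are what a parser actually extracts. As a hedged illustration, the sketch below uses the Expat bindings in Python's standard library (xml.parsers.expat) to report them. Python is an assumption of this example rather than a tool used by the Documentation Project, and read_doctype is a hypothetical helper name. In its default configuration Expat does not fetch the DTD, so no network access is involved.

```python
from xml.parsers import expat

DOCUMENT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><title>An Example XHTML File</title></head>'
    '<body><p>This is a paragraph containing some text.</p></body>'
    '</html>'
)


def read_doctype(document):
    """Return the pieces of the DOCTYPE declaration as a dictionary."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["root"] = name            # first element of the document
        found["public"] = public_id     # the FPI
        found["system"] = system_id     # filename or URL of the DTD
        found["internal_subset"] = bool(has_internal_subset)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
    return found


info = read_doctype(DOCUMENT)
for key in ("root", "public", "system", "internal_subset"):
    print(key, "=", info[key])
```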
-
+ Formal Public Identifiers
(FPIs)It is not necessary to know this, but it is useful
background, and might help debug problems when the
XML processor cannot locate the
DTD.FPIs must follow a specific
syntax:
"Owner//Keyword Description//Language"
Owner
The owner of the FPI.The beginning of the string identifies the owner of
the FPI. For example, the
FPI
"ISO 8879:1986//ENTITIES Greek
Symbols//EN" lists
ISO 8879:1986 as being the owner for
the set of entities for Greek symbols.
ISO 8879:1986 is the International
Organization for Standardization
(ISO) number for the
SGML standard, the predecessor (and a
superset) of XML.Otherwise, this string will either look like
-//Owner
or
+//Owner
(notice the only difference is the leading
+ or -).If the string starts with - then
the owner information is unregistered, with a
+ identifying it as
registered.ISO 9070:1991 defines how
registered names are generated. It might be derived
from the number of an ISO
publication, an ISBN code, or an
organization code assigned according to
ISO 6523. Additionally, a
registration authority could be created in order to
assign registered names. The ISO
council delegated this to the American National
Standards Institute (ANSI).Because the &os; Project has not been registered,
the owner string is -//&os;. As seen
in the example, the W3C are not a
registered owner either.
Keyword
There are several keywords that indicate the type of
information in the file. Some of the most common
keywords are DTD,
ELEMENT, ENTITIES,
and TEXT. DTD is
used only for DTD files,
ELEMENT is usually used for
DTD fragments that contain only
entity or element declarations. TEXT
is used for XML content (text and
tags).
Description
Any description can be given for the contents
of this file. This may include version numbers or any
short text that is meaningful and unique for the
XML system.
Language
An ISO two-character code that
identifies the native language for the file.
EN is used for English.
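The FPI syntax described above is regular enough to take apart with simple string handling. The following is a minimal sketch in Python, chosen only for illustration. The parse_fpi function and the names of the dictionary keys are hypothetical, and the sketch assumes the keyword is the first word of the middle field, as in the examples in this section.

```python
def parse_fpi(fpi):
    """Split an FPI of the form Owner//Keyword Description//Language."""
    parts = fpi.split("//")
    if parts[0] in ("-", "+"):
        # A leading + marks a registered owner, a leading - an unregistered one.
        registration = "registered" if parts[0] == "+" else "unregistered"
        parts = parts[1:]
    else:
        # ISO-owned identifiers carry no +/- prefix.
        registration = "iso"
    if len(parts) != 3:
        raise ValueError("not an FPI: %r" % fpi)
    owner, text_class, language = parts
    keyword, _, description = text_class.partition(" ")
    return {
        "registration": registration,
        "owner": owner,
        "keyword": keyword,
        "description": description,
        "language": language,
    }


xhtml = parse_fpi("-//W3C//DTD XHTML 1.0 Transitional//EN")
greek = parse_fpi("ISO 8879:1986//ENTITIES Greek Symbols//EN")
print(xhtml)
print(greek)
```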
-
+ catalog FilesWith the syntax above, an XML
processor needs to have some way of turning the
FPI into the name of the file containing
the DTD. A catalog file (typically
called catalog) contains lines that map
FPIs to filenames. For example, if the
catalog file contained the line:
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "1.0/transitional.dtd"
The XML processor knows that the
DTD is called
transitional.dtd in the
1.0 subdirectory of the directory that
held catalog.Examine the contents of
/usr/local/share/xml/dtd/xhtml/catalog.xml.
This is the catalog file for the XHTML
DTDs that were installed as part of the
textproc/docproj port.Alternatives to FPIsInstead of using an FPI to indicate the
DTD to which the document conforms (and
therefore, which file on the system contains the
DTD), the filename can be explicitly
specified.The syntax is slightly different:
<!DOCTYPE html SYSTEM "/path/to/file.dtd">
The SYSTEM keyword indicates that the
XML processor should locate the
DTD in a system specific fashion. This
typically (but not always) means the DTD
will be provided as a filename.Using FPIs is preferred for reasons of
portability. If the SYSTEM identifier is
used, then the DTD must be provided and
kept in the same location for everyone.Escaping Back to XMLSome of the underlying XML syntax can be
useful within documents. For example, comments can be included
in the document, and will be ignored by the parser. Comments
are entered using XML syntax. Other uses for
XML syntax will be shown later.XML sections begin with a
<! tag and end with a
>. These sections contain instructions
for the parser rather than elements of the document. Everything
between these tags is XML syntax. The
DOCTYPE
declaration shown earlier is an example of
XML syntax included in the document.CommentsAn XML document may contain comments.
They may appear anywhere as long as they are not inside tags.
They are even allowed in some locations inside the
DTD (e.g., between entity
declarations).XML comments start with the string
<!-- and end with the
string -->.Here are some examples of valid XML
comments:XML Generic Comments<!-- This is inside the comment -->
<!--This is another comment-->
<!-- This is how you
write multiline comments -->
<p>A simple <!-- Comment inside an element's content --> paragraph.</p>XML comments may contain any strings
except --:Erroneous XML Comment<!-- This comment--is wrong -->
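The comment rules above can be checked mechanically. The sketch below is an illustration in Python, which is an assumption of this example and not part of the documentation toolchain. It tests only whether a fragment is well-formed, it does not validate against a DTD, and is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True when the parser accepts text, False on a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    "comment in content": "<p>A simple <!-- a comment --> paragraph.</p>",
    "multiline comment": "<p><!-- This is how you\n write multiline comments --></p>",
    "double hyphen": "<p><!-- This comment--is wrong --></p>",
    "comment inside a tag": "<p <!-- not allowed here --> >text</p>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, accepted in results.items():
    print(name, "->", "accepted" if accepted else "rejected")
```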
-
+ To Do…Add some comments to
example.xml, and check that the file
still validates using xmllint.Add some invalid comments to
example.xml, and see the error
messages that xmllint gives when it
encounters an invalid comment.EntitiesEntities are a mechanism for assigning names to chunks of
content. As an XML parser processes a
document, any entities it finds are replaced by the content of
the entity.This is a good way to have re-usable, easily changeable
chunks of content in XML documents. It is
also the only way to include one marked up file inside another
using XML.There are two types of entities for two different
situations: general entities and
parameter entities.General EntitiesGeneral entities are used to assign names to reusable
chunks of text. These entities can only be used in the
document. They cannot be used in an
XML context.To include the text of a general entity in the document,
include
&entity-name;
in the text. For example, consider a general entity called
current.version which expands to the
current version number of a product. To use it in the
document, write:
<para>The current version of our product is
&current.version;.</para>
When the version number changes, edit the definition of
the general entity, replacing the value. Then reprocess the
document.General entities can also be used to enter characters that
could not otherwise be included in an XML
document. For example, < and
& cannot normally appear in an
XML document. The XML
parser sees the < symbol as the start of
a tag. Likewise, when the & symbol is
seen, the next text is expected to be an entity name.These symbols can be included by using two predefined
general entities: &lt; and
&amp;.General entities can only be defined within an
XML context. Such definitions are usually
done immediately after the DOCTYPE declaration.Defining General Entities<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY current.version "3.0-RELEASE">
<!ENTITY last.version "2.2.7-RELEASE">
]>The DOCTYPE declaration has been extended by adding a
square bracket at the end of the second line. The two
entities are then defined over the next two lines, the
square bracket is closed, and then the DOCTYPE declaration
is closed.The square brackets are necessary to indicate that the
DTD indicated by the DOCTYPE declaration is being
extended.Parameter EntitiesParameter entities, like
general
entities, are used to assign names to reusable chunks
of text. But parameter entities can only be used within an
XML
context.Parameter entity definitions are similar to those for
general entities. However, parameter entities are included
with
%entity-name;.
The definition also includes the % between
the ENTITY keyword and the name of the
entity.For a mnemonic, think
Parameter entities use the
Percent symbol.Defining Parameter Entities<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY % param.some "some">
<!ENTITY % param.text "text">
<!ENTITY % param.new "%param.some; more %param.text;">
<!-- %param.new now contains "some more text" -->
]>
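The expansion of general entities, including the predefined ones for < and &, can be watched directly. The sketch below is an illustration using Python's standard xml.etree.ElementTree module, which is an assumption of this example and not a tool used by the Documentation Project. It declares the two general entities from the earlier example in an internal subset and lets the parser replace the references.

```python
import xml.etree.ElementTree as ET

# The entity names match the "Defining General Entities" example; the
# para element and its wording are made up for this illustration.
DOCUMENT = """<!DOCTYPE para [
<!ENTITY current.version "3.0-RELEASE">
<!ENTITY last.version "2.2.7-RELEASE">
]>
<para>&current.version; replaces &last.version; (escape &lt; and &amp; in text).</para>"""

root = ET.fromstring(DOCUMENT)

# By the time the application sees the text, every entity reference has
# already been replaced by the parser.
print(root.text)
```

Changing the value in the ENTITY declaration and parsing again changes every place the entity is referenced, which is the property that makes general entities useful for text such as version numbers.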
-
+ To Do…Add a general entity to
example.xml.<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY version "1.1">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An Example XHTML File</title>
  </head>

  <!-- There may be some comments in here as well -->

  <body>
    <p>This is a paragraph containing some text.</p>

    <p>This paragraph contains some more text.</p>

    <p align="right">This paragraph might be right-justified.</p>

    <p>The current version of this document is: &version;</p>
  </body>
</html>
Validate the document using
xmllint.Load example.xml into a web
browser. It may have to be copied to
example.html before the browser
recognizes it as an XHTML
document.Older browsers with simple parsers may not render this
file as expected. The entity reference
&version; may not be replaced by
the version number, or the XML context
closing ]> may not be recognized and
instead shown in the output.The solution is to normalize the
document with an XML normalizer. The
normalizer reads valid XML and writes
equally valid XML which has been
transformed in some way. One way the normalizer
transforms the input is by expanding all the entity
references in the document, replacing the entities with
the text that they represent.xmllint can be used for this. It
also has an option to drop the initial
DTD section so that the closing
]> does not confuse browsers:&prompt.user; xmllint --noent --dropdtd example.xml > example.htmlA normalized copy of the document with entities
expanded is produced in example.html,
ready to load into a web browser.Using Entities to Include FilesBoth
general and
parameter
entities are particularly useful for including one file inside
another.Using General Entities to Include FilesConsider some content for an XML book
organized into files, one file per chapter, called
chapter1.xml,
chapter2.xml, and so forth, with a
book.xml that will contain these
chapters.In order to use the contents of these files as the values
for entities, they are declared with the
SYSTEM keyword. This directs the
XML parser to include the contents of the
named file as the value of the entity.Using General Entities to Include Files<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY chapter.1 SYSTEM "chapter1.xml">
<!ENTITY chapter.2 SYSTEM "chapter2.xml">
<!ENTITY chapter.3 SYSTEM "chapter3.xml">
<!-- And so forth -->
]>
<html xmlns="http://www.w3.org/1999/xhtml">
<!-- Use the entities to load in the chapters -->
&chapter.1;
&chapter.2;
&chapter.3;
</html>
When using general entities to include other files
within a document, the files being included
(chapter1.xml,
chapter2.xml, and so on)
must not start with a DOCTYPE
declaration. This is a syntax error because entities are
low-level constructs and they are resolved before any
parsing happens.
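How a parser pulls an external file in through a general entity can be sketched with the Expat bindings in Python's standard library. This is an illustration under stated assumptions, not the mechanism used by the documentation build: Python itself, the expand function, and the FILES dictionary (which stands in for chapter files on disk) are all hypothetical. Expat does not read external entities unless the application supplies a handler, so the sketch supplies one.

```python
from xml.parsers import expat

# Hypothetical stand-ins for files on disk, keyed by system identifier.
# Note that neither "file" starts with a DOCTYPE declaration.
FILES = {
    "chapter1.xml": "<p>This is chapter one.</p>",
    "chapter2.xml": "<p>This is chapter two.</p>",
}

BOOK = """<!DOCTYPE html [
<!ENTITY chapter.1 SYSTEM "chapter1.xml">
<!ENTITY chapter.2 SYSTEM "chapter2.xml">
]>
<html>&chapter.1;&chapter.2;</html>"""


def expand(document, files):
    """Parse document, resolving external entities from the files mapping."""
    tags, text = [], []

    def attach(parser):
        parser.StartElementHandler = lambda name, attrs: tags.append(name)
        parser.CharacterDataHandler = text.append
        parser.ExternalEntityRefHandler = load
        return parser

    def load(context, base, system_id, public_id):
        # A child parser processes the entity's replacement text in place.
        child = attach(top.ExternalEntityParserCreate(context))
        child.Parse(files[system_id], True)
        return 1  # nonzero tells Expat the entity was handled

    top = attach(expat.ParserCreate())
    top.Parse(document, True)
    return tags, "".join(text)


tags, text = expand(BOOK, FILES)
print(tags)
print(text)
```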
-
+ Using Parameter Entities to Include FilesParameter entities can only be used inside an
XML context. Including a file in an
XML context can be used
to ensure that general entities are reusable.Suppose that there are many chapters in the document, and
these chapters were reused in two different books, each book
organizing the chapters in a different fashion.The entities could be listed at the top of each book, but
that quickly becomes cumbersome to manage.Instead, place the general entity definitions inside one
file, and use a parameter entity to include that file within
the document.Using Parameter Entities to Include FilesPlace the entity definitions in a separate file
called chapters.ent and
containing this text:<!ENTITY chapter.1 SYSTEM "chapter1.xml">
<!ENTITY chapter.2 SYSTEM "chapter2.xml">
<!ENTITY chapter.3 SYSTEM "chapter3.xml">Create a parameter entity to refer to the contents
of the file. Then use the parameter entity to load the file
into the document, which will then make all the general
entities available for use. Then use the general entities
as before:<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!-- Define a parameter entity to load in the chapter general entities -->
<!ENTITY % chapters SYSTEM "chapters.ent">
<!-- Now use the parameter entity to load in this file -->
%chapters;
]>
<html xmlns="http://www.w3.org/1999/xhtml">
&chapter.1;
&chapter.2;
&chapter.3;
</html>
-
+ To Do…
-
+ Use General Entities to Include FilesCreate three files, para1.xml,
para2.xml, and
para3.xml.Put content like this in each file:
<p>This is the first paragraph.</p>
Edit example.xml so that it
looks like this:<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY version "1.1">
<!ENTITY para1 SYSTEM "para1.xml">
<!ENTITY para2 SYSTEM "para2.xml">
<!ENTITY para3 SYSTEM "para3.xml">
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An Example XHTML File</title>
  </head>

  <body>
    <p>The current version of this document is: &version;</p>

    &para1;
    &para2;
    &para3;
  </body>
</html>
Produce example.html by
normalizing example.xml.&prompt.user; xmllint --dropdtd --noent example.xml > example.htmlLoad example.html into the web
browser and confirm that the
paran.xml
files have been included in
example.html.
-
+ Use Parameter Entities to Include FilesThe previous steps must be completed before this
step.Edit example.xml so that it
looks like this:<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [
<!ENTITY % entities SYSTEM "entities.ent"> %entities;
]>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An Example XHTML File</title>
  </head>

  <body>
    <p>The current version of this document is: &version;</p>

    &para1;
    &para2;
    &para3;
  </body>
</html>
Create a new file called
entities.ent with this
content:<!ENTITY version "1.1">
<!ENTITY para1 SYSTEM "para1.xml">
<!ENTITY para2 SYSTEM "para2.xml">
<!ENTITY para3 SYSTEM "para3.xml">Produce example.html by
normalizing example.xml.&prompt.user; xmllint --dropdtd --noent example.xml > example.htmlLoad example.html into the web
browser and confirm that the
paran.xml
files have been included in
example.html.Marked SectionsXML provides a mechanism to indicate that
particular pieces of the document should be processed in a
special way. These are called
marked sections.Structure of a Marked Section<![KEYWORD[
Contents of marked section
]]>As expected of an XML construct, a marked
section starts with <!.The first square bracket begins the marked section.KEYWORD describes how this marked
section is to be processed by the parser.The second square bracket indicates the start of the
marked section's content.The marked section is finished by closing the two square
brackets, and then returning to the document context from the
XML context with
>.Marked Section KeywordsCDATAThese keywords denote the marked section's
content model, and allow you to change
it from the default.When an XML parser is processing a
document, it keeps track of the
content model.The content model describes the
content the parser is expecting to see and what it will do
with that content.The CDATA content model is one of the
most useful.CDATA is for
Character Data. When the parser is in this
content model, it expects to see only characters. In this
model the < and
& symbols lose their special status,
and will be treated as ordinary characters.When using CDATA in examples of
text marked up in XML, remember that
the content of CDATA is not validated.
The included text must be checked by other means. For
example, the content could be written in another document,
validated, and then pasted into the
CDATA section.Using a CDATA Marked
Section
<para>Here is an example of how to include some text that contains
many <literal>&lt;</literal> and <literal>&amp;</literal>
symbols. The sample text is a fragment of
<acronym>XHTML</acronym>. The surrounding text (para and
programlisting) are from DocBook.</para>

<programlisting><![CDATA[<p>This is a sample that shows some of the
elements within <acronym>XHTML</acronym>. Since the angle
brackets are used so many times, it is simpler to say the whole
example is a CDATA marked section than to use the entity names for
the left and right angle brackets throughout.</p>

<ul>
  <li>This is a listitem</li>
  <li>This is a second listitem</li>
  <li>This is a third listitem</li>
</ul>

<p>This is the end of the example.</p>]]></programlisting>
INCLUDE and IGNORE
When the keyword is INCLUDE, then the
contents of the marked section will be processed. When the
keyword is IGNORE, the marked section
is ignored and will not be processed. It will not appear in
the output.Using INCLUDE and
IGNORE in Marked Sections<![INCLUDE[
This text will be processed and included.
]]>
<![IGNORE[
This text will not be processed or included.
]]>By itself, this is not too useful. Text to be
removed from the document could be cut out, or wrapped
in comments.It becomes more useful when controlled by
parameter
entities, yet this usage is limited
to entity files.For example, suppose that documentation is produced in
a hard-copy version and an electronic version, and some extra
text is wanted in the electronic version that should
not appear in the hard copy.Create an entity file that defines general entities to
include each chapter and guard these definitions with a
parameter entity that can be set to either
INCLUDE or IGNORE to
control whether the entity is defined. After these
conditional general entity definitions, place one more
definition for each general entity to set them to an empty
value. This technique relies on the fact that entity
definitions cannot be overridden: the first definition
always takes effect. The inclusion of the chapter is therefore
controlled by the corresponding parameter entity. Set to
INCLUDE, the first general entity
definition will be read and the second one will be ignored.
Set to IGNORE, the first definition will
be ignored and the second one will take effect.Using a Parameter Entity to Control a Marked
Section<!ENTITY % electronic.copy "INCLUDE">
<![%electronic.copy;[
<!ENTITY chap.preface SYSTEM "preface.xml">
]]>
<!ENTITY chap.preface "">When producing the hard-copy version, change the
parameter entity's definition to:<!ENTITY % electronic.copy "IGNORE">
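The rule this technique depends on, that the first definition of an entity wins, can be demonstrated in isolation. The sketch below is an illustration in Python, which is an assumption of this example. It does not parse a marked section at all: the effect of the guard is only imitated by leaving the first definition in or out when the document string is assembled. The render function and the preface wording are hypothetical.

```python
import xml.etree.ElementTree as ET


def render(guard):
    """Imitate the guard: guard is "INCLUDE" or "IGNORE"."""
    guarded = ""
    if guard == "INCLUDE":
        # Stands in for the definition inside <![%electronic.copy;[ ... ]]>.
        guarded = '<!ENTITY chap.preface "Preface for the electronic copy.">\n'
    document = (
        "<!DOCTYPE book [\n"
        + guarded
        # Fallback definition; ignored when an earlier one exists.
        + '<!ENTITY chap.preface "">\n'
        "]>\n"
        "<book>&chap.preface;</book>"
    )
    return ET.fromstring(document).text or ""


electronic = render("INCLUDE")
hard_copy = render("IGNORE")
print(repr(electronic))
print(repr(hard_copy))
```

With the guarded definition present, the later empty definition has no effect. With it absent, the empty definition is the first one seen, so the reference expands to nothing.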
-
+ To Do…Modify entities.ent to
contain the following:<!ENTITY version "1.1">
<!ENTITY % conditional.text "IGNORE">
<![%conditional.text;[
<!ENTITY para1 SYSTEM "para1.xml">
]]>
<!ENTITY para1 "">
<!ENTITY para2 SYSTEM "para2.xml">
<!ENTITY para3 SYSTEM "para3.xml">Normalize example.xml
and notice that the conditional text is not present in the
output document. Set the parameter entity
guard to INCLUDE and regenerate the
normalized document and the text will appear again.
This method makes sense when several
conditional chunks depend on the same condition, for
example to control whether printed or online
text is generated.ConclusionThat is the conclusion of this XML
primer. For reasons of space and complexity, several things
have not been covered in depth (or at all). However, the
previous sections cover enough XML to
introduce the organization of the FDP
documentation.