XML (eXtensible Markup Language) is a data format
for structured document interchange on the Web. It is a
standard defined by the World Wide Web Consortium
(W3C). Information about XML and related technologies
can be found at http://www.w3.org/XML/.
This PHP extension implements support for James
Clark's expat in PHP.
This toolkit lets you parse, but not validate, XML
documents. It supports three source character encodings
also provided by PHP:
US-ASCII, ISO-8859-1 and
UTF-8.
UTF-16 is not supported.
This extension lets you create XML
parsers and then define handlers for different XML
events. Each XML parser also has a few parameters
you can adjust.
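As a minimal sketch of the create-then-register workflow described above, the following script creates a parser, attaches element and character data handlers, and parses a short document. The sample document and the `$events` array are illustrative assumptions, not part of the extension.

```php
<?php
// Collect parser events in order so the handler calls are easy to inspect.
// $events and the sample document are hypothetical, for illustration only.
$events = [];

$parser = xml_parser_create('UTF-8');

// Start and end element handlers.
xml_set_element_handler(
    $parser,
    function ($parser, string $name, array $attrs) use (&$events) {
        $events[] = "start:$name";
    },
    function ($parser, string $name) use (&$events) {
        $events[] = "end:$name";
    }
);

// Character data handler; whitespace-only chunks are skipped.
xml_set_character_data_handler(
    $parser,
    function ($parser, string $data) use (&$events) {
        if (trim($data) !== '') {
            $events[] = "text:$data";
        }
    }
);

// The third argument marks this as the final chunk of the document.
$ok = xml_parse($parser, '<note><to>Ann</to></note>', true);

print_r($events);
```

The handlers are closures here only to keep the sketch self-contained; named functions work the same way.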
This extension uses
expat, which can be found at
http://www.jclark.com/xml/. The Makefile that comes
with expat does not build a library by default; you can
use this make rule for that:
libexpat.a: $(OBJS)
	ar -rc $@ $(OBJS)
	ranlib $@
These functions are enabled by default, using the
bundled expat library. You can disable XML support with
--disable-xml. If you compile
PHP as a module for Apache 1.3.9 or later, PHP will
automatically use the bundled expat library from Apache. If
you do not want to use the bundled expat library,
configure PHP with
--with-expat-dir=DIR, where DIR should point to
the base installation directory of expat.
The Windows version of PHP
has built-in support for this extension. You do not
need to load any additional extension in order to use
these functions.
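The constants below are defined by this extension, and are only available when the extension has either been compiled into PHP or dynamically loaded at runtime.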
The XML event handlers defined are:
Table 1. Supported XML handlers
The element handler functions may get their
element names case-folded.
Case-folding is defined by the XML standard as "a
process applied to a sequence of characters, in which
those identified as non-uppercase are replaced by their
uppercase equivalents". In other words, when it comes
to XML, case-folding simply means uppercasing.
By default, all the element names that are passed
to the handler functions are case-folded. This
behaviour can be queried and controlled per XML parser
with the
xml_parser_get_option() and
xml_parser_set_option() functions,
respectively.
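The effect of the case-folding option can be sketched as follows. The helper `elementNames()` and the sample document are hypothetical names introduced for this illustration; the option constant and the get/set functions are the ones named above.

```php
<?php
// Hypothetical helper: returns the element names a start handler receives,
// with case-folding switched on or off for that parser.
function elementNames(string $xml, bool $fold): array
{
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $fold);

    $names = [];
    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, string $name) {
            // End tags are not needed for this sketch.
        }
    );
    xml_parse($parser, $xml, true);

    return $names;
}

$xml = '<Doc><Item/></Doc>';

$folded   = elementNames($xml, true);   // names are uppercased
$unfolded = elementNames($xml, false);  // names keep their original case

// Querying the option on a fresh parser shows the default setting.
$default = (bool) xml_parser_get_option(xml_parser_create('UTF-8'), XML_OPTION_CASE_FOLDING);

print_r($folded);
print_r($unfolded);
```

Since XML element names are case-sensitive, switching case-folding off is usually the safer choice when names must be matched exactly.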
The following constants are defined for XML error
codes (as returned by
xml_get_error_code()):
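A minimal sketch of how an error code is typically obtained and turned into a message, assuming a deliberately malformed sample document. The exact numeric code and message text depend on the underlying parser library, so the sketch does not rely on a specific value.

```php
<?php
// The sample document is malformed on purpose: <b> is never closed.
$parser = xml_parser_create('UTF-8');

$ok = xml_parse($parser, '<a><b></a>', true);

$code = 0;
$message = '';
$line = 0;

if (!$ok) {
    // xml_parse() only signals success or failure; the details come from
    // the xml_get_*() functions.
    $code    = xml_get_error_code($parser);
    $message = xml_error_string($code);
    $line    = xml_get_current_line_number($parser);

    printf("XML error %d (%s) at line %d\n", $code, $message, $line);
}
```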
PHP's XML extension supports the Unicode
character set through different
character encodings. There are two types of
character encodings, source
encoding and target
encoding. PHP's internal representation of the
document is always encoded with
UTF-8.
Source encoding is applied when an XML document is parsed. When creating an XML
parser, a source encoding can be specified (this
encoding cannot be changed later in the XML parser's
lifetime). The supported source encodings are ISO-8859-1,
US-ASCII and UTF-8. The
first two are single-byte encodings, which means that
each character is represented by a single byte. UTF-8 can encode characters
composed of a variable number of bits (up to 21) in one
to four bytes. The default source encoding used by PHP
is ISO-8859-1.
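The variable width of UTF-8 can be checked directly, since strlen() counts bytes rather than characters. The sample characters below are chosen for illustration, and the `"\u{...}"` escape syntax assumes PHP 7 or later.

```php
<?php
// Byte length of single characters once encoded as UTF-8.
$samples = [
    'A'         => "A",          // U+0041, fits in 7 bits
    'e-acute'   => "\u{E9}",     // U+00E9
    'euro sign' => "\u{20AC}",   // U+20AC
    'emoji'     => "\u{1F600}",  // U+1F600, outside the Basic Multilingual Plane
];

$lengths = [];
foreach ($samples as $label => $char) {
    $lengths[$label] = strlen($char);
    printf("%-10s %d byte(s)\n", $label, $lengths[$label]);
}
```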
Target encoding is applied when PHP passes data to
XML handler functions. When an XML parser is created,
the target encoding is set to the same as the source
encoding, but this may be changed at any point. The
target encoding will affect character data as well as
tag names and processing instruction targets.
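As a small sketch of the behaviour just described, the target encoding can be read right after the parser is created and then changed through the parser options. The variable names are illustrative.

```php
<?php
// Create a parser with an explicit encoding, then inspect and change
// the target encoding through the parser options.
$parser = xml_parser_create('ISO-8859-1');

$initial = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);

xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
$changed = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);

echo "$initial -> $changed\n";
```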
If the XML parser encounters characters outside
the range that its source encoding is capable of
representing, it will return an error.
If PHP encounters characters in the parsed XML
document that cannot be represented in the chosen
target encoding, the problem characters will be
"demoted". Currently, this means that such characters
are replaced by a question mark.
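The demotion can be observed with a sketch like the following, which assumes a UTF-8 document containing a euro sign (U+20AC, not representable in ISO-8859-1) and a target encoding of ISO-8859-1. The sample document and the `$text` variable are illustrative.

```php
<?php
// Parse UTF-8 input, but ask for handler data in ISO-8859-1.
$parser = xml_parser_create('UTF-8');
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'ISO-8859-1');

// Character data may arrive in several chunks, so it is accumulated.
$text = '';
xml_set_character_data_handler(
    $parser,
    function ($parser, string $data) use (&$text) {
        $text .= $data;
    }
);

// "\u{20AC}" is the euro sign; this escape syntax assumes PHP 7 or later.
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><price>10 \u{20AC}</price>";
xml_parse($parser, $xml, true);

echo $text, "\n";
```

Choosing UTF-8 as the target encoding avoids this loss, since every character the parser accepts can be represented in it.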
Here are some example PHP scripts parsing XML
documents.
This first example displays the structure of the
start elements in a document with indentation.
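A minimal sketch of such a script follows. The function name `outlineElements()` and the inline sample document are assumptions made for this illustration; the sketch tracks nesting depth in the start and end element handlers and indents each start tag name accordingly.

```php
<?php
// Hypothetical helper: returns one line per start element, indented by
// two spaces for each level of nesting.
function outlineElements(string $xml): string
{
    $parser = xml_parser_create('UTF-8');
    $depth  = 0;
    $out    = '';

    xml_set_element_handler(
        $parser,
        // Start tag: emit the name at the current depth, then go one level deeper.
        function ($parser, string $name, array $attrs) use (&$depth, &$out) {
            $out .= str_repeat('  ', $depth) . $name . "\n";
            $depth++;
        },
        // End tag: come back up one level.
        function ($parser, string $name) use (&$depth) {
            $depth--;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(
            xml_error_string(xml_get_error_code($parser))
        );
    }

    return $out;
}

$outline = outlineElements('<book><chapter><para/><para/></chapter><appendix/></book>');
echo $outline;
```

Element names appear in uppercase because case-folding is enabled by default.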
This example highlights XML code. It illustrates
how to use an external entity reference handler to
include and parse other documents, as well as how PIs
can be processed, and a way of determining "trust"
for PIs containing code.
XML documents that can be used for this example
are found below the example (xmltest.xml and
xmltest2.xml.)
This file is included from
xmltest.xml: