Creates a new THtmlDocument object.
THtmlDocument():new( [<cHtmlDocument>] ) --> oTHtmlDocument
Function THtmlDocument() creates the object and method :new() initializes it.
The THtmlDocument() class provides objects for reading and creating HTML files and streams. HTML stands for Hyper Text Markup Language, the standard file format for documents published on the Internet. To learn more about HTML itself, good free tutorials are available online; the website www.w3schools.com is a good place to quickly learn the basics of HTML.
A THtmlDocument object maintains an entire HTML document and builds from it a tree of THtmlNode() objects which contain the actual HTML data. The first HTML node is stored in the :root instance variable, which is the root node of the HTML tree. Beginning with the root node, an HTML document can be traversed or searched for particular data. The classes THtmlIteratorScan() and THtmlIteratorRegEx() are available to find a particular HTML node, based on its tag name, attribute or textual content.
Besides the root node, a THtmlDocument object has two standard nodes: :head and :body.
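The following is a minimal sketch of how the node tree might be searched with an iterator. The HTML string is made up for illustration, and the iterator calls (:new() taking the document, :find() taking a tag name, :next() returning NIL when no further node matches) are assumptions about THtmlIteratorScan(); check its own reference page before relying on them.

```xharbour
// Sketch only: lists every <a> tag found in a small document.
// The iterator method signatures used here are assumed, not confirmed.
PROCEDURE Main
   LOCAL cString  := '<p>Visit <a href="http://www.xHarbour.com">xHarbour</a></p>'
   LOCAL oHtmlDoc := THtmlDocument():new( cString )
   LOCAL oIter    := THtmlIteratorScan():new( oHtmlDoc )
   LOCAL oNode

   // find the first <a> node, then step through the remaining matches
   oNode := oIter:find( "a" )
   DO WHILE oNode != NIL
      ? oNode:getText(""), oNode:href
      oNode := oIter:next()
   ENDDO

   THtmlCleanup()
RETURN
```

For a document whose structure is known in advance, direct access through :body (for example oHtmlDoc:body:a) is shorter; an iterator is probably the better fit when the position of the wanted tag is not known.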
| See also: | THtmlCleanup(), THtmlInit(), THtmlNode(), TIpClientHttp() |
| Category: | HTML functions, Object functions, xHarbour extensions |
| Source: | tip\thtml.prg |
| LIB: | xhb.lib |
| DLL: | xhbdll.dll |
Creating a simple HTML page
// The example creates an HTML document from a simple HTML string
PROCEDURE Main
   LOCAL cString  := "<p>Hello <p>world"
   LOCAL oHtmlDoc := THtmlDocument():new( cString )

   ? oHtmlDoc:toString()

   ** output
   // <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
   // <html>
   // <head>
   // </head>
   // <body>
   // <p>Hello
   // <p>world
   // </body>
   // </html>

   THtmlCleanUp()
RETURN
Loading an HTML page from the internet
// The example loads an HTML page from Google and lists
// the links contained in the HTML response.
PROCEDURE Main
   LOCAL oHttp, cHtml, hQuery, oHtmlDoc, oNode, aLink

   oHttp := TIpClientHttp():new( "http://www.google.de/search" )

   // build the Google query
   hQuery := Hash()
   hSetCaseMatch( hQuery, .F. )
   hQuery["q"]    := "xHarbour"
   hQuery["hl"]   := "en"
   hQuery["btnG"] := "Google+Search"

   // add query data to the TUrl object
   oHttp:oUrl:addGetForm( hQuery )

   // Connect to the HTTP server
   IF .NOT. oHttp:open()
      ? "Connection error:", oHttp:lastErrorMessage()
      QUIT
   ENDIF

   // download the Google response
   cHtml := oHttp:readAll()
   oHttp:close()
   ? Len(cHtml), "bytes received "

   oHtmlDoc := THtmlDocument():new( cHtml )
   oHtmlDoc:writeFile( "Google.html" )

   // ":a" retrieves the first <a href="url"> text </a> tag
   oNode := oHtmlDoc:body:a
   ? oNode:getText(""), oNode:href

   // ":divs(5)" returns the 5th <div> tag
   oNode := oHtmlDoc:body:divs(5)

   // "aS" is the plural of "a" and returns all <a href="url"> tags
   aLink := oNode:aS
   FOR EACH oNode IN aLink
      ? HtmlToOem( oNode:getText("") ), oNode:href
   NEXT
RETURN