Creates a new THtmlDocument object.
THtmlDocument():new( [<cHtmlDocument>] ) --> oTHtmlDocument
Function THtmlDocument() creates the object and method :new() initializes it.
The THtmlDocument() class provides objects for reading and creating HTML files and streams. HTML stands for Hyper Text Markup Language, which is the standard file format for documents published on the internet. To learn more about HTML itself, free online tutorials are widely available. The website www.w3schools.com is a good place to quickly learn the basics of HTML.
A THtmlDocument object maintains an entire HTML document and builds from it a tree of THtmlNode() objects which contain the actual HTML data. The first HTML node is stored in the :root instance variable, which is the root node of the HTML tree. Beginning with the root node, an HTML document can be traversed or searched for particular data. The classes THtmlIteratorScan() and THtmlIteratorRegEx() are available to find a particular HTML node, based on its tag name, attribute or textual content.
Besides the root node, a THtmlDocument object has two standard nodes, :head and :body.
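The traversal described above can be sketched as follows. This is a minimal illustration, not a definitive usage pattern: it assumes that THtmlIteratorScan() accepts a THtmlDocument object in :new(), that its :find() and :next() methods return NIL when no further node matches, and that THtmlNode objects expose an :htmlTagName variable, as described in the THtmlIteratorScan() and THtmlNode() references. The input string is made up for the example.

```xharbour
// Sketch only: verify the iterator and node members against the
// THtmlIteratorScan() and THtmlNode() references for your build.
PROCEDURE Main
   LOCAL oHtmlDoc := THtmlDocument():new( "<p>Hello</p><p>world</p>" )
   LOCAL oIter, oNode

   // The three standard entry points into the node tree
   ? oHtmlDoc:root:htmlTagName
   ? oHtmlDoc:head:htmlTagName
   ? oHtmlDoc:body:htmlTagName

   // Scan the tree for all <p> nodes (assumed: :find() positions the
   // iterator on the first match, :next() advances to the next one)
   oIter := THtmlIteratorScan():new( oHtmlDoc )
   oNode := oIter:find( "p" )

   DO WHILE oNode != NIL
      ? oNode:getText( "" )
      oNode := oIter:next()
   ENDDO

   THtmlCleanUp()
RETURN
```

An iterator is convenient when the position of a node in the tree is not known in advance. When the position is known, starting from :head or :body directly is usually shorter.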
See also: | THtmlCleanup(), THtmlInit(), THtmlNode(), TIpClientHttp() |
Category: | HTML functions , Object functions , xHarbour extensions |
Source: | tip\thtml.prg |
LIB: | xhb.lib |
DLL: | xhbdll.dll |
Creating a simple HTML page
// The example creates a HTML document from a simple HTML string

   PROCEDURE Main
      LOCAL cString  := "<p>Hello <p>world"
      LOCAL oHtmlDoc := THtmlDocument():new( cString )

      ? oHtmlDoc:toString()

      ** output
      // <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
      // <html>
      // <head>
      // </head>
      // <body>
      // <p>Hello
      // <p>world
      // </body>
      // </html>

      THtmlCleanUp()
   RETURN
Loading a HTML page from the internet
// The example loads a HTML page from Google and lists
// the links contained in the HTML response.

   PROCEDURE Main
      LOCAL oHttp, cHtml, hQuery, oHtmlDoc, oNode, aLink

      oHttp := TIpClientHttp():new( "http://www.google.de/search" )

      // build the Google query
      hQuery := Hash()
      hSetCaseMatch( hQuery, .F. )

      hQuery["q"]    := "xHarbour"
      hQuery["hl"]   := "en"
      hQuery["btnG"] := "Google+Search"

      // add query data to the TUrl object
      oHttp:oUrl:addGetForm( hQuery )

      // Connect to the HTTP server
      IF .NOT. oHttp:open()
         ? "Connection error:", oHttp:lastErrorMessage()
         QUIT
      ENDIF

      // download the Google response
      cHtml := oHttp:readAll()
      oHttp:close()
      ? Len(cHtml), "bytes received "

      oHtmlDoc := THtmlDocument():new( cHtml )
      oHtmlDoc:writeFile( "Google.html" )

      // ":a" retrieves the first <a href="url"> text </a> tag
      oNode := oHtmlDoc:body:a
      ? oNode:getText(""), oNode:href

      // ":divs(5)" returns the 5th <div> tag
      oNode := oHtmlDoc:body:divs(5)

      // "aS" is the plural of "a" and returns all <a href="url"> tags
      aLink := oNode:aS

      FOR EACH oNode IN aLink
         ? HtmlToOem( oNode:getText("") ), oNode:href
      NEXT
   RETURN