TokenInit()

Initializes the environment for the incremental tokenizer.

Syntax

TokenInit( [@<cString>]  , ;
           [<cDelimiter>], ;
           [<nSkipWidth>], ;
           [@<cTokenEnv>]  ) --> lSuccess

Arguments

@<cString>
This is a character string to be tokenized. Tokens are retrieved from the string with TokenNext(). Note that <cString> must be passed by reference. If <cString> is omitted, the global tokenizer environment is reset.
<cDelimiter>
This character string holds a list of characters recognized as delimiters between tokens. The default list of delimiters consists of the non-printable characters with the ASCII codes 0, 9, 10, 13, 26, 32, 138 and 141, and the following punctuation characters: ,.;:!?/\<>()#&%+-*
<nSkipWidth>
This optional numeric value defaults to zero, which causes the incremental tokenizer to find empty tokens. To suppress this behavior, set <nSkipWidth> to the length of the longest delimiter.
@<cTokenEnv>
If this parameter is passed by reference, it receives a character string holding a local environment for the incremental tokenizer. This character string must then be passed to other functions of the tokenizer, like TokenNext().
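
The call forms below are a minimal sketch of how the four parameters combine, based only on the syntax shown above. The variable names cText and cEnv and the sample string are illustrative assumptions, not part of the original documentation.

   PROCEDURE Main
      LOCAL cText := "alpha,beta;gamma"   // sample data (assumption)
      LOCAL cEnv                          // receives a local environment

      // global environment, default delimiter list
      TokenInit( @cText )

      // global environment, custom delimiter list
      TokenInit( @cText, ",;" )

      // local environment; <nSkipWidth> keeps its default because
      // the third argument is left empty
      TokenInit( @cText, ",;", , @cEnv )

      // no <cString>: reset the global environment
      TokenInit()

      // release the global environment again
      TokenExit()
   RETURN

Note that the empty third argument is required whenever <cTokenEnv> is passed without an explicit <nSkipWidth>.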

Return

The function returns .T. (true) when the environment for the incremental tokenizer is successfully initialized, otherwise .F. (false) is returned.

Description

TokenInit() initializes the environment of the incremental tokenizer of xHarbour. In contrast to the Clipper CA-Tools, xHarbour maintains one global tokenizer environment and any number of local tokenizer environments. A local environment is created by passing a fourth parameter by reference to TokenInit(); <cTokenEnv> then receives the local tokenizer environment. When local tokenizer environments are used, the functions SaveToken() and RestToken() are no longer needed.
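
As a minimal sketch of why local environments make SaveToken() and RestToken() unnecessary, the following code walks two strings in step, each with its own environment. The variable names and sample data are assumptions chosen for illustration.

   PROCEDURE Main
      LOCAL cKeys   := "a,b,c"   // sample data (assumption)
      LOCAL cValues := "1,2,3"   // sample data (assumption)
      LOCAL cKeyEnv, cValEnv

      // each call creates an independent local environment
      TokenInit( @cKeys  , ",", , @cKeyEnv )
      TokenInit( @cValues, ",", , @cValEnv )

      // both strings are tokenized in the same loop
      DO WHILE .NOT. TokenEnd( @cKeyEnv ) .AND. .NOT. TokenEnd( @cValEnv )
         ? TokenNext( @cKeys, , @cKeyEnv ), "=", TokenNext( @cValues, , @cValEnv )
      ENDDO
   RETURN

With the global environment alone, the same loop would have to save and restore the tokenizer state around every call.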

Note: when TokenInit() is called without a fourth parameter passed by reference, no local tokenizer environment is created and the global tokenizer environment is initialized instead. The memory resources for the global tokenizer environment must be released afterwards with TokenExit().
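
A minimal sketch of this lifecycle, assuming a comma-separated sample string (cList is a hypothetical name): the global environment is initialized, consumed and then released.

   PROCEDURE Main
      LOCAL cList := "one,two,three"   // sample data (assumption)

      // no fourth parameter: the global environment is used
      IF TokenInit( @cList, "," )
         DO WHILE .NOT. TokenEnd()
            ? TokenNext( @cList )
         ENDDO

         // release the memory held by the global environment
         TokenExit()
      ELSE
         ? "Tokenizer environment could not be initialized"
      ENDIF
   RETURN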

Info

See also: HB_ATokens(), RestToken(), SaveToken(), TokenEnd(), TokenNext(), TokenExit()
Category: CT:String manipulation, Character functions, Token functions
Source: ct\token2.c
LIB: xhb.lib
DLL: xhbdll.dll

Examples

Using the global tokenizer environment

// The example calculates line and word count of a text file
// using the global tokenizer environment. To determine the word
// count, each line is tokenized in function WordCount(). To
// accomplish this, the global tokenizer environment is saved
// and restored.

   #define CRLF   Chr(13)+Chr(10)

   PROCEDURE Main
      LOCAL cText  := MemoRead( "Textfile.txt" )
      LOCAL cToken
      LOCAL aLines := {}
      LOCAL nLines := 0
      LOCAL nWords := 0

      // initialize global tokenizer environment
      TokenInit( @cText, CRLF, 2 )

      DO WHILE .NOT. TokenEnd()
         cToken := TokenNext( @cText )

         IF cToken == ""
            // one blank space is an empty line for AChoice()
            cToken := " "
         ENDIF

         nLines ++
         nWords += WordCount( @cToken )

         AAdd( aLines, cToken )
      ENDDO

      // release global tokenizer environment
      TokenExit()

      // display the text file
      AChoice( ,,,, aLines )

      CLS
      ? "Line count:", nLines
      ? "Word count:", nWords
   RETURN


   FUNCTION WordCount( cText )
      LOCAL cSave  := SaveToken()
      LOCAL nWords := 0

      TokenInit( @cText, " ,.!?" )

      DO WHILE .NOT. TokenEnd()
         TokenNext( @cText )
         nWords ++
      ENDDO

      RestToken( cSave )

   RETURN nWords

 

Using a local tokenizer environment

// This example does the same, but takes advantage of local tokenizer
// environments. The performance is about 20% better compared to the
// global tokenizer environment, since SaveToken() and RestToken() are
// not needed.

   #define CRLF   Chr(13)+Chr(10)

   PROCEDURE Main
      LOCAL cText  := MemoRead( "Textfile.txt" )
      LOCAL cToken, cTokenEnv
      LOCAL aLines := {}
      LOCAL nLines := 0
      LOCAL nWords := 0

      // initialize local tokenizer environment
      TokenInit( @cText, CRLF, 2, @cTokenEnv )

      DO WHILE .NOT. TokenEnd( @cTokenEnv )
         cToken := TokenNext( @cText, , @cTokenEnv )

         IF cToken == ""
            // one blank space is an empty line for AChoice()
            cToken := " "
         ENDIF

         nLines ++
         nWords += WordCount( @cToken )

         AAdd( aLines, cToken )
      ENDDO

      // display the text file
      AChoice( ,,,, aLines )

      CLS
      ? "Line count:", nLines
      ? "Word count:", nWords
   RETURN


   FUNCTION WordCount( cText )
      LOCAL cTokenEnv
      LOCAL nWords := 0

      // <cTokenEnv> is the fourth parameter; <nSkipWidth> is left empty
      TokenInit( @cText, " ,.!?", , @cTokenEnv )

      DO WHILE .NOT. TokenEnd( @cTokenEnv )
         TokenNext( @cText, , @cTokenEnv )
         nWords ++
      ENDDO

   RETURN nWords

Copyright © 2006-2007 xHarbour.com Inc. All rights reserved.
http://www.xHarbour.com