[All Platforms] - String Tokeniser (OOP Paradigm)
Posted: Sun Dec 04, 2011 6:07 pm
Hey again,
Here's a little piece of code that gives the programmer a String Tokeniser (Tokenizer for our non-Brits) interface: an OOP-style way of tokenising strings.
Because it can hold multiple delimiters at once, this 'Object' or 'Class' offers something a little more advanced than StringField().
I encourage you to rip this apart, and offer suggestions, advice and constructive criticism... after all, a better solution to the problem is a better solution. 
Instantiation
- IStringTokeniser_New() - Create a new instance of the tokeniser.
- IStringTokeniser_ReleaseAll() - Release ALL instances of the tokeniser.
Methods
- AddDelimiter(Delimiter$) - Add a delimiter to the list. You can specify multiple delimiters per single call of this method.
- RemoveDelimiter(Delimiter$) - Remove a delimiter from the list. You can specify multiple delimiters per single call of this method.
- IsDelimiter(Delimiter$) - Check whether the passed Delimiter$ (a single character) is a defined delimiter.
- ParseString(String$) - Parse a string through the tokeniser. Returns number of tokens found.
- FirstToken() - Get the first found token. Returns an empty string if no tokens found.
- NextToken() - Get the next found token. Returns an empty string if no more tokens exist.
- TokenPosition() - Get the starting position in the parsed string of the current token.
- Release() - Release this instance of the tokeniser.
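For comparison, here's a rough sketch of what you get from the built-in StringField() when a string mixes two separators. The sample text and variable names are made up for illustration; only StringField() and CountString() are real PureBasic functions. It splits on the one delimiter it is given, so the ";" stays inside the second field, which is the gap AddDelimiter() is meant to fill.

```purebasic
; Contrast sketch: StringField() splits on a single delimiter only.
; "Text", "FieldCount" and "Index" are illustrative names, not part of the tokeniser.
EnableExplicit

Define.s Text = "alpha,beta;gamma"
Define.i FieldCount = CountString(Text, ",") + 1 ; one more field than commas
Define.i Index

For Index = 1 To FieldCount
  Debug StringField(Text, Index, ",")
Next
; The second field comes back as "beta;gamma", because ";" is not a separator here.
```

With the tokeniser, calling AddDelimiter(",;") once should split the same text into three tokens.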
Code:
; ----------------------------------------------------------------------------------------------------
; Title: String Tokeniser (OOP Interface)
; Description: Interface that provides a String Tokeniser.
; Author(s): Michael R. King (mrking2910@gmail.com)
; Revision: 1
; Support: Cross-Platform
;
; Notes: Instances are stored in a Linked List. Use *Object\Release() to free the allocation.
;
; ----------------------------------------------------------------------------------------------------
EnableExplicit

CompilerIf Defined(_PBI_ISTRINGTOKENISER_, #PB_Constant) = #False
#_PBI_ISTRINGTOKENISER_ = #True

; - Interface Definition -
Interface IStringTokeniser
  AddDelimiter(Delimiter$)    ; - Add a delimiter to the list. You can specify multiple delimiters per single call of this method.
  RemoveDelimiter(Delimiter$) ; - Remove a delimiter from the list. You can specify multiple delimiters per single call of this method.
  IsDelimiter.a(Delimiter$)   ; - Check if the passed Delimiter$ (Single Character) is a defined delimiter.
  ParseString.l(String$)      ; - Parse a string through the tokeniser. Returns number of tokens found.
  FirstToken.s()              ; - Get the first found token. Returns an empty string if no tokens found.
  NextToken.s()               ; - Get the next found token. Returns an empty string if no more tokens exist.
  TokenPosition.l()           ; - Get the starting position in the parsed string of the current token.
  Release()                   ; - Release this instance of the tokeniser.
EndInterface

; - Structures -
Structure __IStringTokeniser_Token
  m_token.s    ; Token text
  m_location.l ; 1-based start position in the parsed string
EndStructure

; - Instance Data -
Structure __IStringTokeniser_Instance
  *m_funcTable                        ; Must stay the first field: interface calls look up the method table through it
  *m_cToken.__IStringTokeniser_Token  ; Current token, or #Null if there is none
  List m_delimiter.s()
  List m_token.__IStringTokeniser_Token()
EndStructure

; - Instantiation List -
Global NewList __IStringTokeniser_InstanceList.__IStringTokeniser_Instance()

; - Instantiation -
Procedure.i IStringTokeniser_New()
  ; AddElement() returns the address of the new element, or 0 on failure.
  Protected *obj.__IStringTokeniser_Instance = AddElement(__IStringTokeniser_InstanceList())
  If *obj
    With *obj
      \m_funcTable = ?__IStringTokeniser_FuncTable
      \m_cToken = #Null
    EndWith
  EndIf
  ProcedureReturn *obj
EndProcedure

Procedure IStringTokeniser_ReleaseAll()
  ForEach __IStringTokeniser_InstanceList()
    With __IStringTokeniser_InstanceList()
      FreeList(\m_delimiter())
      FreeList(\m_token())
    EndWith
  Next
  ClearList(__IStringTokeniser_InstanceList())
EndProcedure

Procedure __IStringTokeniser_Release(*obj.__IStringTokeniser_Instance)
  ; Find this instance in the global list and remove it.
  ForEach __IStringTokeniser_InstanceList()
    If @__IStringTokeniser_InstanceList() = *obj
      With *obj
        FreeList(\m_delimiter())
        FreeList(\m_token())
      EndWith
      DeleteElement(__IStringTokeniser_InstanceList())
      Break ; *obj is gone now, so stop searching
    EndIf
  Next
EndProcedure

; - Object Methods -
Procedure __IStringTokeniser_AddDelimiter(*obj.__IStringTokeniser_Instance, Delimiter$)
  Protected *this.IStringTokeniser = *obj
  Protected dCnt.l, dIx.l, d.s
  With *obj
    dCnt = Len(Delimiter$)
    If dCnt > 0
      If dCnt > 1
        ; Several characters passed: add each one as its own delimiter.
        For dIx = 1 To dCnt
          d = Mid(Delimiter$, dIx, 1)
          *this\AddDelimiter(d)
        Next
      Else
        ; Single character: add it only if it is not already in the list.
        If *this\IsDelimiter(Delimiter$) = #False
          AddElement(\m_delimiter())
          \m_delimiter() = Delimiter$
        EndIf
      EndIf
    EndIf
  EndWith
EndProcedure

Procedure __IStringTokeniser_RemoveDelimiter(*obj.__IStringTokeniser_Instance, Delimiter$)
  Protected *this.IStringTokeniser = *obj
  Protected dCnt.l, dIx.l, d.s
  With *obj
    dCnt = Len(Delimiter$)
    If dCnt > 0
      If dCnt > 1
        ; Several characters passed: remove each one in turn.
        For dIx = 1 To dCnt
          d = Mid(Delimiter$, dIx, 1)
          *this\RemoveDelimiter(d)
        Next
      Else
        ForEach \m_delimiter()
          If \m_delimiter() = Delimiter$
            DeleteElement(\m_delimiter())
            ProcedureReturn
          EndIf
        Next
      EndIf
    EndIf
  EndWith
EndProcedure

Procedure.a __IStringTokeniser_IsDelimiter(*obj.__IStringTokeniser_Instance, Delimiter$)
  Protected *this.IStringTokeniser = *obj
  With *obj
    ForEach \m_delimiter()
      If \m_delimiter() = Delimiter$
        ProcedureReturn #True
      EndIf
    Next
  EndWith
  ProcedureReturn #False
EndProcedure

Procedure.l __IStringTokeniser_ParseString(*obj.__IStringTokeniser_Instance, String$)
  Protected *this.IStringTokeniser = *obj
  Protected cCnt.l, cIx.l, c.s, tCnt.l, t.s, tLoc.l
  tLoc = -1
  With *obj
    ClearList(\m_token())
    \m_cToken = #Null ; the old current token was just freed, so forget it
    cCnt = Len(String$)
    ; Run one step past the end so the final token gets flushed.
    For cIx = 1 To cCnt + 1
      c = Mid(String$, cIx, 1)
      If *this\IsDelimiter(c) Or cIx = (cCnt + 1)
        ; End of a token: store it unless it is empty.
        If Len(t) > 0
          AddElement(\m_token())
          \m_token()\m_token = t
          \m_token()\m_location = tLoc
        EndIf
        t = ""
        tLoc = -1
      Else
        ; Remember where the token started, then keep collecting characters.
        If tLoc = -1
          tLoc = cIx
        EndIf
        t = t + c
      EndIf
    Next
    ProcedureReturn ListSize(\m_token())
  EndWith
EndProcedure

Procedure.s __IStringTokeniser_FirstToken(*obj.__IStringTokeniser_Instance)
  Protected *this.IStringTokeniser = *obj
  With *obj
    If FirstElement(\m_token())
      \m_cToken = @\m_token()
      ProcedureReturn \m_cToken\m_token
    Else
      \m_cToken = #Null
    EndIf
    ProcedureReturn ""
  EndWith
EndProcedure

Procedure.s __IStringTokeniser_NextToken(*obj.__IStringTokeniser_Instance)
  Protected *this.IStringTokeniser = *obj
  With *obj
    If NextElement(\m_token())
      \m_cToken = @\m_token()
      ProcedureReturn \m_cToken\m_token
    Else
      \m_cToken = #Null
    EndIf
    ProcedureReturn ""
  EndWith
EndProcedure

Procedure.l __IStringTokeniser_TokenPosition(*obj.__IStringTokeniser_Instance)
  Protected *this.IStringTokeniser = *obj
  With *obj
    If \m_cToken <> #Null
      ProcedureReturn \m_cToken\m_location
    EndIf
    ProcedureReturn -1 ; no current token
  EndWith
EndProcedure

; - Method Table -
; The entries must stay in the same order as the methods in Interface IStringTokeniser.
DataSection
  __IStringTokeniser_FuncTable:
  Data.i @__IStringTokeniser_AddDelimiter()
  Data.i @__IStringTokeniser_RemoveDelimiter()
  Data.i @__IStringTokeniser_IsDelimiter()
  Data.i @__IStringTokeniser_ParseString()
  Data.i @__IStringTokeniser_FirstToken()
  Data.i @__IStringTokeniser_NextToken()
  Data.i @__IStringTokeniser_TokenPosition()
  Data.i @__IStringTokeniser_Release()
EndDataSection
CompilerEndIf ;_PBI_ISTRINGTOKENISER_
; -------------------------------------------------------------------------------------
; - DEMONSTRATION CODE - DEMONSTRATION CODE - DEMONSTRATION CODE - DEMONSTRATION CODE -
; -------------------------------------------------------------------------------------
; - Define Test String -
Define.s TestString = "Hello World. This is a test string!"

; - Configure Tokeniser -
Define *Tokeniser.IStringTokeniser = IStringTokeniser_New()
*Tokeniser\AddDelimiter(" ")

; - Parse String -
Debug "Parsing String: '" + TestString + "'."
*Tokeniser\ParseString(TestString)

; - Output Results -
Define.s Token = *Tokeniser\FirstToken()
While Token <> ""
  Debug "Token: '" + Token + "' found at: [" + Str(*Tokeniser\TokenPosition()) + "]."
  Token = *Tokeniser\NextToken()
Wend

; - Release Tokeniser -
*Tokeniser\Release()
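
The demo above only registers a space, so here's a second rough sketch that exercises the multi-delimiter calls. It assumes it is appended to the same source file, after the listing, so the interface and IStringTokeniser_New() are already defined; the names *Words, WordCount and Word are just illustrative.

```purebasic
; Sketch only: relies on the IStringTokeniser listing above being in the same file.
Define *Words.IStringTokeniser = IStringTokeniser_New()

*Words\AddDelimiter(" .!") ; registers " ", "." and "!" in a single call

Define.l WordCount = *Words\ParseString("Hello World. This is a test string!")
Debug "Tokens found: " + Str(WordCount) ; 7 for this input

Define.s Word = *Words\FirstToken()
While Word <> ""
  Debug Word + " @ " + Str(*Words\TokenPosition())
  Word = *Words\NextToken()
Wend

*Words\RemoveDelimiter(".") ; "." counts as ordinary text again
If *Words\IsDelimiter(".") = #False
  Debug "'.' is no longer a delimiter"
EndIf

*Words\Release()
```

Since punctuation is now a delimiter too, "World" should come back without its trailing full stop until the "." is removed again.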

Thanks

Alternatively, a Procedural version of this tokeniser can be found Here
Note - With OOP being a pretty taboo thing in the land of PureBasic, it is entirely your choice to use this code. Please don't open a debate about To OOP or Not To OOP.
