cppexpose  1.0.0.b785e04f23b8
C++ library for type introspection, reflection, and scripting interface
cppexpose::Tokenizer Class Reference

Text parser tool that converts a text buffer into a stream of tokens.

#include <cppexpose/include/cppexpose/base/Tokenizer.h>

Classes

struct  Lookahead
 Token information from lookahead.
 
struct  Token
 Token.
 

Public Types

enum  Option {
  OptionParseStrings = 1, OptionParseNumber = 2, OptionParseBoolean = 4, OptionParseNull = 8,
  OptionCStyleComments = 16, OptionCppStyleComments = 32, OptionShellStyleComments = 64, OptionIncludeComments = 128
}
 Parser options.
 
enum  TokenType {
  TokenEndOfStream, TokenWhitespace, TokenComment, TokenStandalone,
  TokenString, TokenNumber, TokenBoolean, TokenNull,
  TokenSingleChar, TokenWord
}
 Token types.
 

Public Member Functions

 Tokenizer ()
 Constructor.
 
 ~Tokenizer ()
 Destructor.
 
unsigned int options () const
 Get parsing options.
 
void setOptions (unsigned int options)
 Set parsing options.
 
bool hasOption (Option option) const
 Check if a specific parsing option is set.
 
const std::string & whitespace () const
 Get whitespace characters.
 
void setWhitespace (const std::string &whitespace)
 Set whitespace characters.
 
const std::string & quotationMarks () const
 Get quotation marks.
 
void setQuotationMarks (const std::string &quotationMarks)
 Set quotation marks.
 
const std::string & singleCharacters () const
 Get single characters.
 
void setSingleCharacters (const std::string &singleCharacters)
 Set single characters.
 
const std::vector< std::string > & standalones () const
 Get standalone strings.
 
void setStandalones (const std::vector< std::string > &standalones)
 Set standalone strings.
 
bool loadDocument (const std::string &filename)
 Load text document to parse.
 
void setDocument (const std::string &document)
 Set text document to parse.
 
void setDocument (const char *beginDoc, const char *endDoc)
 Set text document to parse.
 
Token parseToken ()
 Parse and return the next token.
 

Detailed Description

Text parser tool that converts a text buffer into a stream of tokens.

Remarks
A tokenizer takes a text buffer and identifies individual tokens separated by whitespace. It returns those tokens one at a time and discards the whitespace between them. Text parsers (e.g., for JSON) can be built on top of this low-level parsing tool.
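
The token-at-a-time model described above can be illustrated with a small stand-in. This is a minimal sketch, not the cppexpose implementation: `sketch::MiniTokenizer`, its hard-coded whitespace set, and the `content` member of `sketch::Token` are assumptions made for illustration. It only splits on whitespace and reports end of stream.

```cpp
#include <cstddef>
#include <string>

namespace sketch {

// Stand-in token types; the real class documents ten types, two suffice here.
enum TokenType { TokenEndOfStream, TokenWord };

// Stand-in token. Only `type` is confirmed by this reference;
// `content` is an assumption of the sketch.
struct Token {
    TokenType   type;
    std::string content;
};

class MiniTokenizer {
public:
    // Store the document and restart parsing at its beginning.
    void setDocument(const std::string & document) {
        m_document = document;
        m_pos = 0;
    }

    // Skip whitespace, then return the next run of non-whitespace characters.
    // Returns a TokenEndOfStream token once nothing but whitespace is left.
    Token parseToken() {
        const std::string whitespace = " \t\r\n";

        const std::size_t begin = m_document.find_first_not_of(whitespace, m_pos);
        if (begin == std::string::npos) {
            m_pos = m_document.size();
            return Token{TokenEndOfStream, ""};
        }

        std::size_t end = m_document.find_first_of(whitespace, begin);
        if (end == std::string::npos) {
            end = m_document.size();
        }

        m_pos = end;
        return Token{TokenWord, m_document.substr(begin, end - begin)};
    }

private:
    std::string m_document;
    std::size_t m_pos = 0;
};

} // namespace sketch
```

The real class adds configurable whitespace, quotation marks, single characters, standalone strings, and typed tokens on top of this basic idea.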

Member Enumeration Documentation

enum cppexpose::Tokenizer::Option

Parser options.

Enumerator
OptionParseStrings 

Parse strings (use setQuotationMarks to set the characters that enclose a string).

OptionParseNumber 

Parse numbers.

OptionParseBoolean 

Parse boolean values.

OptionParseNull 

Parse null value.

OptionCStyleComments 

Enable "/* */" for multi-line comments.

OptionCppStyleComments 

Enable "//" for one-line comments.

OptionShellStyleComments 

Enable "#" for one-line comments.

OptionIncludeComments 

Include comments in the output of the tokenizer.
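
Because the enumerator values are distinct powers of two, they can be combined into a single bit mask. The sketch below mirrors the documented values in a local enum so that it compiles without cppexpose. The bitwise test in `sketch::hasOption` and the helper `sketch::jsonLikeOptions` are assumptions for illustration; this reference does not show how `Tokenizer::hasOption` is implemented.

```cpp
namespace sketch {

// Local mirror of the documented Tokenizer::Option values.
enum Option {
    OptionParseStrings       = 1,
    OptionParseNumber        = 2,
    OptionParseBoolean       = 4,
    OptionParseNull          = 8,
    OptionCStyleComments     = 16,
    OptionCppStyleComments   = 32,
    OptionShellStyleComments = 64,
    OptionIncludeComments    = 128
};

// Assumed form of the check: a bitwise AND of the mask and the flag.
inline bool hasOption(unsigned int options, Option option)
{
    return (options & static_cast<unsigned int>(option)) != 0;
}

// Hypothetical helper: flags a JSON-like parser might enable.
inline unsigned int jsonLikeOptions()
{
    return OptionParseStrings | OptionParseNumber
         | OptionParseBoolean | OptionParseNull;
}

} // namespace sketch
```

With the real class, a mask built this way would be passed to `setOptions`, which takes an `unsigned int`.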

enum cppexpose::Tokenizer::TokenType

Token types.

Enumerator
TokenEndOfStream 

No token read, end of stream reached.

TokenWhitespace 

Token contains only whitespace.

TokenComment 

Token contains a comment.

TokenStandalone 

Token contains a standalone string.

TokenString 

Token contains a string.

TokenNumber 

Token contains a number.

TokenBoolean 

Token contains a boolean value.

TokenNull 

Token contains a null value.

TokenSingleChar 

Token contains a single character.

TokenWord 

Token contains a regular word (anything not covered by the types above).
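
When debugging a parser built on these token types, it can help to print a readable name for each type. The following sketch uses a local mirror of the enumerators in their documented order. `sketch::tokenTypeName` and the label strings are hypothetical and not part of cppexpose.

```cpp
#include <string>

namespace sketch {

// Local mirror of the documented Tokenizer::TokenType enumerators.
enum TokenType {
    TokenEndOfStream, TokenWhitespace, TokenComment, TokenStandalone,
    TokenString, TokenNumber, TokenBoolean, TokenNull,
    TokenSingleChar, TokenWord
};

// Map a token type to a short label for log or error messages.
inline std::string tokenTypeName(TokenType type)
{
    switch (type) {
        case TokenEndOfStream: return "end-of-stream";
        case TokenWhitespace:  return "whitespace";
        case TokenComment:     return "comment";
        case TokenStandalone:  return "standalone";
        case TokenString:      return "string";
        case TokenNumber:      return "number";
        case TokenBoolean:     return "boolean";
        case TokenNull:        return "null";
        case TokenSingleChar:  return "single-char";
        case TokenWord:        return "word";
    }
    return "unknown"; // Reached only for values outside the enum.
}

} // namespace sketch
```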

Constructor & Destructor Documentation

cppexpose::Tokenizer::Tokenizer ( )

Constructor.

cppexpose::Tokenizer::~Tokenizer ( )

Destructor.

Member Function Documentation

unsigned int cppexpose::Tokenizer::options ( ) const

Get parsing options.

Returns
Parsing options

void cppexpose::Tokenizer::setOptions ( unsigned int options )

Set parsing options.

Parameters
[in] options: Parsing options

bool cppexpose::Tokenizer::hasOption ( Option option ) const

Check if a specific parsing option is set.

Returns
'true' if option is set, else 'false'

const std::string& cppexpose::Tokenizer::whitespace ( ) const

Get whitespace characters.

Returns
Characters that are considered whitespace

void cppexpose::Tokenizer::setWhitespace ( const std::string & whitespace )

Set whitespace characters.

Parameters
[in] whitespace: Characters that are considered whitespace

const std::string& cppexpose::Tokenizer::quotationMarks ( ) const

Get quotation marks.

Returns
Characters that can enclose a string

void cppexpose::Tokenizer::setQuotationMarks ( const std::string & quotationMarks )

Set quotation marks.

Parameters
[in] quotationMarks: Characters that can enclose a string

const std::string& cppexpose::Tokenizer::singleCharacters ( ) const

Get single characters.

Returns
Characters that stand on their own

void cppexpose::Tokenizer::setSingleCharacters ( const std::string & singleCharacters )

Set single characters.

Parameters
[in] singleCharacters: Characters that stand on their own

const std::vector<std::string>& cppexpose::Tokenizer::standalones ( ) const

Get standalone strings.

Returns
Strings that stand on their own

void cppexpose::Tokenizer::setStandalones ( const std::vector< std::string > & standalones )

Set standalone strings.

Parameters
[in] standalones: Strings that stand on their own

bool cppexpose::Tokenizer::loadDocument ( const std::string & filename )

Load text document to parse.

Parameters
[in] filename: Filename of text document
Returns
'true' if file could be loaded, else 'false'

void cppexpose::Tokenizer::setDocument ( const std::string & document )

Set text document to parse.

Parameters
[in] document: Text document

void cppexpose::Tokenizer::setDocument ( const char * beginDoc, const char * endDoc )

Set text document to parse.

Parameters
[in] beginDoc: Pointer to the first character inside the document
[in] endDoc: Pointer to the first character outside of the document

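
The two pointers describe a half-open range: `endDoc` points one past the last character rather than at it. The sketch below shows how such a pair can be derived from a buffer and its length. `sketch::Range` and `sketch::rangeOf` are hypothetical helpers, not part of cppexpose.

```cpp
#include <cstddef>

namespace sketch {

// A half-open character range [begin, end).
struct Range {
    const char * begin; // First character inside the document
    const char * end;   // First character outside of the document
};

// Build the range for `size` characters starting at `buffer`.
inline Range rangeOf(const char * buffer, std::size_t size)
{
    return Range{buffer, buffer + size};
}

} // namespace sketch
```

For a buffer holding the three characters `a b`, `rangeOf(buffer, 3)` yields an `end` pointer three positions after `begin`, so a terminating NUL (if any) is not part of the document. The resulting pair would then be passed as `setDocument(range.begin, range.end)`.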
Token cppexpose::Tokenizer::parseToken ( )

Parse and return the next token.

Returns
The next token in the document
Remarks
If there are no tokens left, token.type is set to TokenEndOfStream.
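
The end-of-stream remark suggests the usual consumer loop: call `parseToken` repeatedly until the returned type is `TokenEndOfStream`. The sketch below writes that loop as a template so that it compiles without cppexpose. `sketch::collectTokenTypes` is a hypothetical helper, and `FakeType`, `FakeToken`, and `FakeTokenizer` are stand-ins that exist only to make the example runnable. The only member of `Token` it relies on is `type`, which is all this reference confirms.

```cpp
#include <vector>

namespace sketch {

// Collect the type of every token up to, but not including, end of stream.
// TokenizerT is any type whose parseToken() result has a `type` member.
template <typename TokenizerT, typename TypeT>
std::vector<TypeT> collectTokenTypes(TokenizerT & tokenizer, TypeT endOfStream)
{
    std::vector<TypeT> types;
    for (auto token = tokenizer.parseToken();
         token.type != endOfStream;
         token = tokenizer.parseToken())
    {
        types.push_back(token.type);
    }
    return types;
}

// Stand-ins for illustration only: a tokenizer that yields a fixed number
// of word tokens and then reports end of stream.
enum FakeType { FakeEnd, FakeWord };

struct FakeToken {
    FakeType type;
};

struct FakeTokenizer {
    int remaining;

    FakeToken parseToken()
    {
        if (remaining <= 0) {
            return FakeToken{FakeEnd};
        }
        --remaining;
        return FakeToken{FakeWord};
    }
};

} // namespace sketch
```

With the real class, the call would presumably look like `collectTokenTypes(tokenizer, cppexpose::Tokenizer::TokenEndOfStream)` after a document has been set.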
