org.txt2xml.core
Class Processor

java.lang.Object
  |
  +--org.txt2xml.core.Processor
Direct Known Subclasses:
AbstractRegexProcessor

public abstract class Processor
extends java.lang.Object

Base class for agents that match one or more parts of a CharSequence and write XML elements as matches are found. Each matched CharSequence is passed on to the sub-Processor, if there is one, for further work. When there are no more matches, the remainder of the original CharSequence is passed to the nextProcessor, if there is one.

Processors act as iterators that update their internal state during a round of matching. Processors are not thread-safe.

Namespaces in generated XML are not yet supported.

Subclasses must override findMatch(), getMatchedText() and getRemainderText(), and may optionally override resetMatching().
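
The override contract above can be sketched as follows. This is a minimal illustration, not the txt2xml source: MatchContractSketch and LineMatcherSketch are hypothetical stand-ins that model only the matching protocol (no SAX output), and the driver method collectMatches is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the matching contract only; not the txt2xml source.
abstract class MatchContractSketch {
    protected CharSequence chars;

    protected void resetMatching() {}
    protected abstract boolean findMatch();
    protected abstract CharSequence getMatchedText();
    protected abstract CharSequence getRemainderText();

    // Drives one round of matching and collects each matched text.
    List<String> collectMatches(CharSequence text) {
        chars = text;
        resetMatching();
        List<String> matches = new ArrayList<>();
        while (findMatch()) {
            matches.add(getMatchedText().toString());
        }
        return matches;
    }
}

// Matches one line per call to findMatch(), acting as an iterator over lines.
class LineMatcherSketch extends MatchContractSketch {
    private int start; // start of the current match
    private int end;   // end of the current match (exclusive)
    private int next;  // where the next search begins

    @Override protected void resetMatching() {
        start = 0;
        end = 0;
        next = 0;
    }

    @Override protected boolean findMatch() {
        if (next >= chars.length()) {
            return false;
        }
        start = next;
        end = start;
        while (end < chars.length() && chars.charAt(end) != '\n') {
            end++;
        }
        // Skip the newline itself, if there was one.
        next = (end < chars.length()) ? end + 1 : end;
        return true;
    }

    @Override protected CharSequence getMatchedText() {
        return chars.subSequence(start, end);
    }

    @Override protected CharSequence getRemainderText() {
        return chars.subSequence(next, chars.length());
    }
}
```

Because the match position lives in instance fields, a single instance cannot safely serve two threads at once, which is consistent with the thread-safety note above.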

Author:
Steve Meyfroidt

Field Summary
protected  java.lang.CharSequence chars
          The current CharSequence being matched against.
protected  org.xml.sax.ContentHandler handler
          The current ContentHandler to write XML to.
protected static org.xml.sax.Attributes NULL_ATTRIBUTES
           
protected  Processor parent
          Parent Processor that is using this as a sub-Processor.
 
Constructor Summary
Processor()
           
 
Method Summary
protected abstract  boolean findMatch()
          Find next match, updating state appropriately.
protected  void generateEndXmlElement()
          Write the end element for this Processor.
protected  void generateStartXmlElement()
          Write this Processor's start element as a simple element with no attributes.
protected  void generateXmlElementCharacters()
          Write the contents of the element.
 void generateXmlFragment(java.lang.CharSequence text, org.xml.sax.ContentHandler contentHandler)
          Match part of the passed CharSequence.
 java.lang.String getElement()
           
protected abstract  java.lang.CharSequence getMatchedText()
          Returns the text matched by the last findMatch(); subclasses must implement this.
 Processor getNextProcessor()
          Gets the nextProcessor.
protected abstract  java.lang.CharSequence getRemainderText()
          Returns the remainder of the text after the last match; subclasses must implement this.
 Processor getSubProcessor()
           
protected  void resetMatching()
          Called at start of generateXmlFragment(CharSequence, ContentHandler) to reset the state before starting a round of matching.
 void setElement(java.lang.String elementName)
           
 void setNextProcessor(Processor nextProcessor)
          Sets the nextProcessor.
 void setSubProcessor(Processor subProcessor)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NULL_ATTRIBUTES

protected static final org.xml.sax.Attributes NULL_ATTRIBUTES

parent

protected Processor parent
Parent Processor that is using this as a sub-Processor. May be useful to a sub-Processor, but it is not used by this base class. Updated during generateXmlFragment(CharSequence, ContentHandler).


chars

protected java.lang.CharSequence chars
The current CharSequence being matched against. Set at the start of generateXmlFragment(CharSequence, ContentHandler).


handler

protected org.xml.sax.ContentHandler handler
The current ContentHandler to write XML to. Set at the start of generateXmlFragment(CharSequence, ContentHandler).

Constructor Detail

Processor

public Processor()

Method Detail

generateXmlFragment

public void generateXmlFragment(java.lang.CharSequence text,
                                org.xml.sax.ContentHandler contentHandler)
                         throws org.xml.sax.SAXException
Match part of the passed CharSequence. Pass the matched text to the sub-Processor if there is one, else generate a SAX element. When there are no more matches, pass the remainder of the original CharSequence to the next Processor if there is one.

Throws:
org.xml.sax.SAXException
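
The flow described for generateXmlFragment might look roughly like the sketch below. It is an illustration under stated assumptions, not the library's implementation: FlowSketch, SplitSketch and FlowDemo are hypothetical names, plain fields stand in for the documented setters, and the sketch assumes that the start and end elements wrap whatever the sub-Processor writes.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in that mirrors the documented flow; not the txt2xml source.
abstract class FlowSketch {
    private static final AttributesImpl NO_ATTRIBUTES = new AttributesImpl();
    protected CharSequence chars;
    FlowSketch sub;  // receives each matched text, if set
    FlowSketch next; // receives the remainder, if set
    String element;

    protected void resetMatching() {}
    protected abstract boolean findMatch();
    protected abstract CharSequence getMatchedText();
    protected abstract CharSequence getRemainderText();

    void generateXmlFragment(CharSequence text, ContentHandler handler) throws SAXException {
        chars = text;
        resetMatching();
        while (findMatch()) {
            handler.startElement("", element, element, NO_ATTRIBUTES);
            CharSequence matched = getMatchedText();
            if (sub != null) {
                // Assumption: the sub-processor writes inside this element.
                sub.generateXmlFragment(matched, handler);
            } else {
                char[] buffer = matched.toString().toCharArray();
                handler.characters(buffer, 0, buffer.length);
            }
            handler.endElement("", element, element);
        }
        if (next != null) {
            next.generateXmlFragment(getRemainderText(), handler);
        }
    }
}

// Matches runs of text separated by a single delimiter character.
class SplitSketch extends FlowSketch {
    private final char delimiter;
    private int start, end, pos;

    SplitSketch(String element, char delimiter) {
        this.element = element;
        this.delimiter = delimiter;
    }

    @Override protected void resetMatching() { start = end = pos = 0; }

    @Override protected boolean findMatch() {
        if (pos >= chars.length()) return false;
        start = pos;
        end = start;
        while (end < chars.length() && chars.charAt(end) != delimiter) end++;
        pos = Math.min(end + 1, chars.length());
        return true;
    }

    @Override protected CharSequence getMatchedText() { return chars.subSequence(start, end); }
    @Override protected CharSequence getRemainderText() { return chars.subSequence(pos, chars.length()); }
}

class FlowDemo {
    // Runs a two-level chain (rows, then cells) and records the SAX events as tags.
    static String render(String input) {
        final StringBuilder out = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override public void startElement(String uri, String local, String qName, Attributes a) {
                out.append('<').append(qName).append('>');
            }
            @Override public void endElement(String uri, String local, String qName) {
                out.append("</").append(qName).append('>');
            }
            @Override public void characters(char[] ch, int s, int len) {
                out.append(ch, s, len);
            }
        };
        SplitSketch rows = new SplitSketch("row", '\n');
        rows.sub = new SplitSketch("cell", ',');
        try {
            rows.generateXmlFragment(input, recorder);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

In this sketch the same sub-processor instance is reused for every row; that works only because each call to generateXmlFragment resets the matching state first.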

generateStartXmlElement

protected void generateStartXmlElement()
                                throws org.xml.sax.SAXException
Write this Processor's start element as a simple element with no attributes. Override if a Processor needs to create a more complex element start.

Throws:
org.xml.sax.SAXException
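
As an illustration of the difference between a simple start element and a "more complex element start", the sketch below writes both through the standard SAX API. StartElementSketch, its method names and the "length" attribute are hypothetical; how Processor itself builds the element is not shown in this page, so the empty namespace URI and the use of AttributesImpl are assumptions.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class StartElementSketch {
    // Default style: no namespace, no attributes.
    static void writePlainStart(ContentHandler handler, String element) throws SAXException {
        handler.startElement("", element, element, new AttributesImpl());
    }

    // Override style: same element, plus one hypothetical "length" attribute.
    static void writeStartWithLength(ContentHandler handler, String element, CharSequence matched)
            throws SAXException {
        AttributesImpl attributes = new AttributesImpl();
        attributes.addAttribute("", "length", "length", "CDATA", String.valueOf(matched.length()));
        handler.startElement("", element, element, attributes);
    }

    // Records both start tags as text so the difference is visible.
    static String demo() {
        final StringBuilder out = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override public void startElement(String uri, String local, String qName, Attributes a) {
                out.append('<').append(qName);
                for (int i = 0; i < a.getLength(); i++) {
                    out.append(' ').append(a.getQName(i)).append("=\"").append(a.getValue(i)).append('"');
                }
                out.append('>');
            }
        };
        try {
            writePlainStart(recorder, "line");
            writeStartWithLength(recorder, "line", "hello");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```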

generateEndXmlElement

protected void generateEndXmlElement()
                              throws org.xml.sax.SAXException
Write the end element for this Processor.

Throws:
org.xml.sax.SAXException

generateXmlElementCharacters

protected void generateXmlElementCharacters()
                                     throws org.xml.sax.SAXException
Write the contents of the element. Called only for leaf Processors (i.e. those with no sub-Processor). This default implementation writes the matched text as XML characters. Override for other behaviour.

Throws:
org.xml.sax.SAXException
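
Writing matched text as XML characters through SAX could be sketched as below, together with one possible "other behaviour" (trimming whitespace first). CharactersSketch and its methods are hypothetical helpers written for this example; they are not part of txt2xml.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CharactersSketch {
    // Default style: pass the matched text through unchanged.
    static void writeCharacters(ContentHandler handler, CharSequence matched) throws SAXException {
        char[] buffer = matched.toString().toCharArray();
        handler.characters(buffer, 0, buffer.length);
    }

    // Override style: strip leading and trailing whitespace before writing.
    static void writeTrimmedCharacters(ContentHandler handler, CharSequence matched) throws SAXException {
        writeCharacters(handler, matched.toString().trim());
    }

    // Records the characters event so the written text can be inspected.
    static String demo(CharSequence matched, boolean trim) {
        final StringBuilder out = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        };
        try {
            if (trim) {
                writeTrimmedCharacters(recorder, matched);
            } else {
                writeCharacters(recorder, matched);
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```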

resetMatching

protected void resetMatching()
Called at the start of generateXmlFragment(CharSequence, ContentHandler) to reset the state before starting a round of matching. Override to prepare for a round of matching; e.g. RegexDelimitedProcessor resets its regex Matcher here.
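
Resetting a regex Matcher at the start of each round might be sketched as below. ResetSketch is a hypothetical class, not RegexDelimitedProcessor; it passes the text as a parameter rather than reading the chars field, and the digit pattern is chosen only for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetSketch {
    // One Matcher is kept per instance and re-pointed at new input each round.
    private final Matcher matcher = Pattern.compile("\\d+").matcher("");

    // Discards any previous match position and binds the new text.
    void resetMatching(CharSequence chars) {
        matcher.reset(chars);
    }

    boolean findMatch() {
        return matcher.find();
    }

    CharSequence getMatchedText() {
        return matcher.group();
    }

    // Runs one full round of matching and counts the matches found.
    static int countMatches(ResetSketch processor, CharSequence text) {
        processor.resetMatching(text);
        int count = 0;
        while (processor.findMatch()) {
            count++;
        }
        return count;
    }
}
```

Without the reset, a second round on the same instance would continue from the stale state left by the first.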


findMatch

protected abstract boolean findMatch()
Find the next match, updating internal state so that getMatchedText() and getRemainderText() reflect it. Subclasses must implement this.

Returns:
true if got a match, else false.

getMatchedText

protected abstract java.lang.CharSequence getMatchedText()
Returns the text matched by the last findMatch(). Subclasses must implement this.

Returns:
CharSequence for text matched in last findMatch().

getRemainderText

protected abstract java.lang.CharSequence getRemainderText()
Returns the remainder of the text after the last match. Subclasses must implement this.

Returns:
CharSequence for remainder of text after last match.

getNextProcessor

public Processor getNextProcessor()
Gets the nextProcessor.

Returns:
the Processor that receives the remainder of the text once all matches are done.

setNextProcessor

public void setNextProcessor(Processor nextProcessor)
Sets the nextProcessor.

Parameters:
nextProcessor - the Processor that receives the remainder of the text once all matches are done.

getSubProcessor

public Processor getSubProcessor()
Returns:
the sub-Processor that will process this Processor's matched text.

setSubProcessor

public void setSubProcessor(Processor subProcessor)
Parameters:
subProcessor - the sub-Processor that will process this Processor's matched text.

getElement

public java.lang.String getElement()
Returns:
the name of the element written by this Processor.

setElement

public void setElement(java.lang.String elementName)
Parameters:
elementName - the name of the element written by this Processor.


Copyright 2002 Steve Meyfroidt. All Rights Reserved.