Package | Description |
---|---|
org.myrobotlab.document.transformer | |
org.myrobotlab.document.workflow | |
Modifier and Type | Class and Description |
---|---|
class | CastValuesToDouble: This stage will iterate the values of the inputField and attempt to cast them to a double. |
class | CastValuesToInt: This stage will iterate the values of the inputField and attempt to cast them to an integer. |
class | CopyField |
class | CreateStaticTeaser |
class | DeleteField |
class | DictionaryLookup |
class | DivideValues |
class | DropDocument: If the document contains a particular field value, this stage drops the document from the workflow. |
class | FetchURI: This stage will fetch a web page defined by the uriField and store its byte array in the bytesField. |
class | FormatDate: This stage will take the values in the inputField and attempt to parse them into a date object based on the formatString. |
class | JoinFieldValues: This stage will join together a list of values into a single string value with a separator. |
class | JSoupExtractor: This stage will use a jsoup selection string on HTML and store the resulting data in the output field. |
class | LowercaseFieldNames: This will set a field on a document with a value. |
class | MathValues |
class | MultiplyValues |
class | NormalizeFieldNames: This stage will rename fields on a document to lowercase them and replace punctuation with underscores. This is useful to make the field names search engine (Solr) friendly. |
class | NounPhraseExtractor |
class | OpenNLP |
class | ParseDate: This stage will take the values in the inputField and attempt to parse them into a date object based on the formatString. |
class | ParseWikiText |
class | RegexExtractor: This stage will use a regex to find a pattern in a string field and store the matched text into the output field. |
class | RenameField: This stage will rename the field on a document. |
class | RenameFields: This stage will rename the field on a document. |
class | SendToSolr: This stage will convert an MRL document to a Solr document. |
class | SetStaticFieldValue: This will set a field on a document with a value. |
class | SumValues |
class | TextExtractor: This stage will use Apache Tika to perform text and metadata extraction on many different types of documents including, but not limited to, PDF, office documents, and HTML. |
class | TruncateFieldValues: This stage will remove all but the first value from a field on a document. |
class | UniqueFieldValues: This stage will remove all duplicate values for a given field on a document. |
class | XPathExtractor: This stage will load a config file that contains a field name to XPath expression mapping. |
class | ZipFieldValues: This stage will take two fields that have equal-sized lists of values and iterate over the values of both fields, adding them to an outputField (like a zipper). |
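
Several of the stages listed above describe simple list operations on a field's values. The sketch below illustrates those descriptions on plain Java lists. It is not MyRobotLab code: `FieldValueSketch` and all of its method names are hypothetical. Details the descriptions leave open are assumptions, such as keeping first-seen order when removing duplicates, skipping values that fail to parse, and the exact set of characters treated as punctuation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; these methods are NOT part of MyRobotLab.
// They only mirror the stage descriptions, using plain lists of values.
class FieldValueSketch {

  // JoinFieldValues-style: collapse many values into one separated string.
  static String join(List<String> values, String separator) {
    return String.join(separator, values);
  }

  // UniqueFieldValues-style: drop duplicates.
  // Assumption: the first occurrence of each value is kept, in order.
  static List<String> unique(List<String> values) {
    return new ArrayList<>(new LinkedHashSet<>(values));
  }

  // ZipFieldValues-style: interleave two equal-sized lists like a zipper.
  static List<String> zip(List<String> left, List<String> right) {
    if (left.size() != right.size()) {
      throw new IllegalArgumentException("lists must be the same size");
    }
    List<String> out = new ArrayList<>(left.size() * 2);
    for (int i = 0; i < left.size(); i++) {
      out.add(left.get(i));
      out.add(right.get(i));
    }
    return out;
  }

  // NormalizeFieldNames-style: lowercase the name and replace
  // ASCII punctuation with underscores.
  static String normalizeFieldName(String name) {
    return name.toLowerCase(Locale.ROOT).replaceAll("\\p{Punct}", "_");
  }

  // CastValuesToDouble-style: attempt to parse each value.
  // Assumption: values that cannot be parsed are skipped.
  static List<Double> castToDouble(List<String> values) {
    List<Double> out = new ArrayList<>();
    for (String value : values) {
      try {
        out.add(Double.parseDouble(value));
      } catch (NumberFormatException e) {
        // Not a number; leave it out of the result.
      }
    }
    return out;
  }
}
```

For example, zipping the lists `[a, b]` and `[1, 2]` with this sketch yields `[a, 1, b, 2]`, and normalizing the field name `Content-Type` yields `content_type`.
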
Modifier and Type | Method and Description |
---|---|
void | WorkflowWorker.addStage(AbstractStage stage) |
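
The `addStage` signature suggests that a worker holds a chain of stages. The sketch below shows that general pattern with stand-in types. `SketchStage`, `SketchWorker`, and `SketchWorkflowDemo` are hypothetical and are not the real `AbstractStage` or `WorkflowWorker`. The `process` method, the map-based document, and the assumption that stages run in the order they were added are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a stage: transforms a document in place.
// A document is modelled here as a map of field name to list of values.
interface SketchStage {
  void process(Map<String, List<String>> doc);
}

// Hypothetical stand-in for a worker that owns an ordered chain of stages.
class SketchWorker {
  private final List<SketchStage> stages = new ArrayList<>();

  // Same shape as addStage(AbstractStage stage): append to the chain.
  void addStage(SketchStage stage) {
    stages.add(stage);
  }

  // Assumption: every stage runs in the order it was added.
  void process(Map<String, List<String>> doc) {
    for (SketchStage stage : stages) {
      stage.process(doc);
    }
  }
}

class SketchWorkflowDemo {
  static Map<String, List<String>> run() {
    SketchWorker worker = new SketchWorker();
    // Stage 1: copy "title" into "teaser" (in the spirit of CopyField).
    worker.addStage(doc -> doc.put("teaser", new ArrayList<>(doc.get("title"))));
    // Stage 2: remove "title" (in the spirit of DeleteField).
    worker.addStage(doc -> doc.remove("title"));

    Map<String, List<String>> doc = new LinkedHashMap<>();
    doc.put("title", new ArrayList<>(Arrays.asList("Hello")));
    worker.process(doc);
    return doc;
  }

  public static void main(String[] args) {
    System.out.println(run()); // prints {teaser=[Hello]}
  }
}
```

Because the second stage depends on the first having already copied the value, the order in which stages are added matters in this sketch.
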
Copyright © 2020 myrobotlab. All rights reserved.