7 changes: 7 additions & 0 deletions source/template-pipelines/CHANGELOG.md
@@ -13,6 +13,12 @@ and this project adheres to [Semantic Versioning](http://semver.org/).

### Removed

## [template-pipelines-v0.5.1]

### Added

* `Event Data Split (JSON)`, `Event Data Split (Text)` and `Event Data Split (XML)` pipelines to support multiple feed destinations, selected by additional filtering XSLTs

## [template-pipelines-v0.5]

### Changed
@@ -63,6 +69,7 @@ Stroom up to and including `v7.0`.


[Unreleased]: https://github.com/gchq/stroom-content/compare/template-pipelines-v0.5.1...HEAD
[template-pipelines-v0.5.1]: https://github.com/gchq/stroom-content/compare/template-pipelines-v0.5...template-pipelines-v0.5.1
[template-pipelines-v0.5]: https://github.com/gchq/stroom-content/compare/template-pipelines-v0.4.1...template-pipelines-v0.5
[template-pipelines-v0.4.1]: https://github.com/gchq/stroom-content/compare/template-pipelines-v0.4...template-pipelines-v0.4.1
[template-pipelines-v0.4]: https://github.com/gchq/stroom-content/compare/template-pipelines-v0.3...template-pipelines-v0.4
17 changes: 17 additions & 0 deletions source/template-pipelines/README.md
@@ -13,6 +13,9 @@ The following represents the folder structure and content that will be imported
* [Event Data (JSON)](#event-data-json) `Pipeline`
* [Event Data (Text)](#event-data-text) `Pipeline`
* [Event Data (XML)](#event-data-xml) `Pipeline`
* [Event Data Split (JSON)](#event-data-split-json) `Pipeline`
* [Event Data Split (Text)](#event-data-split-text) `Pipeline`
* [Event Data Split (XML)](#event-data-split-xml) `Pipeline`
* [Indexing](#indexing) `Pipeline`
* [JSON](#json-xslt) `XSLT`
* [Reference Data](#reference-data) `Pipeline`
@@ -50,6 +53,20 @@ Inherits from Event Data Base, adding a Data Splitter parser in front of it. Thi

Inherits from Event Data Base, adding a Data Splitter parser in front of it. This pipeline can be used as a template for pipelines processing text format data (e.g. Apache logs) into event-logging format XML. Pipelines inheriting from this will need to supply as a minimum a Text Converter to convert the text into XML, an XSLT translation to convert this XML into event-logging form and an XSLT translation to decorate the event-logging XML with any additional data (e.g. IP -> hostname lookups).

## Event Data Split (JSON)

This pipeline can be used as a template for pipelines processing JSON format data into event-logging format XML where the events can be split into two streams for storage in separate stream appender event feeds. Typically this is used to break event data into one stream that should be held for a different time period to that of another; for example, all user-attributed events could go to one output stream held for 7 years, while all others go to another output stream held for, say, 1 year. Pipelines inheriting from this will need to supply as a minimum a JSON parser (`jsonParser`) to convert JSON fragments into XML, one or more XSLT translations (`preTranslationFilter`, `initialTranslationFilter`) to convert this XML into event-logging form, and two XSLT translations (`translationFilterA`, `translationFilterB`) to split the event-logging XML between two sub-pipelines that can individually decorate and then store the events in a given event feed.
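In an inheriting pipeline, `translationFilterA` might, for instance, pass through only user-attributed events. The following is a minimal sketch, assuming events conform to the `event-logging:3` schema and that user attribution appears as an `EventSource/User` element; both assumptions are illustrative and not part of this change:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical translationFilterA: keep only user-attributed events -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:evt="event-logging:3"
                version="2.0">

  <!-- Identity template: copy everything through unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop any Event without a User in its EventSource, so only
       user-attributed events continue down the A branch -->
  <xsl:template match="evt:Event[not(evt:EventSource/evt:User)]"/>
</xsl:stylesheet>
```

A matching `translationFilterB` would use the inverse predicate (`evt:Event[evt:EventSource/evt:User]`) so that every event lands in exactly one output stream.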

## Event Data Split (Text)

This pipeline can be used as a template for pipelines processing text format data (e.g. Apache logs) into event-logging format XML where the events can be split into two streams for storage in separate stream appender event feeds. Typically this is used to break event data into one stream that should be held for a different time period to that of another; for example, all user-attributed events could go to one output stream held for 7 years, while all others go to another output stream held for, say, 1 year. Pipelines inheriting from this will need to supply as a minimum a Text Converter (`dsParser`) to convert the text into XML, one or more XSLT translations (`preTranslationFilter`, `initialTranslationFilter`) to convert this XML into event-logging form, and two XSLT translations (`translationFilterA`, `translationFilterB`) to split the event-logging XML between two sub-pipelines that can individually decorate and then store the events in a given event feed.


## Event Data Split (XML)

This pipeline can be used as a template for pipelines processing XML fragment data into event-logging format XML where the events can be split into two streams for storage in separate stream appender event feeds. Typically this is used to break event data into one stream that should be held for a different time period to that of another; for example, all user-attributed events could go to one output stream held for 7 years, while all others go to another output stream held for, say, 1 year. Pipelines inheriting from this will need to supply as a minimum an XML Fragment parser (`xmlFragmentParser`), one or more XSLT translations (`preTranslationFilter`, `initialTranslationFilter`) to convert this XML into event-logging form, and two XSLT translations (`translationFilterA`, `translationFilterB`) to split the event-logging XML between two sub-pipelines that can individually decorate and then store the events in a given event feed.
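The two translation filters carry complementary predicates. As an illustration, a hypothetical `translationFilterB` keeping everything *except* user-attributed events might look like the sketch below (the `event-logging:3` namespace and the `EventSource/User` marker are assumptions for the example, not defined by this change):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical translationFilterB: keep only non-user-attributed events -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:evt="event-logging:3"
                version="2.0">

  <!-- Identity template: copy everything through unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop any user-attributed Event; such events are expected to be
       handled by the A branch of the pipeline instead -->
  <xsl:template match="evt:Event[evt:EventSource/evt:User]"/>
</xsl:stylesheet>
```

Because both filters see the same `initialTranslationFilter` output, keeping the two predicates exact negations of each other ensures no event is duplicated or lost across the two feeds.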


## Indexing

<!--TODO-->
@@ -0,0 +1,7 @@
{
"type" : "Pipeline",
"uuid" : "1b7f5d6e-5db6-44e8-a45e-e9315070b940",
"name" : "Event Data Split (JSON)",
"version" : "ff930205-4eff-4ff2-bc3e-4d251e7a65bc",
"description" : "This pipeline is designed to allow one to split data into two output streams. Typically this is used to break event data into one stream that should be held for a different time period to that of another. For example, all user attributed events could go to one output stream held for 7 years and all others could go to another output stream that could be held for say 1 year."
}
@@ -0,0 +1,4 @@
name=Event Data Split (JSON)
path=Template Pipelines
type=Pipeline
uuid=1b7f5d6e-5db6-44e8-a45e-e9315070b940
@@ -0,0 +1,185 @@
<?xml version="1.1" encoding="UTF-8"?>
<pipeline>
<elements>
<add>
<element>
<id>Source</id>
<type>Source</type>
</element>
<element>
<id>jsonParser</id>
<type>JSONParser</type>
</element>
<element>
<id>readRecordCountFilter</id>
<type>RecordCountFilter</type>
</element>
<element>
<id>splitFilter</id>
<type>SplitFilter</type>
</element>
<element>
<id>preTranslationFilter</id>
<type>XSLTFilter</type>
</element>
<element>
<id>initialTranslationFilter</id>
<type>XSLTFilter</type>
</element>
<element>
<id>translationFilterA</id>
<type>XSLTFilter</type>
</element>
<element>
<id>translationFilterB</id>
<type>XSLTFilter</type>
</element>
<element>
<id>decorationFilterA</id>
<type>XSLTFilter</type>
</element>
<element>
<id>decorationFilterB</id>
<type>XSLTFilter</type>
</element>
<element>
<id>schemaFilterA</id>
<type>SchemaFilter</type>
</element>
<element>
<id>schemaFilterB</id>
<type>SchemaFilter</type>
</element>
<element>
<id>recordOutputFilterA</id>
<type>RecordOutputFilter</type>
</element>
<element>
<id>recordOutputFilterB</id>
<type>RecordOutputFilter</type>
</element>
<element>
<id>writeRecordCountFilterA</id>
<type>RecordCountFilter</type>
</element>
<element>
<id>writeRecordCountFilterB</id>
<type>RecordCountFilter</type>
</element>
<element>
<id>xmlWriterA</id>
<type>XMLWriter</type>
</element>
<element>
<id>xmlWriterB</id>
<type>XMLWriter</type>
</element>
<element>
<id>streamAppenderA</id>
<type>StreamAppender</type>
</element>
<element>
<id>streamAppenderB</id>
<type>StreamAppender</type>
</element>
</add>
</elements>
<properties>
<add>
<property>
<element>writeRecordCountFilterA</element>
<name>countRead</name>
<value>
<boolean>false</boolean>
</value>
</property>
<property>
<element>writeRecordCountFilterB</element>
<name>countRead</name>
<value>
<boolean>false</boolean>
</value>
</property>
</add>
</properties>
<links>
<add>
<link>
<from>Source</from>
<to>jsonParser</to>
</link>
<link>
<from>jsonParser</from>
<to>readRecordCountFilter</to>
</link>
<link>
<from>readRecordCountFilter</from>
<to>splitFilter</to>
</link>
<link>
<from>splitFilter</from>
<to>preTranslationFilter</to>
</link>
<link>
<from>preTranslationFilter</from>
<to>initialTranslationFilter</to>
</link>
<link>
<from>initialTranslationFilter</from>
<to>translationFilterA</to>
</link>
<link>
<from>initialTranslationFilter</from>
<to>translationFilterB</to>
</link>
<link>
<from>translationFilterA</from>
<to>decorationFilterA</to>
</link>
<link>
<from>translationFilterB</from>
<to>decorationFilterB</to>
</link>
<link>
<from>decorationFilterA</from>
<to>schemaFilterA</to>
</link>
<link>
<from>decorationFilterB</from>
<to>schemaFilterB</to>
</link>
<link>
<from>schemaFilterA</from>
<to>recordOutputFilterA</to>
</link>
<link>
<from>schemaFilterB</from>
<to>recordOutputFilterB</to>
</link>
<link>
<from>recordOutputFilterA</from>
<to>writeRecordCountFilterA</to>
</link>
<link>
<from>recordOutputFilterB</from>
<to>writeRecordCountFilterB</to>
</link>
<link>
<from>writeRecordCountFilterA</from>
<to>xmlWriterA</to>
</link>
<link>
<from>writeRecordCountFilterB</from>
<to>xmlWriterB</to>
</link>
<link>
<from>xmlWriterA</from>
<to>streamAppenderA</to>
</link>
<link>
<from>xmlWriterB</from>
<to>streamAppenderB</to>
</link>
</add>
</links>
</pipeline>
@@ -0,0 +1,7 @@
{
"type" : "Pipeline",
"uuid" : "39ac6789-2655-4040-96f2-276509ace0ae",
"name" : "Event Data Split (Text)",
"version" : "183e6941-6a31-4ce0-ad99-a98da3c156a0",
"description" : "This pipeline is designed to allow one to split data into two output streams. Typically this is used to break event data into one stream that should be held for a different time period to that of another. For example, all user attributed events could go to one output stream held for 7 years and all others could go to another output stream that could be held for say 1 year."
}
@@ -0,0 +1,4 @@
name=Event Data Split (Text)
path=Template Pipelines
type=Pipeline
uuid=39ac6789-2655-4040-96f2-276509ace0ae