Next Generation Data Processing and Data Integration Platform
StreamHorizon is the next-generation ETL Big Data processing platform.
StreamHorizon is a hardware-efficient, scalable and extensible replacement for existing legacy stove-pipe ETL platforms.
We have named our novel approach to ETL architecture adaptiveETL.
Testimony to the performance of adaptiveETL is the fact that a processing throughput of over 1 million records per second is achievable on a single commodity server. Download and test the StreamHorizon Demo Data Warehousing project in one hour.
StreamHorizon Oracle & MSSQL demo videos are available on the StreamHorizon YouTube channel.
Easy to develop
Rather than hand-coding ETL logic, StreamHorizon lets you configure your entire ETL flow within a single, intuitive XML configuration file.
More than 90% of ETL code is needlessly developed by hand (dimensional ETL logic, fact table inserts/updates, data quality transformations). StreamHorizon eliminates the need for custom coding of such transformations; instead, the entire ETL stream is configurable via a single, intuitive XML file.
Built in ETL functionality
Forget about hand-coding Type 0, 1, 2, 3... dimensional logic, deriving surrogate keys for fact tables, or any other common ETL transformations!
StreamHorizon simply lets you configure an XML element with the table name and dimension type.
The ETL logic is then transparently generated and executed by the StreamHorizon engine based on your configuration.
Scaling your ETL processes and running them in parallel is achieved by simply specifying the number of parallel instances (or threads) you wish to run.
There is no need to set up parameter files (or equivalents) before ETL execution.
There is no need to create numerous copies of the same ETL flow in order to achieve ETL flow parallelism.
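The exact StreamHorizon XML schema is not reproduced in this document; as a purely illustrative sketch (element and attribute names below are assumptions, not the real schema), configuring dimensions, a fact table and parallelism along the lines described above might look like:

```xml
<!-- Hypothetical configuration sketch; the real StreamHorizon schema
     and element names may differ. -->
<etlConfiguration>
    <!-- Dimension handling: name the table and the SCD type; the engine
         generates the lookup/insert/update logic. -->
    <dimension table="DIM_CUSTOMER" type="2"/>
    <dimension table="DIM_PRODUCT"  type="1"/>
    <!-- Fact load with engine-derived surrogate keys. -->
    <factTable name="FACT_SALES"/>
    <!-- Parallelism: a single setting scales the number of ETL threads. -->
    <etlThreads>8</etlThreads>
</etlConfiguration>
```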
If you wish to implement specific ETL logic rather than use the available XML-configurable options, you can simply supply a Java class and thereby override the default behaviour of the StreamHorizon engine.
A rich plugin ecosystem lets you hook into different parts of StreamHorizon execution and become part of the processing pipeline.
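StreamHorizon's actual plugin interfaces are not shown in this document; as a hypothetical sketch of the idea, a custom transformation supplied as a Java class might look like the following, where the `RecordTransformer` interface and the field name are assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plugin contract: the real StreamHorizon interface may differ.
interface RecordTransformer {
    Map<String, Object> transform(Map<String, Object> record);
}

// Example custom ETL logic: normalize a country code field before the
// record reaches the dimension-lookup stage of the pipeline.
class CountryCodeNormalizer implements RecordTransformer {
    @Override
    public Map<String, Object> transform(Map<String, Object> record) {
        Map<String, Object> out = new HashMap<>(record);
        Object country = record.get("countryCode");
        if (country instanceof String) {
            out.put("countryCode", ((String) country).trim().toUpperCase());
        }
        return out;
    }
}
```

A class like this would be referenced from the configuration, leaving all other pipeline behaviour to the engine's defaults.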
Quick Time to Market
Customizing XML configuration rather than developing ETL code enables you to deliver fully functional Data Marts and Data Warehouses in a matter of days.
Demo & Sample Data Mart - the StreamHorizon Sample Demo Data Mart can be used as a starting point for your project. It comes with a full configuration-based implementation of ETL logic for Oracle, MSSQL and MySQL databases.
Minimize I/O operations
StreamHorizon performs all data transformation steps on the fly, within a single ETL process. You can therefore eliminate the staging area from your ETL flow if that suits your data processing design.
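To make the single-process idea concrete, here is a minimal, purely illustrative sketch (class and method names are assumptions, not StreamHorizon API) of records being extracted, cleaned and made load-ready in one in-memory pass, with no intermediate staging table or file:

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative contrast with staged ETL: each record flows through all
// transformation steps in memory, in a single pass.
class SingleStepPipeline {
    static List<String> process(List<String> rawRows) {
        return rawRows.stream()
                .map(String::trim)                // data-quality step
                .filter(row -> !row.isEmpty())    // reject empty records
                .map(String::toUpperCase)         // example transformation
                .collect(Collectors.toList());    // rows ready for bulk load
    }
}
```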
Clusterable, Virtualizable & Hadoop ready
StreamHorizon can run on the Hadoop ecosystem, is cloud-ready and can be 100% virtualized.
StreamHorizon integrates easily with major big-data frameworks such as Storm, Spark, Samza and Hadoop (HDFS and HBase).
StreamHorizon is implemented as a highly efficient Java library. It can process millions of events per second on a standard laptop.
It has a low hardware footprint and can be deployed on workstations, commodity servers, and compute and data clusters, on all major operating systems that run Java.
Encapsulating In Memory Data Grid
StreamHorizon ETL processes utilize embedded Infinispan (or Hazelcast) data grids, which are used internally by the StreamHorizon engine during data processing. This is one of the reasons why StreamHorizon achieves very high throughput and horizontal scalability.
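The engine's internals are not documented here, but the pattern an embedded data grid enables can be sketched: many parallel ETL threads resolve natural keys to surrogate keys from a shared in-memory cache instead of querying the database per record. In this stand-in sketch a `ConcurrentHashMap` plays the role of the Infinispan/Hazelcast grid, and the class and method names are illustrative assumptions:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Dimension-key cache pattern: thread-safe natural-key to surrogate-key
// resolution, shared by all parallel ETL threads. A ConcurrentHashMap
// stands in for the embedded in-memory data grid.
class DimensionKeyCache {
    private final Map<String, Long> naturalToSurrogate = new ConcurrentHashMap<>();
    private final AtomicLong nextKey = new AtomicLong(1);

    // Return the surrogate key for a natural key, allocating one on first sight.
    long surrogateKeyFor(String naturalKey) {
        return naturalToSurrogate.computeIfAbsent(naturalKey, k -> nextKey.getAndIncrement());
    }
}
```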
StreamHorizon in more detail
StreamHorizon is the next-generation ETL Big Data processing platform. It is a highly efficient and extensible replacement for legacy stove-pipe ETL platforms. We call this approach adaptiveETL.
- Data processing throughput of 1+ million records per second (on a single commodity server). Download and test the StreamHorizon Demo Data Warehousing project in one hour
- StreamHorizon Oracle & MSSQL demo videos are available on the StreamHorizon YouTube channel
- Quick Time to Market – deploy StreamHorizon & deliver your Data integration project in a single week
- Fully Configurable via XML
- Requires no ETL platform specific knowledge
- Shift from coding to XML configuration and reduce IT skills required to deliver, manage, run & outsource projects
- Eliminates 90+% of manual coding
- Flexible & Customizable – override any default behaviour of StreamHorizon platform with custom Java, OS script or SQL implementation
- No vendor lock-in (all custom written code runs outside StreamHorizon platform - there is no need to re-develop code if migrating your existing solution)
- No ETL tool specific language
- Out-of-the-box features such as Type 0, 1, 2 and custom dimensions, with dynamic in-memory cache formation transparent to the developer
- Delivering performance critical Big Data projects
- Massively parallel data streaming engine
- Transparently backed by an In Memory Data Grid (Coherence, Infinispan, Hazelcast or any other)
- ETL processes run in memory and interact with cache (In Memory Data Grid)
- Unnecessary staging (I/O-expensive) ETL steps are eliminated
- Lambda Architecture - Hadoop (& non-Hadoop) real time & batch oriented data streaming/processing architecture
- Data Streaming & Micro batch Architecture
- Massively parallel conventional ETL Architecture
- Batch oriented conventional ETL Architecture
- 1 Hour Proof of Concept – download and test-run StreamHorizon's demo Data Warehousing project
- Runs on Big Data clusters: Hadoop, HDFS, Kafka, Spark, Storm, Hive, Impala and more...
- Run your StreamHorizon Data Processing Cluster (ETL grid)
- Runs on Compute Grid (alongside grid libraries like Quant Library or any other)
- Horizontally & Vertically scalable, Highly Available (HA), Clusterable
- Runs on Linux, Solaris, Windows and compute clouds (EC2 & others)
Read in more detail about: