Data Pipelines
• Traditionally implemented entirely as batch
processing on the COSMOS infrastructure
o Storage – DFS (similar to HDFS)
o Execution – Dryad (general-purpose, more expressive than MapReduce)
o Query – SCOPE (SQL-style scripting language that supports inline C#)
• Data pipelines are now adopting near-real-time
processing – this raises new issues to address
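
The contrast between the two processing styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SCOPE or Dryad code: the event list, the function names `batch_counts` and `windowed_counts`, and the 60-second tumbling window are all hypothetical. The streaming variant also assumes events arrive in timestamp order; handling late or out-of-order events is one kind of issue a near-real-time pipeline would likely have to address.

```python
from collections import defaultdict

# Hypothetical events: (timestamp in seconds, key). Not real pipeline data.
EVENTS = [(0, "a"), (5, "b"), (61, "a"), (65, "a"), (130, "b")]


def batch_counts(events):
    """Batch style: scan the complete stored dataset once, then report."""
    counts = defaultdict(int)
    for _, key in events:
        counts[key] += 1
    return dict(counts)


def windowed_counts(events, window_seconds=60):
    """Near-real-time style: emit per-key counts each time a tumbling
    window closes, instead of waiting for the whole dataset.

    Assumes events arrive in timestamp order.
    """
    current_window = None
    counts = defaultdict(int)
    for ts, key in events:
        window = ts // window_seconds
        if current_window is not None and window != current_window:
            # The previous window is complete: emit its start time and counts.
            yield current_window * window_seconds, dict(counts)
            counts = defaultdict(int)
        current_window = window
        counts[key] += 1
    if current_window is not None:
        # Flush the final (possibly partial) window.
        yield current_window * window_seconds, dict(counts)


if __name__ == "__main__":
    print(batch_counts(EVENTS))
    for window_start, result in windowed_counts(EVENTS):
        print(window_start, result)
```

The batch function produces a single answer after reading everything, while the windowed generator produces partial answers as data arrives, trading completeness for latency.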