DATA PIPELINE
data-pipeline.ts
ETL pipeline: typed stages, checkpoint/resume, quality checks.
WHAT THIS PATTERN TEACHES
How to build ETL pipelines with typed stages, validation between steps, checkpoint/resume for long-running jobs, and idempotent processing via dedup keys.
WHEN TO USE THIS
Data ingestion, transformation, migration — any multi-step data processing workflow.
AT A GLANCE
const pipeline = new Pipeline([
extract(source),
validate(schema),
transform(normalize),
load(destination),
]);
await pipeline.run({ checkpoint: true });

FRAMEWORK IMPLEMENTATIONS
TypeScript
interface Stage<In, Out> {
  name: string;
  process: (input: In) => Promise<Out>;
  validate?: (output: Out) => boolean;
}

class Pipeline {
  constructor(private stages: Stage<any, any>[]) {}

  async run(opts: { checkpoint?: boolean } = {}) {
    let data: any = null;
    for (const stage of this.stages) {
      data = await stage.process(data);
      // Fail fast if the stage's output doesn't pass its validator.
      if (stage.validate && !stage.validate(data)) {
        throw new Error(`Validation failed: ${stage.name}`);
      }
      // Persist intermediate output so a failed run can resume from here.
      if (opts.checkpoint) await this.save(stage.name, data);
    }
    return data;
  }

  // Checkpoint store; swap in a database or object store in production.
  private async save(name: string, data: unknown): Promise<void> {
    // e.g. write `data` as JSON keyed by the stage name
  }
}