real projects · hybrid · intermediate
Data Ingestion Pipelines
Move files through validation, transforms, and quarantine lanes with observability baked in.
5 weeks · 34 guided hours · weekday evenings · 16,500 THB (informational)
Tool stack
Python · pandas · pyarrow
Description
Bridgemesh pipeline labs emphasize parquet-friendly transforms, checksum gates, and bilingual column naming conventions. You will compare batch windows against Songkran holiday quiet periods to practice realistic scheduling conversations.
What is included
- Deterministic transforms with unit-tested edge cases
- Quarantine folders with human-readable reasons.json
- Checksum and row-count reconciliation reports
- Partitioning strategies for humid-market retail spikes
- Lightweight orchestration without heavyweight platforms
- Cost notes for cloud storage choices
Outcomes
- Stand up a three-stage pipeline with documented rollback
- Present metrics your data stakeholders can trust
- Ship a mentor-reviewed incident replay from a failed ingest
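A three-stage pipeline with rollback, in the lightweight single-node spirit of this track, can be sketched as below; the structure is our own illustration, not the course scaffold:

```python
from typing import Any, Callable

Stage = tuple[str, Callable[[Any], Any], Callable[[], None]]

def run_pipeline(stages: list[Stage], payload: Any) -> Any:
    """Run (name, apply, rollback) stages in order; on failure, unwind in reverse."""
    completed: list[Stage] = []
    try:
        for name, apply, rollback in stages:
            payload = apply(payload)
            completed.append((name, apply, rollback))
        return payload
    except Exception:
        # Roll back only the stages that actually ran, newest first.
        for name, _, rollback in reversed(completed):
            rollback()
        raise
```

Keeping rollback as a plain callable per stage is what makes the rollback path documentable: each stage's undo step is named and testable on its own.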
FAQ
Is Spark included?
No. This track stays within single-node Python ergonomics.
Hardware expectations?
16 GB RAM is recommended for the larger parquet labs; cloud notebooks are available with usage caps.
Honest limitation?
We do not tune warehouse engines; focus stays on ingestion hygiene.
Experience notes
“Quarantine JSON idea landed in our finance upload lane the week after capstone.”