
MINI LAKEHOUSE ON DUCKDB + LOOKER STUDIO: A DATA ENGINEERING PIPELINE

Saniya Shafi Ahmed Shaikh

(11 – 2025)

DOI: 10.5281/zenodo.17680961

 

Small organizations often rely on messy, inconsistent spreadsheets that limit analytics quality. This paper presents a Mini Lakehouse architecture built with DuckDB, Parquet, and Python, and visualized through Looker Studio dashboards. The workflow ingests Excel data, performs structured cleaning, builds a star schema, enforces data quality checks, and stores the curated data in Parquet/DuckDB. This fully local, low-cost pipeline provides reproducible, auditable analytics without cloud infrastructure. The result is a governed, BI-ready system suitable for small teams and academic environments.
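The workflow described above (clean, model as a star schema, check quality, store) can be illustrated with a minimal sketch. It is not the paper's implementation: the column names, the sample rows, the table names dim_customer and fact_orders, the three quality rules, and the file name lakehouse.duckdb are all assumptions made for illustration. The rows stand in for what an Excel sheet might yield, so no spreadsheet reader is needed, and the duckdb package is imported only inside the storage step.

```python
from datetime import datetime

# Hypothetical raw rows, standing in for records read from an Excel sheet.
RAW_ROWS = [
    {"Order ID": "1001", "Customer": "  acme corp ", "Order Date": "2025-01-05", "Amount": "120.50"},
    {"Order ID": "1002", "Customer": "Beta LLC", "Order Date": "2025-01-06", "Amount": "80"},
    {"Order ID": "1002", "Customer": "Beta LLC", "Order Date": "2025-01-06", "Amount": "80"},  # duplicate
    {"Order ID": "1003", "Customer": "ACME CORP", "Order Date": "2025-01-07", "Amount": "15"},
    {"Order ID": "1004", "Customer": "Beta LLC", "Order Date": "not a date", "Amount": "9"},  # bad date
]


def clean_rows(raw_rows):
    """Normalize names and types; drop unparseable rows and repeated order IDs."""
    cleaned, seen = [], set()
    for row in raw_rows:
        try:
            record = {
                "order_id": int(row["Order ID"]),
                # Collapse whitespace and upper-case so name variants match.
                "customer": " ".join(row["Customer"].split()).upper(),
                "order_date": datetime.strptime(row["Order Date"], "%Y-%m-%d").date(),
                "amount": float(row["Amount"]),
            }
        except (KeyError, ValueError):
            continue  # a fuller pipeline would log rejected rows for auditing
        if record["order_id"] in seen:
            continue
        seen.add(record["order_id"])
        cleaned.append(record)
    return cleaned


def build_star_schema(cleaned):
    """Split cleaned rows into a customer dimension and an orders fact table."""
    customer_keys = {}
    for record in cleaned:
        # Assign surrogate keys 1, 2, ... in order of first appearance.
        customer_keys.setdefault(record["customer"], len(customer_keys) + 1)
    dim_customer = [
        {"customer_key": key, "customer_name": name} for name, key in customer_keys.items()
    ]
    fact_orders = [
        {
            "order_id": r["order_id"],
            "customer_key": customer_keys[r["customer"]],
            "order_date": r["order_date"],
            "amount": r["amount"],
        }
        for r in cleaned
    ]
    return dim_customer, fact_orders


def run_quality_checks(dim_customer, fact_orders):
    """Return failure messages; an empty list means every check passed."""
    failures = []
    order_ids = [f["order_id"] for f in fact_orders]
    if len(order_ids) != len(set(order_ids)):
        failures.append("duplicate order_id in fact_orders")
    known_keys = {d["customer_key"] for d in dim_customer}
    if any(f["customer_key"] not in known_keys for f in fact_orders):
        failures.append("fact_orders references unknown customer_key")
    if any(f["amount"] < 0 for f in fact_orders):
        failures.append("negative amount in fact_orders")
    return failures


def store_curated(dim_customer, fact_orders, db_path="lakehouse.duckdb", out_dir="."):
    """Persist both tables in a DuckDB file and export each one as Parquet."""
    import duckdb  # third-party dependency: pip install duckdb

    con = duckdb.connect(db_path)
    con.execute("CREATE OR REPLACE TABLE dim_customer (customer_key INTEGER, customer_name VARCHAR)")
    con.execute(
        "CREATE OR REPLACE TABLE fact_orders "
        "(order_id INTEGER, customer_key INTEGER, order_date DATE, amount DOUBLE)"
    )
    if dim_customer:
        con.executemany(
            "INSERT INTO dim_customer VALUES (?, ?)",
            [[d["customer_key"], d["customer_name"]] for d in dim_customer],
        )
    if fact_orders:
        con.executemany(
            "INSERT INTO fact_orders VALUES (?, ?, ?, ?)",
            [[f["order_id"], f["customer_key"], f["order_date"], f["amount"]] for f in fact_orders],
        )
    con.execute(f"COPY dim_customer TO '{out_dir}/dim_customer.parquet' (FORMAT PARQUET)")
    con.execute(f"COPY fact_orders TO '{out_dir}/fact_orders.parquet' (FORMAT PARQUET)")
    con.close()


cleaned = clean_rows(RAW_ROWS)
dim_customer, fact_orders = build_star_schema(cleaned)
failures = run_quality_checks(dim_customer, fact_orders)
print(len(cleaned), len(dim_customer), failures)
```

With the checks passing, calling store_curated(dim_customer, fact_orders) would write the DuckDB file and the two Parquet exports. Running the quality checks before anything is stored is one way to keep bad rows out of the curated layer, which is what makes the output auditable.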

 

 
