In this article I will discuss how to build a cloud-agnostic Big Data processing and storage solution running entirely in Kubernetes. This design avoids vendor lock-in by using only open-source technologies, replacing cloud-managed products such as Amazon S3 and Amazon EMR (Elastic MapReduce) with MinIO and Apache Spark.
Tag: helm
Build a Data Lake with Trino, Kubernetes, Helm, and Glue
How to create a Data Lake in AWS using S3 as the storage layer, Glue as the metastore, and Trino on Kubernetes as the query engine.