Posts
All the articles I've posted.
-
Estimating the Cost of an AWS Glue Workflow
When working with AWS Glue, one of the most common questions data engineers ask is: How much will this job cost me? If you have a workflow that runs for 13 minutes, understanding the cost model of AWS
-
Modern Table Formats: Iceberg, Delta Lake, and Hudi
Data lakes made it possible to store raw data at scale, but they lacked the reliability and governance of data warehouses. Files could be dropped into storage (S3, HDFS, MinIO), but analysts struggled
-
Running Production Servers on AWS: EC2 vs RDS Cost Breakdown
When planning to run production workloads in the cloud, cost is one of the most important considerations. In this post, we will explore the monthly expenses of running two application servers and a
-
Trino in Modern Architectures: SQL Queries on S3 and MinIO
The rise of cloud object storage has transformed how organizations build data platforms. Hadoop Distributed File System (HDFS) once dominated, but today services like Amazon S3, Google Cloud Storage
-
Hive Metastore: The Glue Holding Big Data Together
When people think of Hive, they often remember the early days of Hadoop and MapReduce. But while Hive as a query engine has largely faded, one of its components remains critical to the modern data
-
Why Parquet Became the Standard for Analytics
In the early days of Big Data, data was often stored in simple formats such as CSV, JSON, or text logs. While these formats were easy to generate and understand, they quickly became inefficient at
-
Facebook and Big Data: The Open Source Projects That Changed the Industry
When people talk about the history of Big Data, a few companies come to mind: Google, Yahoo, and Facebook. Each of them faced unique challenges that forced them to build large-scale distributed
-
HDFS vs. Object Storage: The Battle for Distributed Storage
Distributed storage has always been the foundation of Big Data. In the early days, Hadoop Distributed File System (HDFS) was the de facto standard. Today, however, object storage systems like Amazon