Posts
All the articles I've posted.
-
When Should You Use Iceberg with Athena? Partitioning Strategies and Best Practices
As data lakes grow in size and complexity, tools like Amazon Athena combined with table formats like Apache Iceberg become essential for scalability, data governance, and performance. In this post,
-
Why You Should Use the -out Option with terraform plan
When working with Terraform, a common workflow involves running terraform plan followed by terraform apply. However, you may have come across the following warning: "You didn't use the -out option to
-
How Google Changed Big Data: The Story of GFS, MapReduce, and Bigtable
In the early 2000s, Google faced a unique challenge: how to store, process, and query massive amounts of data across thousands of unreliable machines. The traditional systems of the time—designed for
-
Secure Database Access in AWS Using SSH Tunneling
Accessing databases located in private subnets within AWS Virtual Private Clouds (VPCs) is a common requirement in enterprise architectures. To ensure secure connectivity without exposing the database
-
Did Early Personal Computers Really Have a CPU? A Look at the von Neumann Architecture
When we think of a personal computer (PC), we typically imagine a processor, memory, a keyboard, and a display. But a deeper question often goes unasked: Did all early personal computers actually
-
Mastering the Linux find Command: A Practical Introduction
When working with Linux, one of the most powerful tools at your disposal is the find command. Whether you're managing a personal machine or maintaining a production server, being able to locate files
-
The Origin and Evolution of the DataFrame
When working with data today, whether in Python, R, or distributed computing platforms like Spark, one of the most commonly used structures is the DataFrame. But where did it come from? This post
-
Understanding ORM: Bridging the Gap Between Objects and Relational Databases
In modern software development, working with databases is a fundamental requirement. Most applications need to persist, retrieve, and manipulate data stored in relational databases such as PostgreSQL,