Posts
All the articles I've posted.
-
The History of Hive and Trino: From Hadoop to Lakehouses
The evolution of Big Data architectures is deeply tied to the history of two projects born at Facebook: Hive and Trino. Both emerged from real engineering pain points, but at different times and for
-
What Is a Data Lake and What Is a Data Lakehouse?
Over the last decade, the world of data architecture has gone through several transformations. From traditional data warehouses to Hadoop-based data lakes and now to the emerging Lakehouse paradigm,
-
Google Bigtable vs. Amazon DynamoDB: Understanding the Differences
When choosing a NoSQL database for scalable, low-latency applications, two major options stand out: Google Cloud Bigtable and Amazon DynamoDB. While both are managed, highly available, and
-
How to Keep a Docker Container Running Persistently
When working with Docker, you may have noticed that some containers stop as soon as you exit the shell. This is because Docker considers the container's main process to have finished. In this post, we
-
Fixing Cursor Login Issues on Linux (AppImage)
When running Cursor on Linux, especially with the AppImage version, you might encounter a situation where you can’t log in. This usually happens because Cursor stores its session state locally, and
-
Managing Evolving Schemas in Apache Spark: A Strategic Approach
Schema management is one of the most overlooked yet critical aspects of building reliable data pipelines. In a fast-moving environment, schemas rarely remain static: new fields are added, data types
-
Orchestrating Multiple AWS Glue Workflows: A Practical Guide
AWS Glue provides a robust environment for building and managing ETL pipelines, but many data engineers face the challenge of chaining or coordinating multiple workflows. This article explores
-
Secure Ways to Share Private Data on AWS: Beyond Public Buckets
When building data platforms in the cloud, it is common to share data with partners, clients, or internal teams outside your own. AWS provides several mechanisms to grant secure, granular access — far