Posts
All the articles I've posted.
-
Delta Lake vs. Traditional Data Lakes: Key Differences and Vendor Options
Introduction: As data-driven organizations scale their analytics and machine learning workloads, the limitations of traditional data lakes become more apparent. Delta Lake is an open-source storage
-
Why OLTP Systems Don't Retain Historical Changes
Online Transaction Processing (OLTP) systems are designed for high-speed transactions and efficient data management. However, one of their characteristics is that they do not retain historical changes
-
Understanding Slowly Changing Dimensions (SCD) in Data Warehousing
When dealing with data warehouses, handling changes in dimension data over time is crucial. Unlike operational databases where updates are straightforward, data warehouses require preserving
-
Modes and Examples of KPIs in Data Analysis Expressions (DAX)
Last Year Comparison: When analyzing sales performance, it is often useful to compare the current year's sales with the same period in the previous year. To do this, we create several calculated
-
Understanding Surrogate Keys in Databases
When designing relational databases, one crucial decision is how to uniquely identify each record in a table. This is where surrogate keys come into play. Unlike natural keys, which derive from
-
Understanding the Relationship Between Database Replication and the CAP Theorem
Introduction: Database replication is a fundamental strategy in distributed systems that ensures data is duplicated across multiple nodes. However, when designing a replicated database, one must
-
Understanding Pagination vs. Batch Processing in Data Handling
When working with large datasets, developers often face the challenge of efficiently extracting, processing, and managing data. Two commonly used techniques for handling such data efficiently are
-
Tracking Daily File Size Changes in SQL
When working with databases that store file metadata, it's often useful to track how file sizes change over time. If you have a table with the following structure: id | timestamp | name_file | size