Posts
All the articles I've posted.
-
Apache Cassandra vs Apache Parquet: Understanding the Differences
In modern data architectures, it's common to encounter both Apache Cassandra and Apache Parquet, particularly when dealing with large-scale, distributed systems. Both technologies are associated with
-
Import Live Crypto Prices into Google Sheets
Are you tired of checking crypto prices manually? Want to automate your portfolio tracking or build a custom crypto dashboard? Good news — with just a few steps, you can pull live cryptocurrency
-
Fixing Spark Ivy Error in Docker: "basedir must be absolute"
If you're running Apache Spark inside Docker using Bitnami's images and suddenly encounter an Ivy error that says: Exception in thread "main" java.lang.IllegalArgumentException: basedir must be
-
How Dynamo Reshaped the Internal Architecture of Amazon S3
Amazon S3 launched in 2006 as a scalable, durable object storage system. It avoided hierarchical file systems and used flat key-based addressing from day one. However, early versions of
-
What’s Behind Amazon S3?
When you upload a file to the cloud using an app or service, there's a good chance it's being stored on Amazon S3 (Simple Storage Service). But what powers it under the hood? What is Amazon S3? Amazon
-
How HDFS Achieves Fault Tolerance Through Replication
One of the core strengths of the Hadoop Distributed File System (HDFS) is its fault tolerance. In a world of distributed computing, failures are not rare; they're expected. HDFS tackles this by using
-
Summary: Teaching HDFS Concepts to New Learners
Introducing Hadoop Distributed File System (HDFS) to newcomers can be both exciting and challenging. To make the learning experience structured and impactful, it’s helpful to break down the core
-
How Clients Know Where to Read or Write in HDFS
Hadoop Distributed File System (HDFS) is designed to decouple metadata management from actual data storage. But how does a client (like a Spark job or command-line tool) know where to read or write the