Tag: Data
All the articles with the tag "Data".
-
Batch Means Two Different Things: Why the Term Became Confusing in Data Engineering
In data systems, some of the most common words are also the most overloaded. Few terms illustrate this better than batch. Historically, batch processing described a very specific operating model:
-
Hardening OAuth Token Management in Postman: Preventing Environment Cross-Contamination
When working with multiple third-party APIs (Zoom, HubSpot, Meta, etc.), a common operational risk in Postman is environment cross-contamination. Tokens may be overwritten unintentionally if the
-
Can You Know the Location of an IPv6 Address?
Example IPv6: 2600:100e:b0c7:7403:f88c:92d0:bc41:46ff. Short answer: only approximately, and with significant limitations. This article explains what can and cannot be inferred from an IPv6 address,
-
AWS Glue + Chargebee: Diagnosing CERTIFICATE_VERIFY_FAILED After TLS Chain Updates
Context: An AWS Glue job that consumes the Chargebee API begins failing with: SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate. The same
-
Why You Can’t Get Full Social Analytics from the HubSpot API (Even with Marketing Hub Pro)
Many teams assume that upgrading to Marketing Hub Professional unlocks full programmatic access to social media performance metrics. It does not. This article clarifies what is technically possible,
-
Hiding Personal Information in AWS Glue with Spark
Protecting personal data before analytics consumption is a core requirement in modern data platforms. In AWS-based lake architectures, this is typically achieved through data de-identification during
-
Modern Table Formats: Iceberg, Delta Lake, and Hudi
Data lakes made it possible to store raw data at scale, but they lacked the reliability and governance of data warehouses. Files could be dropped into storage (S3, HDFS, MinIO), but analysts struggled
-
Trino in Modern Architectures: SQL Queries on S3 and MinIO
The rise of cloud object storage has transformed how organizations build data platforms. Hadoop Distributed File System (HDFS) once dominated, but today services like Amazon S3, Google Cloud Storage