Skip to content
>GLB_
Go back

How PostHog Uses ClickHouse for High-Performance Product Analytics

Modern product analytics platforms must process billions of events while still delivering low-latency queries for dashboards, funnels, and retention analysis. PostHog addresses this requirement by building its analytics engine on top of ClickHouse, a column-oriented OLAP database designed for large-scale analytical workloads.

This article focuses exclusively on how ClickHouse is used within PostHog, from data modeling to operational considerations.


Why ClickHouse Was Chosen

ClickHouse is optimized for:

These characteristics align well with PostHog’s core workload: append-only event ingestion followed by heavy analytical queries such as funnels, trends, and cohort analysis.


Data Stored in ClickHouse

In PostHog, ClickHouse is responsible for storing event data, which represents user actions such as page views, button clicks, or custom business events.

Key characteristics:

PostHog abstracts the physical schema from end users, but internally events are stored in distributed ClickHouse tables backed by sharded local tables.


Query Layer: HogQL on Top of ClickHouse

Direct SQL access to ClickHouse is possible in self-hosted deployments, but PostHog strongly encourages the use of HogQL, its own query language.

HogQL:

This design allows PostHog to evolve its internal ClickHouse schema without breaking user queries.


Performance Considerations

PostHog’s ClickHouse usage follows several core OLAP principles:

A notable optimization is the historical use of “persons on events” to reduce expensive joins between user and event tables.


Operational Responsibilities in Self-Hosted Deployments

When self-hosting PostHog, ClickHouse becomes the most critical component to operate correctly.

Key operational areas:


What ClickHouse Is Not Used For

It is important to clarify that ClickHouse in PostHog is not responsible for:

ClickHouse is used strictly as the analytical engine.


Conclusion

ClickHouse is the backbone of PostHog’s analytics capabilities. Its columnar design, compression, and query speed make it well-suited for large-scale event analytics. PostHog’s architecture leverages ClickHouse’s strengths while mitigating its limitations through append-only modeling, query abstraction (HogQL), and careful operational practices.

For teams evaluating PostHog or operating it at scale, understanding ClickHouse is essential—not as a general database, but as a purpose-built analytical engine.


Share this post:

Previous Post
Extracting and Managing Access Tokens in Postman
Next Post
Hiding Personal Information in AWS Glue with Spark