CLUST3R

Graphics · CUDA · 2026

GPU-Accelerated Spatial Clustering for 3D Gaussian Splatting

Overview

CLUST3R is a GPU-accelerated spatial clustering library for 3D Gaussian Splatting scenes. It seeds k-means with k-means|| (a parallel variant of k-means++), uses a spatial hash grid for expected O(1) neighbor queries, and applies an adaptive rebalancing engine that merges and splits clusters after convergence.

Designed as a preprocessing stage for the GEODE pipeline, CLUST3R partitions large 3DGS scenes into spatially coherent clusters for downstream classification and analysis. All stages run entirely on the GPU, with no CPU-GPU transfers during clustering. Developed at Concordia University as part of a High-Performance C++/CUDA Reading Course.

Features

K-means|| Clustering

Parallel k-means++-style initialization with an O(log k)-competitive guarantee in expectation. Typically converges in fewer iterations than uniformly random seeding.

Spatial Hash Grid

Expected O(1) neighbor-cell lookups via a hash grid built in parallel in three passes: cell assignment, prefix sum, and compaction.

Adaptive Rebalancing

Automatically merges underweight clusters and splits overweight ones after convergence, ensuring balanced partitions.

Fully GPU-Accelerated

All four pipeline stages run entirely on GPU via CUDA. No CPU-GPU data transfers during clustering.

GEODE Integration

Designed as a preprocessing step for the GEODE pipeline — partitions 3DGS scenes into clusters for downstream classification.

Configurable

Tunable target cluster size, maximum iteration count, and convergence tolerance. Scales from 50K-primitive game assets to large scenes with 10M+ primitives.

Pipeline

01

Spatial Hash Grid

Assigns each Gaussian to a hash cell, computes prefix sums over the per-cell counts, and compacts the primitive indices so that each cell's members are contiguous, giving expected O(1) neighbor lookups.
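
The three passes can be illustrated with a CPU reference sketch. This is not CLUST3R's implementation: the names `Vec3`, `HashGrid`, `hash_cell`, `cell_of`, and `build_hash_grid`, the three-prime XOR hash, and the fixed table size are all assumptions made for illustration. On the GPU, each loop would presumably become a kernel or a library scan.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical layout: cell_start[c] .. cell_start[c + 1] delimits the
// slice of sorted_ids that belongs to hash bucket c.
struct HashGrid {
    float cell_size;
    std::uint32_t table_size;
    std::vector<std::uint32_t> cell_start;  // table_size + 1 entries
    std::vector<std::uint32_t> sorted_ids;  // point indices grouped by bucket
};

// Assumed hash: XOR of integer cell coordinates scaled by large primes.
inline std::uint32_t hash_cell(int cx, int cy, int cz, std::uint32_t table_size) {
    const std::uint32_t h = (static_cast<std::uint32_t>(cx) * 73856093u) ^
                            (static_cast<std::uint32_t>(cy) * 19349663u) ^
                            (static_cast<std::uint32_t>(cz) * 83492791u);
    return h % table_size;
}

inline std::uint32_t cell_of(const Vec3& p, float cell_size, std::uint32_t table_size) {
    return hash_cell(static_cast<int>(std::floor(p.x / cell_size)),
                     static_cast<int>(std::floor(p.y / cell_size)),
                     static_cast<int>(std::floor(p.z / cell_size)), table_size);
}

HashGrid build_hash_grid(const std::vector<Vec3>& pts, float cell_size,
                         std::uint32_t table_size) {
    HashGrid g;
    g.cell_size = cell_size;
    g.table_size = table_size;
    g.cell_start.assign(table_size + 1, 0);
    g.sorted_ids.resize(pts.size());

    // Pass 1: cell assignment, counting members per bucket (stored at c + 1).
    std::vector<std::uint32_t> cell(pts.size());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        cell[i] = cell_of(pts[i], cell_size, table_size);
        ++g.cell_start[cell[i] + 1];
    }

    // Pass 2: prefix sum, so cell_start[c] becomes the first slot of bucket c.
    for (std::uint32_t c = 0; c < table_size; ++c) {
        g.cell_start[c + 1] += g.cell_start[c];
    }

    // Pass 3: compaction, scattering each index into its bucket's slice.
    std::vector<std::uint32_t> cursor(g.cell_start.begin(), g.cell_start.end() - 1);
    for (std::size_t i = 0; i < pts.size(); ++i) {
        g.sorted_ids[cursor[cell[i]]++] = static_cast<std::uint32_t>(i);
    }
    return g;
}
```

Looking up a cell is then two array reads (`cell_start[c]` and `cell_start[c + 1]`) plus a scan of that slice. The cost is O(1) only in expectation, because distinct cells can hash to the same bucket and a caller would still need to filter by actual position.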

02

K-means|| Initialization

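Parallel seeding that oversamples candidate centers over O(log n) rounds while keeping a competitive approximation guarantee. This avoids both the k sequential passes that k-means++ requires and the poor starting points of uniformly random seeding.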
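
One oversampling round of k-means|| can be sketched on the CPU as follows. In the published algorithm, every point is sampled independently with probability min(1, l * d²(x, C) / φ), where d² is the squared distance to the nearest current candidate, φ is the sum of those distances, and l is an oversampling factor. The function names, the `Vec3` type, and the use of `std::mt19937` are assumptions for illustration and say nothing about how CLUST3R implements the stage.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

struct Vec3 { float x, y, z; };

inline float dist2(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Squared distance from each point to its nearest current candidate center.
std::vector<float> nearest_dist2(const std::vector<Vec3>& pts,
                                 const std::vector<Vec3>& centers) {
    std::vector<float> d2(pts.size(), std::numeric_limits<float>::max());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        for (const Vec3& c : centers) d2[i] = std::min(d2[i], dist2(pts[i], c));
    }
    return d2;
}

// p_i = min(1, oversampling * d2_i / phi), with phi the current total cost.
std::vector<float> sampling_probabilities(const std::vector<float>& d2,
                                          float oversampling) {
    double phi = 0.0;
    for (float d : d2) phi += d;
    std::vector<float> p(d2.size(), 0.0f);
    if (phi <= 0.0) return p;  // every point already coincides with a center
    for (std::size_t i = 0; i < d2.size(); ++i) {
        p[i] = std::min(1.0f, static_cast<float>(oversampling * d2[i] / phi));
    }
    return p;
}

// One round: each point is tested independently of the others, which is
// what lets the round run as a single data-parallel pass.
void oversample_round(const std::vector<Vec3>& pts, std::vector<Vec3>& centers,
                      float oversampling, std::mt19937& rng) {
    const std::vector<float> p =
        sampling_probabilities(nearest_dist2(pts, centers), oversampling);
    std::uniform_real_distribution<float> u(0.0f, 1.0f);
    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (u(rng) < p[i]) centers.push_back(pts[i]);
    }
}
```

After the final round the candidate set is usually larger than k. The published algorithm then weights each candidate by the number of points nearest to it and reduces the weighted candidates to k centers, for example with sequential k-means++. That step is omitted here.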

03

K-means Iteration

Repeated assignment and centroid-update passes until the convergence tolerance is met or the maximum iteration count is reached.
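
The loop structure can be sketched as a CPU reference version of standard Lloyd iteration. The source does not say how convergence is measured, so this sketch assumes the loop stops once no centroid moves farther than the tolerance. `KMeansResult` and `lloyd` are hypothetical names, not part of the CLUST3R API.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

inline float dist2(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

struct KMeansResult {
    std::vector<int> assignments;
    std::vector<Vec3> centroids;
    int iterations;
};

KMeansResult lloyd(const std::vector<Vec3>& pts, std::vector<Vec3> centroids,
                   int max_iterations, float tolerance) {
    const std::size_t k = centroids.size();
    std::vector<int> assign(pts.size(), 0);
    int it = 0;
    while (it < max_iterations) {
        ++it;
        // Assignment pass: each point picks its nearest centroid.
        for (std::size_t i = 0; i < pts.size(); ++i) {
            float best = std::numeric_limits<float>::max();
            for (std::size_t c = 0; c < k; ++c) {
                const float d = dist2(pts[i], centroids[c]);
                if (d < best) { best = d; assign[i] = static_cast<int>(c); }
            }
        }
        // Update pass: each centroid moves to the mean of its members.
        std::vector<Vec3> sum(k, Vec3{0.0f, 0.0f, 0.0f});
        std::vector<int> n(k, 0);
        for (std::size_t i = 0; i < pts.size(); ++i) {
            Vec3& s = sum[assign[i]];
            s.x += pts[i].x; s.y += pts[i].y; s.z += pts[i].z;
            ++n[assign[i]];
        }
        float max_shift2 = 0.0f;
        for (std::size_t c = 0; c < k; ++c) {
            if (n[c] == 0) continue;  // empty cluster: keep the old centroid
            const Vec3 m{sum[c].x / n[c], sum[c].y / n[c], sum[c].z / n[c]};
            const float s2 = dist2(m, centroids[c]);
            if (s2 > max_shift2) max_shift2 = s2;
            centroids[c] = m;
        }
        // Assumed criterion: stop when the largest centroid shift is within tolerance.
        if (max_shift2 <= tolerance * tolerance) break;
    }
    return {assign, centroids, it};
}
```

The assignment pass is independent per point, and the update pass is a reduction keyed by cluster ID. This brute-force version compares every point against every centroid. A spatial index could narrow that search, but nothing is assumed here about how CLUST3R does it.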

04

Rebalance Engine

Post-convergence merge/split pass that enforces cluster size constraints for downstream analysis compatibility.
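
The decision step of such a pass might look like the sketch below. The source gives no thresholds, so the 0.5x and 2x bounds relative to the target size are pure assumptions, as are the names `RebalancePolicy`, `cluster_sizes`, `classify`, and `plan_rebalance`. The sketch only decides which clusters to touch. How CLUST3R actually merges or splits them is not described in the source.

```cpp
#include <cstddef>
#include <vector>

enum class RebalanceAction { Keep, Merge, Split };

// Hypothetical policy: bounds are expressed as ratios of the target size.
struct RebalancePolicy {
    std::size_t target_size;
    float min_ratio = 0.5f;  // assumed: below this, the cluster is underweight
    float max_ratio = 2.0f;  // assumed: above this, the cluster is overweight
};

// Histogram of cluster IDs: how many primitives each cluster holds.
std::vector<std::size_t> cluster_sizes(const std::vector<int>& assignments,
                                       std::size_t k) {
    std::vector<std::size_t> sizes(k, 0);
    for (int a : assignments) ++sizes[static_cast<std::size_t>(a)];
    return sizes;
}

RebalanceAction classify(std::size_t size, const RebalancePolicy& p) {
    const float s = static_cast<float>(size);
    const float t = static_cast<float>(p.target_size);
    if (s < p.min_ratio * t) return RebalanceAction::Merge;
    if (s > p.max_ratio * t) return RebalanceAction::Split;
    return RebalanceAction::Keep;
}

std::vector<RebalanceAction> plan_rebalance(const std::vector<std::size_t>& sizes,
                                            const RebalancePolicy& p) {
    std::vector<RebalanceAction> plan;
    plan.reserve(sizes.size());
    for (std::size_t s : sizes) plan.push_back(classify(s, p));
    return plan;
}
```

With a target of 2000 under these assumed ratios, a cluster of 300 primitives would be flagged for merging and one of 4500 for splitting, while one of 2000 would be left alone.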

Performance Targets

Scale          Primitives   Target Time   Memory
Game Asset     50K          < 50 ms       < 100 MB
Medium Scene   500K         < 500 ms      < 500 MB
Full Scene     1M           < 2 s         < 1 GB
Large Scene    10M          < 20 s        < 4 GB

Usage

clust3r::ClusterConfig config;
config.target_cluster_size = 2000;
config.max_iterations      = 50;
config.convergence_tolerance = 1e-5f;

clust3r::Clusterer clusterer(config);
// positions: Gaussian centers to cluster; count: number of primitives
auto result = clusterer.cluster(positions, count);

// result.assignments  — cluster ID per primitive
// result.clusters     — vector of Cluster structs
// result.elapsed_ms   — total runtime
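
For a rough sense of scale, the number of clusters implied by a target size can be estimated with a ceiling division. This assumes the cluster count is approximately `count / target_cluster_size`, which the source does not state. `estimate_cluster_count` is a hypothetical helper and not part of the `clust3r` namespace.

```cpp
#include <cstddef>

// Assumed relationship: roughly one cluster per target_cluster_size primitives,
// rounded up so that no primitive is left without a cluster.
std::size_t estimate_cluster_count(std::size_t count, std::size_t target_cluster_size) {
    if (count == 0 || target_cluster_size == 0) return 0;
    return (count + target_cluster_size - 1) / target_cluster_size;
}
```

Under that assumption, a target of 2000 gives about 25 clusters for a 50K-primitive asset and about 5000 for a 10M-primitive scene, before any rebalancing changes the count.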

Requirements

CUDA Toolkit 11.0+
CMake 3.18+
C++17
CUB Library