Table of Contents
1. The production problem
The platform uses order history to discover product sets that are frequently bought together. This is the kind of backend workflow that looks fine in development, works on small datasets, and then suddenly collapses when a real customer brings enough scale.
Production logs showed the worst case clearly: around 6,300 orders and 1,500 unique products. On that input, the existing job never completed. No completion meant no generated sets and no downstream value.
That is the kind of bug we like most: not a cosmetic fix, but a chance to remove a structural limit.
2. Why Apriori failed
The existing implementation depended on Accord.NET's Apriori algorithm. Apriori works by generating candidate itemsets level by level, which becomes brutally expensive when the number of unique items rises. In this case, the theoretical shape of the problem was the real problem: combinatorial explosion.
Combinatorial complexity
Apriori scales with the candidate space. With enough unique products, that search space explodes.
Cancellation that did not cancel
The library exposed a CancellationToken, but the token was not checked inside the hot loops. In practice, cancellation support was dead code.
Extra overhead everywhere
There was conversion overhead, expensive lookups, and an indirect route through association rules just to get back to the frequent itemsets we actually needed.
The lesson was simple: this was not a tuning problem. It needed a different algorithm.
We also considered contributing a fix upstream. The plan was to open a pull request with an updated approach for Accord.NET, but the project repository is archived and no longer accepts active development. Because of that, we implemented and shipped the fix in our own stack.
3. What we shipped
We replaced Apriori with a custom FP-Growth implementation. FP-Growth avoids candidate generation, compresses transactions into an FP-Tree, and mines frequent itemsets directly through pattern growth. That makes it a much better fit for this workload.
What changed in the codebase
- A clean boundary between the business flow and the mining algorithm was introduced.
- The old per-algorithm flow was replaced with one unified, dependency-injected execution path.
- The algorithm was isolated into a dedicated machine-learning module.
- We added unified correctness, cancellation, and scale tests across all implementations.
- We benchmarked multiple FP-Growth variants and shipped V5 as the production winner.
The nicest part of this PR is that it does not just hide the old bottleneck. It makes the algorithm swappable, benchmarkable, and much easier to reason about in the future.
4. The benchmark results
BenchmarkDotNet runs on an Apple M4 with .NET 8 made the impact obvious. The old approach timed out on large data. The new one finished in milliseconds.
Small benchmark
1k orders / 100 products
Accord Apriori: 1,556 ms
FP-Growth V5: 1.9 ms
Large benchmark
10k orders / 2k products
Accord Apriori: timeout
FP-Growth V5: 11.4 ms
We also iterated through several internal variants. V5 won because it stored nodes in a flat list and used one shared dictionary per tree keyed by (parentId << 32 | item), which removed the per-node dictionary cost that limited earlier versions on large trees.
Correctness did not get traded away for speed. The PR kept all variants aligned against the reference behavior and finished with 44 passing tests.
5. Why this matters
Good performance work is rarely about micro-optimizing a hot loop in isolation. The real value comes from stepping back, verifying the bottleneck, and changing the shape of the solution when the current one is wrong.
That is what makes this change worth talking about. It fixed a production issue, simplified the architecture, improved observability, and left the system in a state where future algorithm work can be measured instead of guessed.
We intentionally keep implementation details compact here, but the key outcome is simple: the same workflow now scales reliably on large datasets.
Sitting on a backend job that times out in production?
We like ugly bottlenecks: old libraries, algorithmic dead ends, slow batch jobs, and systems that only break at scale. If that sounds familiar, let's take a look.
