Divide and conquer… the ‘if-then’ rules in your data

FCA

Algorithm

Implications

Divide-and-conquer

Knowledge representation

The ‘CARVE’ algorithm is a super-fast ‘divide-and-conquer’ method for finding data concepts. We upgraded it to ‘CARVE+’ so it can find all the ‘if-then’ logical rules at the same time, giving us the best of both worlds.

Author

Domingo López-Rodríguez, Manuel Ojeda-Hernández, Tim Pattison

Published

19 April 2025

In Formal Concept Analysis (FCA), there are two main treasures we hunt for:

1. The Concepts: The hidden groups and clusters in the data.
2. The Implications: The “if-then” rules that govern the data’s logic.
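To make these two notions concrete, here is a minimal Python sketch on a tiny, made-up dataset. The animals, the attribute names, and the helper functions (`extent`, `intent`, `concepts`, `holds`) are all assumptions for illustration. The enumeration is deliberately brute force and is not the CARVE algorithm.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of attributes it has.
CONTEXT = {
    "duck":  {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "trout": {"swims", "lays_eggs"},
    "bat":   {"flies"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in CONTEXT.items() if attrs <= has)


def intent(objs):
    """Attributes shared by every object in objs."""
    shared = set(ATTRIBUTES)
    for o in objs:
        shared &= CONTEXT[o]
    return frozenset(shared)


def concepts():
    """Brute force: close every attribute subset and collect the distinct pairs."""
    found = set()
    for size in range(len(ATTRIBUTES) + 1):
        for attrs in combinations(sorted(ATTRIBUTES), size):
            ext = extent(frozenset(attrs))
            found.add((ext, intent(ext)))
    return found


def holds(premise, conclusion):
    """premise -> conclusion holds if every object with premise also has conclusion."""
    return all(conclusion <= has for has in CONTEXT.values() if premise <= has)


if __name__ == "__main__":
    for ext, itt in sorted(concepts(), key=lambda c: len(c[0])):
        print(sorted(ext), "<->", sorted(itt))
    print(holds({"swims"}, {"lays_eggs"}))  # every swimmer here lays eggs
    print(holds({"flies"}, {"lays_eggs"}))  # the bat is a counterexample
```

Each printed pair is a concept: a group of objects together with exactly the attributes they all share. The two `holds` calls test candidate “if-then” rules against the same table. Brute force like this is exponential in the number of attributes, which is why faster strategies matter.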

Finding these is computationally hard. A few years ago, the CARVE algorithm was created as a powerful “divide-and-conquer” strategy to find the concepts much faster. It works by breaking a big problem into smaller pieces, solving them, and stitching the results back together.

But it had one major limitation: it only worked for the concepts. It couldn’t find the implications.

In our 2025 paper accepted to Knowledge-Based Systems (Q1), we present the solution: an upgraded algorithm that does both.

🧐 The problem: a tale of two algorithms

This created an annoying choice for data scientists. You had to:

* Use the fast CARVE algorithm to get the concepts.
* …and, if you also wanted the implications, run a different, slow, “monolithic” algorithm (like LinCbO or NextClosure) on the whole dataset.

You couldn’t get both speed and full knowledge at the same time. We wanted to fix that.

💡 Our solution: “CARVE+” for implications

We asked: can we “augment” the fast, divide-and-conquer logic of CARVE so that it finds the implications as it goes?

The answer is yes, but the “stitching together” part (the synthesis phase) is incredibly tricky. It’s not as simple as just merging the lists of rules from the small pieces. A rule that’s true in a small piece might not be true (or might be redundant) when you look at the whole picture.
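The pitfall can be seen on a toy example. The sketch below uses a made-up table and an arbitrary split by objects; both are assumptions for illustration and the split is not the actual CARVE decomposition. It only shows why rules found in one piece cannot simply be copied into the final result.

```python
# Toy data (hypothetical): object -> set of attributes it has.
WHOLE = {
    "duck":  {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "trout": {"swims", "lays_eggs"},
    "bat":   {"flies"},
}

# An arbitrary split by objects, chosen only for illustration.
PIECE_A = {name: WHOLE[name] for name in ("duck", "eagle")}
PIECE_B = {name: WHOLE[name] for name in ("trout", "bat")}


def holds(context, premise, conclusion):
    """True if every object in context that has premise also has conclusion."""
    return all(conclusion <= attrs for attrs in context.values() if premise <= attrs)


# Candidate rule: "if it flies, then it lays eggs".
RULE = ({"flies"}, {"lays_eggs"})

if __name__ == "__main__":
    print(holds(PIECE_A, *RULE))  # True: duck and eagle both lay eggs
    print(holds(PIECE_B, *RULE))  # False: the bat flies but lays no eggs
    print(holds(WHOLE, *RULE))    # False: the rule does not survive the merge
```

Naively taking the union of the rules found in each piece would keep “flies implies lays_eggs”, which is wrong for the whole table. A correct synthesis step has to filter or repair such rules.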

Our main contribution is a new set of mathematical rules for this synthesis phase. We formally proved that our new rules can correctly and completely reconstruct the entire set of implications for the original dataset from the implications of its sub-parts.

We implemented this new, refined method in a novel algorithm we call CARVE+.

[Figure: a conceptual image showing a complex rulebook being ‘divided’ into small pages and ‘conquered’ by reassembling them.]

Our ‘CARVE+’ algorithm applies a divide-and-conquer strategy not just to concepts, but to the logical rules (implications) themselves.

🚀 The results: fast, sound, and complete

We proved that CARVE+ is sound and complete: it returns only implications that hold in the data (no extras), and every valid implication follows from the ones it returns (none missing).
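For a small table, soundness and completeness of a rule set can be checked by brute force. A rule set is sound and complete exactly when, for every set of attributes, closing it under the rules gives the same result as closing it in the data. The sketch below is an illustration on made-up data with hypothetical helper names (`context_closure`, `rule_closure`, `is_sound_and_complete`). It is a checker only, not part of CARVE+.

```python
from itertools import combinations

# Toy formal context (hypothetical data): object -> set of attributes it has.
CONTEXT = {
    "duck":  {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "trout": {"swims", "lays_eggs"},
    "bat":   {"flies"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def context_closure(attrs):
    """Attributes shared by all objects that have attrs (all attributes if none do)."""
    closed = set(ATTRIBUTES)
    for has in CONTEXT.values():
        if attrs <= has:
            closed &= has
    return frozenset(closed)


def rule_closure(attrs, rules):
    """Apply the rules repeatedly until no rule adds anything new."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return frozenset(closed)


def is_sound_and_complete(rules):
    """Compare both closures on every attribute subset (exponential, toy sizes only)."""
    for size in range(len(ATTRIBUTES) + 1):
        for combo in combinations(sorted(ATTRIBUTES), size):
            subset = frozenset(combo)
            if rule_closure(subset, rules) != context_closure(subset):
                return False
    return True


GOOD = [({"swims"}, {"lays_eggs"})]
UNSOUND = GOOD + [({"flies"}, {"lays_eggs"})]  # contains a rule the bat violates
INCOMPLETE = []                                # misses "swims -> lays_eggs"

if __name__ == "__main__":
    print(is_sound_and_complete(GOOD))
    print(is_sound_and_complete(UNSOUND))
    print(is_sound_and_complete(INCOMPLETE))
```

On this table, the single rule “swims implies lays_eggs” passes the check, while the other two rule sets fail: one adds a rule that the data contradicts, the other omits a rule that the data supports.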

And the best part? It’s fast. We benchmarked CARVE+ against the classic, monolithic algorithms. The results show that our divide-and-conquer approach “compares favourably,” especially on the kind of large, structured data where CARVE shines.

🔬 Why does this matter?

This work removes the “either/or” choice data scientists had to make. You no longer have to choose between finding concepts and finding implications, or between a fast algorithm and a complete one.

CARVE+ provides a unified, high-performance “divide-and-conquer” engine to extract all the knowledge (both concepts and logical rules) from a dataset. This makes large-scale knowledge discovery and logical rule mining feasible for datasets that were previously too complex to handle.

📖 The full paper

For the complete technical breakdown, the new synthesis rules, and the full experimental comparison, you can read the original journal article.

Systems of implications obtained using the CARVE decomposition of a formal context. Authors: Domingo López-Rodríguez, Manuel Ojeda-Hernández, Tim Pattison. Journal: Knowledge-Based Systems (vol. 318, 113475)

[DOI Link] | [Article Website]