Union-find
Table of contents
Introduction
A union-find data structure (also known as a disjoint set), is used to efficiently store items in disjoint (non-overlapping) sets.
The key union-find operations are:
make_set(x)
: make set withx
as a member of its set.find(x)
: return the root element ofx
.union(x,y)
: combine the sets ofx
andy
.
Union-find represents sets as a backwards tree, where each node has a corresponding parent. The root node defines which set the node and all its children are in.
A root node has its parent set to itself, to identify that it’s the root in a set [1, P. 199].
Merging sets can be done by setting the parent of one set to the parent of another. In the case of a merge, the smallest subtree becomes the parent of the other [1, P. 199].
At most doublings can occur before all sets have been merged. That means find()
and union()
are both operations [1, P. 201].
Representing a union-find
A union-find can be represented as an array of nodes:
typedef struct {
int p[SET_SIZE+1]; // parent element
int size[SET_SIZE+1]; // number of elements in subtree i
int n; // number of elements in set
} set_union;
The following are implementations of the key operations:
void set_union_make_set(set_union *s, int x) {
s->p[i] = i;
s->size[i] = 1;
}
void set_union_init(set_union *s, int n) {
for (int i = 1; i <= n; i++) {
set_union_make_set(s, i)
}
s->n = n;
}
int set_union_find(set_union *s, int x) {
if (s->p[x] == x) {
return(x);
}
return set_union_find(s, s->p[x]);
}
int set_union_union(set_union *s, int s1, int s2) {
int r1, r2; /* roots of sets */
r1 = find(s, s1);
r2 = find(s, s2);
if (r1 == r2) return;
if (s->size[r1] >= s->size[r2]) {
s->size[r1] = s->size[r1] + s->size[r2];
s->p[r2] = r1;
}
else {
s->size[r2] = s->size[r1] + s->size[r2];
s->p[r1] = r2;
}
}
References
- [1] S. Skiena, The Algorithm Design Manual, 2nd ed. Springer, 2008.