Union-find

Table of contents

  1. Introduction
  2. Representing a union-find
  3. References

Introduction

A union-find data structure (also known as a disjoint set), is used to efficiently store items in disjoint (non-overlapping) sets.

The key union-find operations are:

  • make_set(x): make set with x as a member of its set.
  • find(x): return the root element of x.
  • union(x,y): combine the sets of x and y.

Union-find represents sets as a backwards tree, where each node has a corresponding parent. The root node defines which set the node and all its children are in.

A root node has its parent set to itself, to identify that it’s the root in a set. [1, P. 199].

Figure: Two sets and their representation in an array

Merging sets can be done by setting the parent of one set to the parent of another. In the case of a merge, the smallest subtree becomes the parent of the other [1, P. 199].

At most lg2n doublings can occur before all sets have been merged. That means find and union are both O(log n) operations [1, P. 201].

Representing a union-find

A union-find can be represented as an array of nodes:

typedef struct {
  int p[SET_SIZE+1]; // parent element
  int size[SET_SIZE+1]; // number of elements in subtree i
  int n; // number of elements in set
} set_union;

The following are implementations of the key operations:


void set_union_make_set(set_union *s, int x) {
  s->p[i] = i;
  s->size[i] = 1;
}

void set_union_init(set_union *s, int n) {
  for (int i = 1; i <= n; i++) {
    set_union_make_set(s, i)
  }
  s->n = n;
}

int set_union_find(set_union *s, int x) {
  if (s->p[x] == x) {
    return(x);
  }
  return set_union_find(s, s->p[x]);
}

int set_union_union(set_union *s, int s1, int s2) {
  int r1, r2; /* roots of sets */
  r1 = find(s, s1);
  r2 = find(s, s2);
  if (r1 == r2) return;

  if (s->size[r1] >= s->size[r2]) {
    s->size[r1] = s->size[r1] + s->size[r2];
    s->p[r2] = r1;
  }
  else {
    s->size[r2] = s->size[r1] + s->size[r2];
    s->p[r1] = r2;
  }
}

[1, P. 200]

References

  1. [1] S. Skiena, The Algorithm Design Manual, 2nd ed. Springer, 2008.