The one who doesn’t match

A very common C interview problem is the following:

In an array with n positive integers, all values appear twice with the exception of one, that appears only once. Can we find it using constant space?

The usual trick is to use bitwise XOR in the following way,

```unsigned answer1(const unsigned *a, size_t n)
{
unsigned ret = 0;
for (size_t i = 0; i < n; i++)
ret ^= a[i];
return ret;
}
```

and it works due to the properties `a^b == b^a`, `a^(b^c) == (a^b)^c`, `a^0 == a` and `a^a == 0` (following the usual conventions, we assume that constant space refers to a constant number of computer words).

It is fairly straightforward to extend that solution to

In an array with n positive integers, all values appear k times with the exception of one, that appears only once. How much space and time do we need to find it?

(hint: XOR can be seen as addition modulo 2)

but what extensions are possible? If we generalize it to

In an array with n positive integers, all values appear k times with the exception of m values, that appear p times. How much space and time doe we need to find them?

a trivial upper bound is O(n) space and O(n log n) time, achievable by just sorting and counting. We know we can do better in some cases, but how can we characterize them?

Some answers in the next post (hopefully it won’t take one year to arrive!).