Problem statement: you are given a sorted array of positive integers, you have to identify k-th non-present positive integer element.
ย
Example:
array = [1, 2, 3, 5, 7, 8, 15]
^ missing 4
^ missing 6
^ missing 9, 10, 11, 12, 13, 14
ย
Note that by taking k-th element and subtracting its index, we get the number of skipped elements up to this point (plus one):
ย
array[0] - 0 = 1 # 0 elements skipped + 1
array[1] - 1 = 1 # the same
array[3] - 3 = 5 - 3 = 2 # element `4` is skipped + 1
array[6] - 6 = 15 - 6 = 9 # 8 elements skipped + 1
ย
Therefore, the sequence is non-decreasing. By using binary search, we can identify the left and right boundaries for the property โthe number of skipped elements is less than or equal to kโ.
ย
def find_kth_nonpresent(array: list[int], k: int) -> int:
left = -1
right = len(array)
while right - left > 1:
mid = (left + right) // 2
if array[mid] - mid - 1 < k:
left = mid
else:
right = mid
# discussion continues
ย
In the above code, we make sure that
- the
left
boundary satisfies the search condition. If we mathematically suppose that minus first element is , then we have no skipped elements, which is less than k.
- the
right
boundary does not satisfy the search condition. Indeed, appending a positive infinity to the array, we have an infinite amount of skipped elements.
ย
At the end of this piece, we have the
left
and right
positions so that at left
, there are less than k
numbers skipped, and at right
there are at least k
numbers skipped.ย
From this information we can infer how many numbers were skipped at position
left
and how much do we need to add to get k
th number.# continuation of find_kth_nonpresent(array)
if left == -1:
return k
n_elements_skipped = array[left] - left - 1
return array[left] + (k - n_elements_skipped)
ย
Inequality: strict or non-strict?
When constructing a rule for the binary search, we need to decide whether we are going to have a strict or a non-strict inequality for the number of skipped elements. This has to be decided case-by-case. The idea is to squeeze the k-th missing number between
array[left]
and array[right]
. If the inequality were non-strict, we could have skipped exactly k numbers at position left
and so, the k-th missing element would be positioned before left.ย
Debugging remarks
At the first go, I did not manage to get the formula right. In the search condition, I used
array[mid] - mid + 1
instead of array[mid] - mid - 1
. Sadly, the same pattern is reused for calculation of the number of elements skipped, so I replicated this bug twice due to code duplication. It is more reasonable to write a helper function, even though the overall solution is quite short, and use it for the property verification, as well as for further computation.This modification makes the binary search more readable as well.
def n_elements_skipped(array: list[int], position: int) -> int:
return array[position] - position - 1
def find_kth_nonpresent(array: list[int], k: int) -> int:
left = -1
right = len(array)
while right - left > 1:
mid = (left + right) // 2
if n_elements_skipped(array, mid) < k: # better
left = mid
else:
right = mid
if left == -1:
return k
return array[left] + (k - n_elements_skipped(array, left))