Let's say we have sequence A
and subsequence of A
, called B
. I need to determine which elements of sequence A
could be potentially used to construct subsequence B
.
For example, assume A = [1, 3, 2, 1, 2, 3, 1, 3, 2]
and B = [1, 3, 1, 2]
To construct B
from A
, we could use elements at indexes:
- 1, 2, 4, 5
- 1, 2, 4, 9
- 1, 2, 7, 9
- 1, 6, 7, 9
So our answer should look like [true, true, false, true, true, true, true, false, true]
, where true
means that it's possible that element at i-th
index was used to construct subsequence B
.
It would be easy to solve with O(2^N * M)
time complexity (generate all subsequences of A
and compare them with B
). But I need solution that works in something like O(NlogN)
.
I figured out solution that seems working when all A
elements (so B
also) are unique.
I iterate over A
, and for every i-th
element I check whether:
A[i]
exists inB
, otherwise we can discard thisi-th
element- Let's say
j
is equal to index whereA[i]
was found inB
. We check whetheri >= j
, otherwise we can discard thisi-th
element, because ifj > i
, then our subsequence won't maintain order - We check if any neigbhour of
B[j]
, soB[j - 1]
orB[j + 1]
is at the right side ofA[i]
(all elements whose indexes are greater thani
). Now, because the elements are unique, we can say that this specificA[i]
element must have been used at someB[j]
position, because there will always be oneA
subsequence that is equal toB
.
That's how it could look in code:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
bool val_exists(const vector<pair<int, int>>& A_val_sorted, int val, int min_idx) {
if (val == -1) {
return true;
}
if (min_idx == A_val_sorted.size())
return false;
int l = 0;
int r = A_val_sorted.size() - 1;
while (l <= r) {
int mid = (l + r) / 2;
if (A_val_sorted[mid].first == val) {
if (A_val_sorted[mid].second >= min_idx)
return true;
l = mid + 1;
}
else if (A_val_sorted[mid].first < val) {
l = mid + 1;
}
else {
r = mid - 1;
}
}
return false;
}
int main()
{
int n, m;
cin >> n >> m;
// val, original index
vector<pair<int, int>> A_val_sorted(n);
vector<pair<int, int>> B_val_sorted(m);
vector<int> A(m);
for (int i = 0; i < n; ++i) {
cin >> A_val_sorted[i].first;
A_val_sorted[i].second = i;
}
for (int i = 0; i < m; ++i) {
cin >> B_val_sorted[i].first;
B_val_sorted[i].second = i;
A[i] = B_val_sorted[i].first;
}
sort(A_val_sorted.begin(), A_val_sorted.end());
sort(B_val_sorted.begin(), B_val_sorted.end());
vector<int> result(n);
for (const pair<int, int>& b : A_val_sorted) {
int val = b.first;
int idx = b.second;
int l = 0;
int r = m - 1;
bool valid = false;
while (l <= r) {
int mid = (l + r) / 2;
if (B_val_sorted[mid].first == val) {
int right_ngbr_idx = B_val_sorted[mid].second + 1;
if (B_val_sorted[mid].second <= idx && val_exists(A_val_sorted, (right_ngbr_idx < A.size() ? A[right_ngbr_idx] : -1), idx + 1)) {
valid = true;
break;
}
r = mid - 1;
}
else if (B_val_sorted[mid].first < val) {
l = mid + 1;
}
else {
r = mid - 1;
}
}
result[idx] = valid;
}
for (int i = 0; i < n; ++i)
std::cout << result[i] << ' ';
}
I know, for the case where elements are unique, it can be done much easier. But this approach provides idea on how to solve this problem when elements are non-unique.
Instead of checking one of the neigbhours, we can just check whole left and right part from B[j]
. Lets say N = A.size()
and M = B.size()
. This solution leads to O(NlogN + MlogM + N*M*logM)
, because we can sort the subsequences, store the original indexes etc. and use binary search to find A[i]
in B
.
It's too slow. How can I solve it for big sequences, like N = 10^5
?