Finding K neighbours in an implicitly defined bipartite graph
tags:algorithm
incomplete
Outline
I thought of this idea while solving 1957F2. It is extremely easy, cute (at least compared to the official solution), and can be generalized, so I’ll share it here.
Problem
Given two sets of vertices $X$ and $Y$ and a boolean function $f: X \times Y \rightarrow {0, 1}$, we define a bipartite graph $G = (V = (X \cup Y), E = ((u, v) \mid u \in X, v \in Y, f(u, v) = 1))$
We need to find $k$ adjacent vertices in $G$ for every node $u \in X$.
Solution
Requirements
-
We can compute $f(u, v)$ in $O(x)$ time.
-
We can construct a data structure $D(Z)$ from some subset $Z \subseteq Y$ in $O(p(\vert Z \vert))$ time, such that we can query if there exists some $v \in Z$ such that $f(u, v) = 1$ in $O(q(\vert Z \vert))$ time.
Algorithm
Definitions
- $n = \vert X \vert$
- $m = \vert Y \vert$
- Let $Q(u, D(Z))$ be the query function which checks if there exists some $v \in Z$ such that $f(u, v) = 1$.
We will use block decomposition with $B$ as the block size. The algorithm works in the following way:
-
Divide $Y$ into $\lceil m/B \rceil$ disjoint subsets of size $B$. Let the $i$’th subset be $S_i$
-
Iterate over the subsets, while maintaining a list $L$ of all $u \in X$ for which we have found $< k$ neighbors till that point of time ($L = X$ initially).
-
At subset $S_i$, we will do the following:
-
Construct $D(S_i)$.
-
Create an empty list $H_i$, and add $u$ to $H_i$ (also delete it from $L$) for all $u \in L$ such that $Q(u, D(S_i)) = 1$.
-
Compute $f(u, v) \; \forall \; u \in H_i, v\in S_i$, and mark $v$ as a neighbor of $u$ if $f(u, v) = 1$.
-
For every $u \in H_i$ for which we haven’t found $k$ neighbors yet, add $u$ back to $L$.
-
Note
Complexity Analysis
The cumulative time for building all the data structures will be $O(\lceil m/B \rceil \cdot p(B))$.
For queries:
-
At every $S_i$, we query $Q(u, D(S_i)) \; \forall \; u \in L$. Since $\vert L \vert \leq n$, this takes $O(n \cdot \lceil m/B\rceil \cdot q(B))$ time.
-
It’s easy to see that every node $u \in X$ is added to $H_i$ for at most $k$ distinct values of $i$. Every time a $u$ is added to $H_i$, we compute $f(u, v)$ for $\leq B$ distinct values of $v$. This takes $O(n \cdot B \cdot k \cdot x)$ time.
Therefore the total time taken is:
\[O(n \cdot B \cdot k \cdot x + \lceil m/B \rceil \cdot (p(B) + n\cdot q(B)))\]The choice of $B$ is highly problem dependent, but in most setups where this idea could be used, I think we would have $k \lll n$ and $q$ would be pretty fast. I recommend solving the example problem to understand why the complexity is decent if you think it is garbage.
Example Problems
1. CF-1957F2
You are given an undirected tree of $n$ nodes. Each node $v$ has a value $a_v$ written on it. You have to answer queries related to the tree.
You are given $q$ queries. In each query, you are given 5 integers, $u_1$, $v_1$, $u_2$, $v_2$, and $k$. Denote the count of nodes with value $c$ on the path $u_1 \to v_1$ with $x_c$, and the count of nodes with value $c$ on the path $u_2 \to v_2$ with $y_c$. If there are $z$ such values of $c$ such that $x_c \neq y_c$, output any $\min(z, k)$ such values in any order.