The Hopcroft-Karp algorithm is a classical algorithm used to find the maximum matching in a bipartite graph. It improves upon earlier algorithms by finding matchings in O(√V * E) time, where V is the number of vertices and E is the number of edges.
A matching in a bipartite graph is a set of edges such that no two edges share a vertex. A maximum matching is a matching that contains the largest possible number of edges.
Overview of the Algorithm
The Hopcroft-Karp algorithm alternates between two main steps:
- Breadth-First Search (BFS): This step is used to find the shortest path from unmatched vertices on one side of the bipartite graph to unmatched vertices on the other side (called an augmenting path).
- Depth-First Search (DFS): This step is used to augment the matching along the augmenting paths found by BFS, effectively increasing the size of the matching.
The process is repeated until no more augmenting paths can be found.
Definitions
- A bipartite graph has two sets of vertices, say U and V, where edges only exist between vertices in different sets.
- A matching is a set of edges such that no two edges share a common vertex.
- An augmenting path is a path that alternates between edges not in the current matching and edges in the current matching, starting and ending at unmatched vertices.
Steps of the Algorithm
- Initialization:
- Start with an empty matching.
- The graph is divided into two sets U and V.
- BFS (Finding Augmenting Paths):
- Perform a BFS to find all the shortest augmenting paths from unmatched vertices in U to unmatched vertices in V.
- The idea is to find the shortest augmenting path in terms of the number of edges.
- DFS (Augment the Matching):
- Use DFS to update the matching by alternating along the augmenting paths found in the BFS step.
- If an augmenting path is found, we increase the size of the matching.
- Repeat:
- The algorithm repeats the BFS and DFS steps until no augmenting path is found, meaning the matching cannot be further augmented.
Pseudocode
pythonCopy codedef hopcroft_karp(graph, U, V):
matching = {}
dist = {}
def bfs():
queue = []
for u in U:
if u not in matching:
dist[u] = 0
queue.append(u)
else:
dist[u] = float('inf')
dist[None] = float('inf')
for u in queue:
if dist[u] < dist[None]:
for v in graph[u]:
pair_v = matching.get(v)
if dist[pair_v] == float('inf'):
dist[pair_v] = dist[u] + 1
queue.append(pair_v)
return dist[None] != float('inf')
def dfs(u):
if u is not None:
for v in graph[u]:
pair_v = matching.get(v)
if dist[pair_v] == dist[u] + 1:
if dfs(pair_v):
matching[v] = u
matching[u] = v
return True
dist[u] = float('inf')
return False
return True
matching_size = 0
while bfs():
for u in U:
if u not in matching:
if dfs(u):
matching_size += 1
return matching_size
Key Steps in the Code
- BFS Function:
- It starts from unmatched vertices in U, and tries to find the shortest augmenting paths.
- If an augmenting path is found, the
bfs()
function returnsTrue
.
- DFS Function:
- It performs a depth-first search to augment the current matching along the augmenting paths discovered by the BFS step.
- If successful, it updates the matching.
- Main Loop:
- The BFS and DFS are repeatedly called until no more augmenting paths can be found.
Example
Let’s consider a simple bipartite graph:
- U = {1, 2, 3}
- V = {A, B, C}
- Edges = {(1, A), (2, A), (2, B), (3, B), (3, C)}
- Initial Matching: Start with an empty matching.
- BFS finds the shortest augmenting paths.
- DFS augments the matching along these paths.
- Repeat until no augmenting path is found, resulting in the maximum matching.
Time Complexity
The time complexity of the Hopcroft-Karp algorithm is O(√V * E), which makes it highly efficient for large bipartite graphs.
Applications
- Job Assignment: Matching workers (vertices in U) to jobs (vertices in V).
- Maximum Flow Problems: Solving network flow problems in bipartite graphs.
- Network Design: Optimal pairing of resources and clients.
This algorithm is particularly useful when working with large graphs where maximum matching is a critical problem, such as in resource allocation, scheduling, and network routing.