执行交换操作后的最小汉明距离

难度:

标签:

题目描述

代码结果

运行时间: 226 ms, 内存: 51.2 MB

/*
 * Problem: Given two integer arrays source and target of length n, and an array allowedSwaps where each allowedSwaps[i] = [ai, bi] indicates that you can swap the elements at indices ai and bi in the source array any number of times and in any order.
 * The Hamming distance between two arrays of the same length is the number of positions where the elements are different.
 * Return the minimum Hamming distance after performing any number of allowed swaps on the source array.
 */

import java.util.*;
import java.util.stream.*;

public class Solution {
    public int minimumHammingDistance(int[] source, int[] target, int[][] allowedSwaps) {
        int n = source.length;
        UnionFind uf = new UnionFind(n);
        Arrays.stream(allowedSwaps).forEach(swap -> uf.union(swap[0], swap[1]));
        Map<Integer, Map<Integer, Long>> componentMap = new HashMap<>();
        IntStream.range(0, n).forEach(i -> {
            int root = uf.find(i);
            componentMap.computeIfAbsent(root, k -> new HashMap<>()).merge(source[i], 1L, Long::sum);
        });
        return (int) IntStream.range(0, n).filter(i -> {
            int root = uf.find(i);
            Map<Integer, Long> component = componentMap.get(root);
            if (component.getOrDefault(target[i], 0L) > 0) {
                component.put(target[i], component.get(target[i]) - 1);
                return false;
            }
            return true;
        }).count();
    }

    class UnionFind {
        private int[] parent, rank;

        public UnionFind(int size) {
            parent = new int[size];
            rank = new int[size];
            IntStream.range(0, size).forEach(i -> {
                parent[i] = i;
                rank[i] = 1;
            });
        }

        public int find(int x) {
            if (parent[x] != x) {
                parent[x] = find(parent[x]);
            }
            return parent[x];
        }

        public void union(int x, int y) {
            int rootX = find(x);
            int rootY = find(y);
            if (rootX != rootY) {
                if (rank[rootX] > rank[rootY]) {
                    parent[rootY] = rootX;
                } else if (rank[rootX] < rank[rootY]) {
                    parent[rootX] = rootY;
                } else {
                    parent[rootY] = rootX;
                    rank[rootX]++;
                }
            }
        }
    }
}

解释

方法:

这个题解使用了并查集（union-find）数据结构来管理元素交换的可能性。首先，通过allowedSwaps提供的索引对，构建并查集，以此确定source中哪些元素可以通过交换被重新组织。接着，使用一个映射（字典）来记录每个并查集根节点下所有元素的出现频率。最后，遍历target数组，并尝试从相对应的并查集根节点的字典中删除target中的元素。如果在字典中找不到相应元素，汉明距离就增加1。这样，最终的汉明距离就是source和target在所有允许的交换操作后，仍然不匹配的元素总数。

时间复杂度:

O(n)

空间复杂度:

O(n)

代码细节讲解

🦆

并查集在这个问题中如何确保元素可以自由交换？它是如何处理多个允许交换的组合的？

▷

并查集通过将可以互相交换的元素组织成一个集合来确保这些元素可以自由重组。每次调用union操作，它将两个元素的根节点合并，这样所有连接的元素最终都会共享同一个根节点。当多个交换操作涉及相互连接的元素时，这些元素通过不断的union操作最终都会归属于同一个根节点下，从而形成一个可以自由交换元素的大集合。

🦆

题解中提到构建元素频率字典，这个字典是在并查集的哪个阶段构建的？为什么选择在那个特定的时刻构建？

▷

元素频率字典是在构建并查集之后、计算汉明距离之前构建的。在所有的union操作完成后，每个元素都已经被分配到一个确定的集合中，此时构建频率字典可以准确地记录下每个集合中不同元素的出现次数。这样做的原因是，只有在集合构建完成后，我们才能得到每个元素最终所属的集合，进而正确统计和利用这些信息来与target数组进行比较和匹配。

🦆

如何处理当source和target数组中存在重复元素时的匹配问题？

▷

题解中通过构建每个集合的元素频率字典来处理重复元素的匹配问题。这个字典记录了每个集合中每种元素出现的次数。在计算汉明距离时，对于target中的每个元素，程序会尝试从相应集合的字典中减少这个元素的计数。如果字典中该元素的计数已经为零或者元素不存在于字典中，就增加汉明距离。这种方法确保了即使在source和target中元素有重复时，也能正确地匹配相同元素的数量，只有无法匹配的元素才会增加汉明距离。

🦆

在计算最小汉明距离时，如果target中的元素在对应的并查集组中不存在，为什么直接增加汉明距离？会有什么潜在的遗漏吗？

▷

当target中的元素在对应的并查集组中不存在时，直接增加汉明距离是因为这表示source数组中的元素，即使经过所有允许的交换，仍无法匹配target中的该元素。增加汉明距离是为了反映这种不匹配的情况。潜在的遗漏可能发生在错误地管理并查集或元素频率字典时，例如并查集的合并操作没有正确执行，或者频率统计有误。但如果这些数据结构和操作都正确无误，直接增加汉明距离是合理且必要的，因为它确实反映了两个数组在经过所有可能的合法交换后的差异。

执行交换操作后的最小汉明距离

题目描述

代码结果

解释

代码细节讲解

相关问题