最长相邻不相等子序列 II

难度:

标签:

题目描述

You are given a string array words, and an array groups, both arrays having length n.

The hamming distance between two strings of equal length is the number of positions at which the corresponding characters are different.

You need to select the longest subsequence from an array of indices [0, 1, ..., n - 1], such that for the subsequence denoted as [i₀, i₁, ..., i_k-1] having length k, the following holds:

For adjacent indices in the subsequence, their corresponding groups are unequal, i.e., groups[i_j] != groups[i_j+1], for each j where 0 < j + 1 < k.
words[i_j] and words[i_j+1] are equal in length, and the hamming distance between them is 1, where 0 < j + 1 < k, for all indices in the subsequence.

Return a string array containing the words corresponding to the indices (in order) in the selected subsequence. If there are multiple answers, return any of them.

Note: strings in words may be unequal in length.

Example 1:

Input: words = ["bab","dab","cab"], groups = [1,2,2]

Output: ["bab","cab"]

Explanation: A subsequence that can be selected is [0,2].

groups[0] != groups[2]
words[0].length == words[2].length, and the hamming distance between them is 1.

So, a valid answer is [words[0],words[2]] = ["bab","cab"].

Another subsequence that can be selected is [0,1].

groups[0] != groups[1]
words[0].length == words[1].length, and the hamming distance between them is 1.

So, another valid answer is [words[0],words[1]] = ["bab","dab"].

It can be shown that the length of the longest subsequence of indices that satisfies the conditions is 2.

Example 2:

Input: words = ["a","b","c","d"], groups = [1,2,3,4]

Output: ["a","b","c","d"]

Explanation: We can select the subsequence [0,1,2,3].

It satisfies both conditions.

Hence, the answer is [words[0],words[1],words[2],words[3]] = ["a","b","c","d"].

It has the longest length among all subsequences of indices that satisfy the conditions.

Hence, it is the only answer.

Constraints:

1 <= n == words.length == groups.length <= 1000
1 <= words[i].length <= 10
1 <= groups[i] <= n
words consists of distinct strings.
words[i] consists of lowercase English letters.

代码结果

运行时间: 481 ms, 内存: 16.3 MB

/*
 * 思路：
 * 1. 使用 Java Stream 处理数据。
 * 2. 使用 filter 过滤满足条件的组合。
 * 3. 使用 collect 收集最终的结果。
 */
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SolutionStream {
    public List<String> findLongestSubsequence(int n, String[] words, int[] groups) {
        return IntStream.range(0, n)
            .boxed()
            .flatMap(i -> IntStream.range(i + 1, n)
                    .filter(j -> isValidPair(words[i], words[j], groups[i], groups[j]))
                    .mapToObj(j -> new int[]{i, j}))
            .map(pair -> new ArrayList<String>() {{
                add(words[pair[0]]);
                add(words[pair[1]]);
            }})
            .max((a, b) -> Integer.compare(a.size(), b.size()))
            .orElse(new ArrayList<>());
    }

    private boolean isValidPair(String word1, String word2, int group1, int group2) {
        if (group1 == group2 || word1.length() != word2.length()) return false;
        int hammingDistance = 0;
        for (int i = 0; i < word1.length(); i++) {
            if (word1.charAt(i) != word2.charAt(i)) {
                hammingDistance++;
            }
            if (hammingDistance > 1) return false;
        }
        return hammingDistance == 1;
    }
}

解释

方法:

这个题解采用动态规划的方法来解决问题。定义 dp 数组 f[i] 为从索引 i 开始到 n-1，满足题目条件的最长子序列的长度。数组 from_idx 用来记录选择了哪个后续索引来形成最长子序列，便于最后重建答案。首先，需要一个辅助函数 ok(s, t) 来判断两个字符串的汉明距离是否为 1，即长度相同且恰好有一个字符不同。遍历每个单词，对于每个单词 i，再遍历所有可能的下一个单词 j (i < j)，检查是否满足 groups 值不同以及汉明距离为 1 的条件。如果满足，更新 f[i]。遍历完成后，通过 f 数组找到最长子序列的起始索引，然后使用 from_idx 数组重建整个子序列。

时间复杂度:

O(n^2 * L)

空间复杂度:

O(n)

代码细节讲解

🦆

题解中提到的动态规划数组f[i]是如何初始化的，以及为什么将其所有元素初值设为0？

▷

在动态规划中，数组 f[i] 用来记录从索引 i 开始的最长子序列的长度。初始化所有元素为 0 是因为在开始时，我们假设没有任何子序列可以从每个单词开始，即每个单词至少可以构成长度为1的子序列。在遍历过程中，f[i] 将根据条件逐渐更新，而初始化为 0 是一个安全的底线，确保在后续逻辑中，任何有效的子序列都会比初始值大，从而被正确更新。

🦆

在动态规划的遍历过程中，为什么选择从后向前遍历数组来更新f[i]的值？

▷

从后向前遍历数组来更新 f[i] 的值能确保当更新 f[i] 时，所有可能的后续元素 f[j] (对于所有 j > i) 已经被计算过，因此可以直接使用它们的值来决定当前 f[i] 的最优值。这种从后向前的顺序是动态规划中常用的技巧，可以确保每次更新依赖的状态已经事先计算并存储好，从而有效地避免重复计算和潜在的错误。

🦆

动态规划解法中，from_idx数组的作用是什么，它如何帮助重建最长子序列？

▷

在动态规划中，from_idx 数组用来记录每个位置 i 在构建最长子序列时选择的下一个词的索引 j。这样，在最终确定了最长子序列的起始点后，可以通过 from_idx 数组来追溯整个子序列的构成。具体地，从记录的起始点开始，通过连续访问 from_idx 指示的下一个索引，直至构建出整个序列。这种方式允许我们在不重新计算的情况下，直接通过记录的路径重建整个子序列。

🦆

题解中的ok函数用来计算汉明距离，但在实现中是否考虑了字符串长度不等的情况？

▷

是的，在 ok 函数的实现中首先检查了两个字符串 s 和 t 的长度是否相等。如果长度不相等，则直接返回 False，因为长度不同的字符串之间的汉明距离是未定义的。这个长度检查确保了只有长度相同的字符串才会进一步检验它们之间的字符差异，从而计算出恰好有一个字符不同的情况，即汉明距离为 1。

最长相邻不相等子序列 II

题目描述

代码结果

解释

代码细节讲解

相关问题