字符串中的额外字符

难度:

标签:

题目描述

You are given a 0-indexed string s and a dictionary of words dictionary. You have to break s into one or more non-overlapping substrings such that each substring is present in dictionary. There may be some extra characters in s which are not present in any of the substrings.

Return the minimum number of extra characters left over if you break up s optimally.

Example 1:

Input: s = "leetscode", dictionary = ["leet","code","leetcode"]
Output: 1
Explanation: We can break s in two substrings: "leet" from index 0 to 3 and "code" from index 5 to 8. There is only 1 unused character (at index 4), so we return 1.

Example 2:

Input: s = "sayhelloworld", dictionary = ["hello","world"]
Output: 3
Explanation: We can break s in two substrings: "hello" from index 3 to 7 and "world" from index 8 to 12. The characters at indices 0, 1, 2 are not used in any substring and thus are considered as extra characters. Hence, we return 3.

Constraints:

1 <= s.length <= 50
1 <= dictionary.length <= 50
1 <= dictionary[i].length <= 50
dictionary[i] and s consists of only lowercase English letters
dictionary contains distinct words

代码结果

运行时间: 52 ms, 内存: 16.6 MB

/*
 * 思路：
 * 与传统的Java解决方案类似，我们依然使用动态规划来解决问题。
 * 不同的是，我们将利用Java Stream API来简化代码的编写，特别是在更新dp数组的部分。
 */
import java.util.*;
import java.util.stream.*;

public class MinRemainingCharsStream {
    public int minRemainingChars(String s, List<String> dictionary) {
        int n = s.length();
        Set<String> dict = new HashSet<>(dictionary);
        int[] dp = new int[n + 1];
        Arrays.fill(dp, n);
        dp[0] = 0;

        IntStream.rangeClosed(1, n).forEach(i -> {
            IntStream.range(0, i).forEach(j -> {
                if (dict.contains(s.substring(j, i))) {
                    dp[i] = Math.min(dp[i], dp[j]);
                }
            });
            dp[i] = Math.min(dp[i], dp[i - 1] + 1);
        });

        return dp[n];
    }

    public static void main(String[] args) {
        MinRemainingCharsStream solution = new MinRemainingCharsStream();
        List<String> dictionary = Arrays.asList("leet", "code", "leetcode");
        String s = "leetscode";
        System.out.println(solution.minRemainingChars(s, dictionary)); // 输出1
    }
}

解释

方法:

题解采用动态规划的方法来解决字符串分割问题。首先，创建一个字典 dic，将每个单词按其最后一个字符分类存储。使用动态规划数组 dp，其中 dp[i] 表示字符串 s 的前 i 个字符中未被使用的最小字符数量。遍历字符串 s 的每个字符，并检查是否有字典中的单词以该字符结束，如果是，则尝试匹配整个单词。如果匹配成功，更新 dp 数组以尝试减少未使用的字符数量。最终，dp 数组的最后一个元素 dp[n]（其中 n 是字符串 s 的长度）给出了分割后剩余的最少字符数。

时间复杂度:

O(n * m)

空间复杂度:

O(n)

代码细节讲解

🦆

在此题解中，为何选择使用字典将单词按照最后一个字符分类存储？这种方法在解决问题时有什么特别的优势？

▷

在此题解中，字典的使用主要是为了快速定位和匹配以特定字符结束的单词。当处理字符串s中的每一个字符时，可以直接通过这个字符找到所有可能以此字符为结尾的单词，然后尝试匹配。这种方法减少了不必要的比对，提高了效率，使得只有在有潜在匹配可能时才进行字符串的比对，从而优化了算法的执行时间。

🦆

动态规划数组 dp[i] 的定义是表示字符串 s 的前 i 个字符中未被使用的最小字符数量，能否详细解释如何通过前一个状态 dp[i-1] 来更新当前状态 dp[i]？

▷

动态规划数组dp的每个位置i代表的是从字符串s的开头到第i个字符最少的未使用字符数量。更新dp[i]时，首先将dp[i-1]的值加1，假设第i个字符未被任何单词使用。随后检查是否有以s[i]结尾的单词，如果找到，则尝试从i减去该单词长度的位置开始匹配整个单词。如果匹配成功，比较dp[i]（考虑当前字符未使用的情况）和dp[k]（匹配单词后的位置）的值，取较小的那个，以确保dp[i]表示的是最小的未使用字符数量。

🦆

题解中提到，尝试匹配字典中的单词时，只有当单词完全匹配时才更新 dp 数组。请问如果单词部分匹配该如何处理？是否有必要记录部分匹配的状态？

▷

题解中仅在单词完全匹配时更新dp数组，因为只有完全匹配的单词才能确认从匹配点到当前字符间没有未使用的字符。如果单词只是部分匹配，那么这部分匹配不能帮助减少未使用的字符数量，因此没有必要记录部分匹配的状态。记录部分匹配可能引入复杂度并且不会对解题产生帮助，因为它不改变任何dp数组的值。

🦆

在更新 dp 数组时，你选择了一个条件 `if dp[-1] > dp[k]` 来决定是否更新 dp 的当前值。这种条件判断的逻辑依据是什么？

▷

这种条件判断的逻辑依据是寻找使得未使用字符数量最小化的可能。dp[k]表示在匹配到当前检查的单词之前的最小未使用字符数量。如果dp[k]的值加上从k到当前字符的匹配（假设是0，因为假定单词完全匹配了）小于仅将当前字符视为未使用的情况（dp[-1]），那么更新dp[-1]为dp[k]会使得未使用字符数量达到当前可知的最小值。这保证了dp数组确实表示每个位置可能的最小未使用字符数。

字符串中的额外字符

题目描述

代码结果

解释

代码细节讲解

相关问题