统计完全子字符串

难度:

标签:

题目描述

You are given a string word and an integer k.

A substring s of word is complete if:

Each character in s occurs exactly k times.
The difference between two adjacent characters is at most 2. That is, for any two adjacent characters c1 and c2 in s, the absolute difference in their positions in the alphabet is at most 2.

Return the number of complete substrings of word.

A substring is a non-empty contiguous sequence of characters in a string.

Example 1:

Input: word = "igigee", k = 2
Output: 3
Explanation: The complete substrings where each character appears exactly twice and the difference between adjacent characters is at most 2 are: igigee, igigee, igigee.

Example 2:

Input: word = "aaabbbccc", k = 3
Output: 6
Explanation: The complete substrings where each character appears exactly three times and the difference between adjacent characters is at most 2 are: aaabbbccc, aaabbbccc, aaabbbccc, aaabbbccc, aaabbbccc, aaabbbccc.

Constraints:

1 <= word.length <= 10⁵
word consists only of lowercase English letters.
1 <= k <= word.length

代码结果

运行时间: 665 ms, 内存: 21.3 MB

/*
 * Problem: Find the number of fully matching substrings in the given string `word` where each character appears exactly `k` times,
 * and adjacent characters in the substring differ in their alphabetical positions by at most 2.
 *
 * Approach:
 * 1. Generate all substrings using IntStream.
 * 2. Use a frequency array to count the occurrences of each character in the substring.
 * 3. Check if each character in the substring appears exactly `k` times.
 * 4. Verify that the difference in the positions of adjacent characters is at most 2.
 * 5. Count the valid substrings.
 */
import java.util.stream.IntStream;

public int countFullyMatchingSubstringsStream(String word, int k) {
    int n = word.length();
    return (int) IntStream.range(0, n).flatMap(i -> IntStream.range(i + 1, n + 1)
            .mapToObj(j -> word.substring(i, j)))
            .filter(s -> {
                int[] freq = new int[26];
                s.chars().forEach(c -> freq[c - 'a']++);
                if (IntStream.of(freq).anyMatch(count -> count != 0 && count != k))
                    return false;
                return IntStream.range(1, s.length()).allMatch(i -> Math.abs(s.charAt(i) - s.charAt(i - 1)) <= 2);
            }).count();
}

解释

方法:

本题解采用了双指针法和哈希表来统计完全子字符串。首先，初始化一个长度为26的列表char_positions，用于存储每个字符在word中的出现位置。接着，遍历word中的每个字符，更新char_positions，并检查当前字符是否与前一个字符在字母表中的顺序相差超过2，如果是，则更新切割点cut。然后，检查当前字符出现的次数是否达到k次，如果达到，则调用check函数检查从pre（当前字符第k次出现的位置）到i（当前位置）的子字符串是否满足每个字符恰好出现k次的条件。如果满足，则增加答案计数器ans，并更新pre的位置，继续检查。最后，返回答案计数器ans。

时间复杂度:

O(n^2)

空间复杂度:

O(n)

代码细节讲解

🦆

在函数`check`中，为什么要单独处理当前字符出现次数不足k或超过k的情况，这与算法的哪些核心逻辑相关？

▷

在`check`函数中，处理当前字符出现次数不足k或超过k的情况是为了确保每个字符在考虑的子字符串中正好出现k次。这是算法的核心要求之一，因为我们的目标是找到所有的完全子字符串，即每个字符都恰好出现k次的子字符串。如果某个字符出现次数不足k，则该子字符串显然不满足条件。如果出现次数超过k，我们需要进一步检查这些额外的出现是否影响子字符串的边界，即可能需要调整子字符串的起始位置。这些逻辑确保了每次计数时，子字符串严格符合完全子字符串的定义。

🦆

请问为什么需要在相邻字符的字母表顺序相差超过2时更新切割点cut，这样做的目的是什么？

▷

在相邻字符的字母表顺序相差超过2时更新切割点cut的目的是为了优化算法性能，并减少不必要的检查。这种更新基于一个观察：如果两个连续字符在字母表中的位置差距较大，这很可能意味着它们在字符串中不会形成一个频繁出现的模式，因此可以从新的位置开始考虑新的完全子字符串的开始。这样做可以避免在不可能形成完全子字符串的区间内进行不必要的检查，从而提高整体的算法效率。

🦆

在`countCompleteSubstrings`函数中，如何判断一个子字符串从pre到i是否满足每个字符恰好出现k次，具体是通过哪些步骤实现的？

▷

在`countCompleteSubstrings`函数中，判断一个子字符串从pre到i是否每个字符恰好出现k次的判断是通过调用`check`函数实现的。具体步骤如下： 1. 从给定的起点pre开始，逐个检查26个字母的出现情况。 2. 对于每个字符，检查其在`char_positions`列表中记录的位置。如果该字符在考虑的范围内[left, right]出现次数不足k或超过k，则直接返回-1，表示该子字符串不满足条件。 3. 如果字符出现次数恰好为k，检查这些位置是否在当前考虑的子字符串范围内，并可能根据情况调整子字符串的起始位置new_left。 4. 如果所有字符都满足恰好出现k次的条件，最终返回调整后的起始位置new_left，这个位置即是验证通过的新的起始点。如果从任一位置开始的子字符串都不满足条件，则返回-1。

统计完全子字符串

题目描述

代码结果

解释

代码细节讲解

相关问题