最长公共后缀查询

难度:

标签:

题目描述

You are given two arrays of strings wordsContainer and wordsQuery.

For each wordsQuery[i], you need to find a string from wordsContainer that has the longest common suffix with wordsQuery[i]. If there are two or more strings in wordsContainer that share the longest common suffix, find the string that is the smallest in length. If there are two or more such strings that have the same smallest length, find the one that occurred earlier in wordsContainer.

Return an array of integers ans, where ans[i] is the index of the string in wordsContainer that has the longest common suffix with wordsQuery[i].

Example 1:

Input: wordsContainer = ["abcd","bcd","xbcd"], wordsQuery = ["cd","bcd","xyz"]

Output: [1,1,1]

Explanation:

Let's look at each wordsQuery[i] separately:

For wordsQuery[0] = "cd", strings from wordsContainer that share the longest common suffix "cd" are at indices 0, 1, and 2. Among these, the answer is the string at index 1 because it has the shortest length of 3.
For wordsQuery[1] = "bcd", strings from wordsContainer that share the longest common suffix "bcd" are at indices 0, 1, and 2. Among these, the answer is the string at index 1 because it has the shortest length of 3.
For wordsQuery[2] = "xyz", there is no string from wordsContainer that shares a common suffix. Hence the longest common suffix is "", that is shared with strings at index 0, 1, and 2. Among these, the answer is the string at index 1 because it has the shortest length of 3.

Example 2:

Input: wordsContainer = ["abcdefgh","poiuygh","ghghgh"], wordsQuery = ["gh","acbfgh","acbfegh"]

Output: [2,0,2]

Explanation:

Let's look at each wordsQuery[i] separately:

For wordsQuery[0] = "gh", strings from wordsContainer that share the longest common suffix "gh" are at indices 0, 1, and 2. Among these, the answer is the string at index 2 because it has the shortest length of 6.
For wordsQuery[1] = "acbfgh", only the string at index 0 shares the longest common suffix "fgh". Hence it is the answer, even though the string at index 2 is shorter.
For wordsQuery[2] = "acbfegh", strings from wordsContainer that share the longest common suffix "gh" are at indices 0, 1, and 2. Among these, the answer is the string at index 2 because it has the shortest length of 6.

Constraints:

1 <= wordsContainer.length, wordsQuery.length <= 10⁴
1 <= wordsContainer[i].length <= 5 * 10³
1 <= wordsQuery[i].length <= 5 * 10³
wordsContainer[i] consists only of lowercase English letters.
wordsQuery[i] consists only of lowercase English letters.
Sum of wordsContainer[i].length is at most 5 * 10⁵.
Sum of wordsQuery[i].length is at most 5 * 10⁵.

代码结果

运行时间: 179 ms, 内存: 28.7 MB

/*
 * 题目思路：
 * 对于每个wordsQuery[i]，需要找到wordsContainer中与其有最长公共后缀的字符串。
 * 如果有多个字符串有相同长度的最长公共后缀，则选择长度最短的字符串。
 * 如果仍有多个字符串满足条件，则选择wordsContainer中最早出现的字符串。
 */
import java.util.Arrays;
import java.util.Comparator;

public class Solution {
    public int[] findLongestCommonSuffix(String[] wordsContainer, String[] wordsQuery) {
        return Arrays.stream(wordsQuery)
            .mapToInt(query -> {
                return IntStream.range(0, wordsContainer.length)
                    .boxed()
                    .max(Comparator.comparingInt((Integer index) -> getCommonSuffixLength(wordsContainer[index], query))
                            .thenComparingInt(index -> -wordsContainer[index].length())
                            .thenComparingInt(index -> -index))
                    .orElse(-1);
            }).toArray();
    }

    private int getCommonSuffixLength(String s1, String s2) {
        int len1 = s1.length();
        int len2 = s2.length();
        int len = 0;
        while (len < len1 && len < len2 && s1.charAt(len1 - len - 1) == s2.charAt(len2 - len - 1)) {
            len++;
        }
        return len;
    }
}

解释

方法:

此题解使用了基于字符匹配的分治递归策略，通过Trie（字典树）的思想来解决问题。首先，将wordsContainer和wordsQuery中的所有字符串逆序，目的是将最长公共后缀问题转换为字符串前缀匹配问题，这样可以利用Trie树的性质进行处理。在递归函数f中，对于当前层的字符索引zidx，将容器wordsContainer中的字符串根据当前字符分组，相同的字符分到同一个组中，没有当前字符的字符串（即短于zidx的字符串）放入一个单独的列表中。对于每一个查询字符串，根据其当前字符进行匹配：如果匹配到相应的字符组，则对这个组递归处理；如果没有匹配到，则在短字符串列表中选择一个（如果存在）。递归终止的条件是当只剩一个字符串需要处理时，直接将其分配给所有对应的查询。此策略有效地将大问题分解为小问题，递归地解决每一层的匹配问题。

时间复杂度:

O(n * l + m * l)

空间复杂度:

O(n * l + m * l)

代码细节讲解

🦆

为什么在处理最长公共后缀问题时选择将所有字符串逆序，并转换为前缀匹配问题？

▷

将字符串逆序是为了将最长公共后缀问题转化为更易处理的最长公共前缀问题。在数据结构中，字典树（Trie树）是处理字符串前缀匹配问题的常用方式，其可以高效地通过逐字符比较来搜索和匹配字符串的公共前缀。通过逆序，我们可以重新利用字典树的这一特性来处理后缀匹配问题，从而使得算法更加直观和简洁。

🦆

在字典树Trie的递归分治解法中，如何确保如果有多个字符串满足最长公共后缀，返回的是长度最短的字符串？

▷

在实现中，为了确保在有多个字符串满足条件时返回最短的字符串，算法首先会根据字符串长度和索引对wordsContainer中的字符串进行排序。这样，在递归函数f中，当出现无法继续匹配当前字符的情况时，优先选择的是列表lst中的第一个元素，即最短的字符串。这一设计确保了在满足最长公共后缀的字符串中，返回的是长度最短的一个。

🦆

递归函数f中，当无法匹配当前字符时选择短字符串列表中的第一个字符串作为结果，是否有可能导致没有选到实际具有最长公共后缀的字符串？

▷

是的，这种情况有可能发生。在递归函数f中，当当前字符无法匹配时，算法会选择短字符串列表lst中的第一个字符串作为结果。这可能不是实际具有最长公共后缀的字符串，尤其是当短字符串列表中的字符串并不包含其他更长字符串中的公共后缀时。这种方法在某些情况下可能会牺牲结果的准确性，以换取算法的简化和执行效率。

最长公共后缀查询

题目描述

代码结果

解释

代码细节讲解

相关问题