单词距离

难度:

标签:

题目描述

You have a large text file containing words. Given any two different words, find the shortest distance (in terms of number of words) between them in the file. If the operation will be repeated many times for the same file (but different pairs of words), can you optimize your solution?

Example:

Input: words = ["I","am","a","student","from","a","university","in","a","city"], word1 = "a", word2 = "student"
Output: 1

Note:

words.length <= 100000

代码结果

运行时间: 54 ms, 内存: 32.6 MB

/*
 * 思路：
 * 1. 使用流遍历单词数组，记录两个单词出现的位置。
 * 2. 使用一个列表来存储两个单词的位置。
 * 3. 使用流计算所有位置对的距离，并找出最小距离。
 */
import java.util.*;
import java.util.stream.*;

public class Solution {
    public int shortestDistance(String[] words, String word1, String word2) {
        List<Integer> positions1 = IntStream.range(0, words.length)
                                             .filter(i -> words[i].equals(word1))
                                             .boxed()
                                             .collect(Collectors.toList());
        List<Integer> positions2 = IntStream.range(0, words.length)
                                             .filter(i -> words[i].equals(word2))
                                             .boxed()
                                             .collect(Collectors.toList());
        return positions1.stream()
                         .flatMapToInt(p1 -> positions2.stream().mapToInt(p2 -> Math.abs(p1 - p2)))
                         .min()
                         .orElse(Integer.MAX_VALUE);
    }
}

解释

方法:

题解使用了双指针技术来找出两个指定单词的最短距离。首先，初始化两个指针 pre1 和 pre2 为一个非常大的数（题中使用了100000），这两个指针将分别记录最近一次遇到 word1 和 word2 的索引位置。然后遍历整个单词列表，每当找到一个 word1 或 word2，就更新相应的指针，并计算与另一个单词上次出现位置的距离，取所有这些距离的最小值作为结果。

时间复杂度:

O(n)

空间复杂度:

O(1)

代码细节讲解

🦆

为什么在初始化 pre1 和 pre2 时选择了 100000 这个值？这个值是否与输入的 words 数组长度有关？

▷

在初始化 pre1 和 pre2 为 100000 是为了设置一个足够大的初始值，这样即使第一次遇到 word1 或 word2，计算与另一个单词的距离时不会得到一个不合理的小值。一般情况下，这个值应该大于或等于 words 数组的最大可能长度，以确保它在任何合理的输入长度上都不会被误用为一个有效的索引。此选择不一定严格与 words 数组的实际长度相关，但应大于任何预期的数组长度，以避免任何潜在的计算错误。

🦆

如果在遍历过程中，遇到的第一个单词是 word2 而不是 word1，此时计算的距离是否准确？

▷

在这种情况下，计算的距离依然是准确的。因为无论是 word1 还是 word2 作为遇到的第一个单词，只要更新了对应的 pre1 或 pre2 指针，距离的计算始终是基于最后一次遇到这两个单词的索引进行的。尽管初始化的 pre1 和 pre2 都非常大，但在任何一个单词被找到后，相应的索引就会被更新，因此计算出的距离将反映真实情况。

🦆

在解决方案中，提到了如何优化多次查询的情况，具体是如何实现的？

▷

实际的题解代码中并没有直接提到关于多次查询优化的具体细节。然而，如果需要优化多次查询的情况，可以预处理 words 数组，创建两个字典或列表来存储每个单词出现的所有索引。这样，每次查询时，而不是重新遍历整个数组，我们可以直接访问这些预存的索引，使用更高效的方法（如双指针在两个排序列表上操作）来找到最小距离。这种预处理可以显著减少每次查询的时间复杂度，特别是在 words 数组很大或查询次数很多的场景下。

单词距离

题目描述

代码结果

解释

代码细节讲解

相关问题