用户网站访问行为分析
难度:
标签:
题目描述
代码结果
运行时间: 38 ms, 内存: 17.2 MB
// Leetcode Problem 1152: Analyze User Website Visit Pattern using Java Streams
// The task is to find the most visited pattern of length 3 (3 websites) by any user.
// If there are multiple patterns with the same highest frequency, return the lexicographically smallest one.
import java.util.*;
import java.util.stream.Collectors;
public class Solution {
public List<String> mostVisitedPattern(String[] username, int[] timestamp, String[] website) {
// Step 1: Combine the inputs into a list of Visit objects
List<Visit> visits = IntStream.range(0, username.length)
.mapToObj(i -> new Visit(username[i], timestamp[i], website[i]))
.sorted(Comparator.comparingInt(v -> v.timestamp))
.collect(Collectors.toList());
// Step 2: Create a map from user to their list of websites
Map<String, List<String>> userWebsites = visits.stream()
.collect(Collectors.groupingBy(v -> v.username, Collectors.mapping(v -> v.website, Collectors.toList())));
// Step 3: Count all 3-sequence patterns for each user
Map<List<String>, Integer> patternCount = new HashMap<>();
userWebsites.values().forEach(websites -> {
if (websites.size() < 3) return;
Set<List<String>> seen = new HashSet<>();
for (int i = 0; i < websites.size() - 2; i++) {
for (int j = i + 1; j < websites.size() - 1; j++) {
for (int k = j + 1; k < websites.size(); k++) {
List<String> pattern = Arrays.asList(websites.get(i), websites.get(j), websites.get(k));
if (seen.add(pattern)) {
patternCount.put(pattern, patternCount.getOrDefault(pattern, 0) + 1);
}
}
}
}
});
// Step 4: Find the most frequent pattern
return patternCount.entrySet().stream()
.max(Map.Entry.<List<String>, Integer>comparingByValue()
.thenComparing(Map.Entry::getKey, (a, b) -> a.toString().compareTo(b.toString())))
.map(Map.Entry::getKey)
.orElse(Collections.emptyList());
}
class Visit {
String username;
int timestamp;
String website;
Visit(String u, int t, String w) {
username = u;
timestamp = t;
website = w;
}
}
}
解释
方法:
这个题解的主要思路是首先根据用户、时间戳和网站信息组成一个以用户为键的字典,其中每个用户对应一个按时间排序的网站访问的列表。之后,对每个用户的访问记录生成所有可能的三元组访问模式,并用一个字典记录这些模式及其对应的不同用户集合。最后,遍历这个模式字典,找出被最多用户访问过的模式。如果有多个模式被相同数量的最多用户访问,则返回字典序最小的模式。
时间复杂度:
O(U*n^3 + k*log k)
空间复杂度:
O(U*n + k*U)
代码细节讲解
🦆
如果一个用户的访问记录少于3条,题解中提到直接返回不处理,这是否意味着这些用户的数据不会被考虑在内?这样做的合理性是什么?
▷🦆
您提到在寻找被最多用户访问的模式时,如果有多个模式被相同数量的最多用户访问,则返回字典序最小的模式。如何确保字典序比较的准确性和效率?
▷