用户网站访问行为分析

难度:

标签:

题目描述

代码结果

运行时间: 38 ms, 内存: 17.2 MB

// Leetcode Problem 1152: Analyze User Website Visit Pattern using Java Streams
// The task is to find the most visited pattern of length 3 (3 websites) by any user.
// If there are multiple patterns with the same highest frequency, return the lexicographically smallest one.

import java.util.*;
import java.util.stream.Collectors;

public class Solution {
    public List<String> mostVisitedPattern(String[] username, int[] timestamp, String[] website) {
        // Step 1: Combine the inputs into a list of Visit objects
        List<Visit> visits = IntStream.range(0, username.length)
                                      .mapToObj(i -> new Visit(username[i], timestamp[i], website[i]))
                                      .sorted(Comparator.comparingInt(v -> v.timestamp))
                                      .collect(Collectors.toList());

        // Step 2: Create a map from user to their list of websites
        Map<String, List<String>> userWebsites = visits.stream()
                                                       .collect(Collectors.groupingBy(v -> v.username, Collectors.mapping(v -> v.website, Collectors.toList())));

        // Step 3: Count all 3-sequence patterns for each user
        Map<List<String>, Integer> patternCount = new HashMap<>();
        userWebsites.values().forEach(websites -> {
            if (websites.size() < 3) return;
            Set<List<String>> seen = new HashSet<>();
            for (int i = 0; i < websites.size() - 2; i++) {
                for (int j = i + 1; j < websites.size() - 1; j++) {
                    for (int k = j + 1; k < websites.size(); k++) {
                        List<String> pattern = Arrays.asList(websites.get(i), websites.get(j), websites.get(k));
                        if (seen.add(pattern)) {
                            patternCount.put(pattern, patternCount.getOrDefault(pattern, 0) + 1);
                        }
                    }
                }
            }
        });

        // Step 4: Find the most frequent pattern
        return patternCount.entrySet().stream()
                           .max(Map.Entry.<List<String>, Integer>comparingByValue()
                           .thenComparing(Map.Entry::getKey, (a, b) -> a.toString().compareTo(b.toString())))
                           .map(Map.Entry::getKey)
                           .orElse(Collections.emptyList());
    }

    class Visit {
        String username;
        int timestamp;
        String website;

        Visit(String u, int t, String w) {
            username = u;
            timestamp = t;
            website = w;
        }
    }
}

解释

方法:

这个题解的主要思路是首先根据用户、时间戳和网站信息组成一个以用户为键的字典，其中每个用户对应一个按时间排序的网站访问的列表。之后，对每个用户的访问记录生成所有可能的三元组访问模式，并用一个字典记录这些模式及其对应的不同用户集合。最后，遍历这个模式字典，找出被最多用户访问过的模式。如果有多个模式被相同数量的最多用户访问，则返回字典序最小的模式。

时间复杂度:

O(U*n^3 + k*log k)

空间复杂度:

O(U*n + k*U)

代码细节讲解

🦆

如果一个用户的访问记录少于3条，题解中提到直接返回不处理，这是否意味着这些用户的数据不会被考虑在内？这样做的合理性是什么？

▷

是的，如果一个用户的访问记录少于3条，这意味着他们的数据不会被考虑在内来形成三元组模式。这样做的合理性在于，生成三元组需要至少三个网站访问记录，否则无法形成有效的三元组序列。此外，忽略这些数据可以减少计算量和简化问题的处理，因为包含少于三个访问记录的用户无法对寻找最常访问模式的任务提供贡献。

🦆

您提到在寻找被最多用户访问的模式时，如果有多个模式被相同数量的最多用户访问，则返回字典序最小的模式。如何确保字典序比较的准确性和效率？

▷

在Python中，字典序比较可以直接通过元组比较实现，因为元组的比较是按照字典序进行的，首先比较元组第一个元素，如果相同则比较下一个元素，以此类推。这种比较方法是内建的，非常高效。在题解中，我们保持每个三元组模式为一个元组形式，这样可以直接利用Python的元组比较操作来确保字典序的准确性和比较效率。此方法不需要额外的数据结构或复杂的逻辑，保证了操作的简洁性和执行速度。

用户网站访问行为分析

题目描述

代码结果

解释

代码细节讲解

相关问题