[Stratascratch-HARD] Activity Rank (Ranking window functions)

Stratascratch Google Interview Question - Activity Rank

  • Difficulty level : Hard
  • SQL window functions for time series analysis

Table : google_gmail_emails

id int
from_user varchar
to_user varchar
day int
  • Email activity rank = total number of emails sent
  • Output Columns : from_user | total_emails | activity_rank
  • ORDER BY total_emails DESC, from_user -- ASC
  • unique rank value → RANK()

My Solution

  1. Get a total_emails by user
  2. Get the rank - 2 ordering steps : total_emails DESC, from_user ASC
  3. Sort the result : total_emails DESC, from_user ASC
SELECT 
    from_user,

    -- the number of emails by from_user with GROUP BY
    count(*) AS total_emails,
    -- rank window function 
    -- two disctinct ordering steps : total_emails DESC, from_user ASC
    RANK() OVER (ORDER BY count(*) DESC, from_user) as `activity_rank`

FROM google_gmail_emails
-- aggregate the number of emails by from_user
GROUP BY 1

-- sort the result
ORDER BY 2 DESC, 1