[Stratascratch - HARD] Viewers Turned Streamers (GROUP BY없이 집계함수 사용하기, 제일 처음⇒ ROW_NUMBER?)

Twitch Interview Question - Viewers Turned Streamers

Problem

  • Difficulty level : Hard (다른 Hard level대비 아주 간단한 문제)

Find the number of streamer sessions by users who had their first session as a viewer.
Return the user id and number of sessions in descending order.
In case there are users with the same number of sessions, order them by ascending user id.

Table : twitch_sessions

user_id int
session_start datetime
session_end datetime
session_id int
session_type varchar

My Solution

My Logic

  1. 유저별 첫 세션 정보 도 포함한 CTE를 만든다.
    • ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY session_start ASC) = 1으로 유저별 첫 세션이 무엇인 지를 알 수 있는 변수 first_session를 만든다.
  2. population에서 첫 세션이 viewer session 인 유저로 필터링한다.
    • user_id IN ( SELECT user_id FROM population WHERE first_session = "viewer") WHERE 조건절에 서브쿼리를 사용한다.
  3. streamer session의 수를 집계한다.

CTE & SUBQUERY

/*
SELECT user_id, number_of_sessions 
-- ORDER BY number of sessions in descending & user id in ascending
ORDER BY 2 DESC, 1 -- ASC
*/

WITH population AS
(
    SELECT
        user_id,
        session_id,
        session_type,
        CASE 
            -- TO GET THE FIRST SESSION TYPE ORDER BY SESSION_START IN ASCENDING
            WHEN ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY session_start ASC) = 1 
            THEN session_type 
            END 
            AS `first_session`
    FROM twitch_sessions
)
SELECT 
    user_id,
    -- count the streamer sessions by users - GROUP BY user_id
    count(session_id) OVER (PARTITION BY user_id) AS number_of_sessions
FROM  population


WHERE 
    -- user_id WHOSE FIRST SESSION IS "VIEWER"
    user_id IN ( SELECT user_id FROM population WHERE first_session = "viewer")
    AND
    -- TO GET THE NUMBER OF STEAMER SESSIONS
    session_type = "streamer"

-- SORT THE RESULT 
ORDER BY 2 DESC, 1

Lessons learned

( 쿼리 가독성을 이유로 CTE를 선호하다 보니 코드가 불필요하게 길어질 때가 있는데, 간단하다면 서브쿼리도 적절히 사용하자. )

✨ 집계함수 + GROUP BY ⇒ GROUP BY 없이 집계함수 사용하기

집계함수 + GROUP BY

SELECT 
    user_id,
    -- count the streamer sessions by users - GROUP BY user_id
    count(session_id) AS number_of_sessions
FROM  population
...
-- TO AGGREGATE THE NUMBER OF SESSIONS
GROUP BY 1

GROUP BY 없이 집계함수 사용하기

SELECT 
    user_id,
    -- count the streamer sessions by users - GROUP BY user_id
    count(session_id) OVER (PARTITION BY user_id) AS number_of_sessions
FROM  population