Problem
- Difficulty level : Hard (다른 Hard level대비 아주 간단한 문제)
Find the number of streamer sessions by users who had their first session as a viewer.
Return the user id and number of sessions in descending order.
In case there are users with the same number of sessions, order them by ascending user id.
Table : twitch_sessions
user_id | int |
---|---|
session_start | datetime |
session_end | datetime |
session_id | int |
session_type | varchar |
My Solution
My Logic
- 유저별 첫 세션 정보 도 포함한 CTE를 만든다.
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY session_start ASC) = 1
으로 유저별 첫 세션이 무엇인 지를 알 수 있는 변수first_session
를 만든다.population
에서 첫 세션이 viewer session 인 유저로 필터링한다.
user_id IN ( SELECT user_id FROM population WHERE first_session = "viewer")
WHERE 조건절에 서브쿼리를 사용한다.- streamer session의 수를 집계한다.
CTE & SUBQUERY
/*
SELECT user_id, number_of_sessions
-- ORDER BY number of sessions in descending & user id in ascending
ORDER BY 2 DESC, 1 -- ASC
*/
WITH population AS
(
SELECT
user_id,
session_id,
session_type,
CASE
-- TO GET THE FIRST SESSION TYPE ORDER BY SESSION_START IN ASCENDING
WHEN ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY session_start ASC) = 1
THEN session_type
END
AS `first_session`
FROM twitch_sessions
)
SELECT
user_id,
-- count the streamer sessions by users - GROUP BY user_id
count(session_id) OVER (PARTITION BY user_id) AS number_of_sessions
FROM population
WHERE
-- user_id WHOSE FIRST SESSION IS "VIEWER"
user_id IN ( SELECT user_id FROM population WHERE first_session = "viewer")
AND
-- TO GET THE NUMBER OF STEAMER SESSIONS
session_type = "streamer"
-- SORT THE RESULT
ORDER BY 2 DESC, 1
Lessons learned
( 쿼리 가독성을 이유로 CTE를 선호하다 보니 코드가 불필요하게 길어질 때가 있는데, 간단하다면 서브쿼리도 적절히 사용하자. )
✨ 집계함수 + GROUP BY ⇒ GROUP BY 없이 집계함수 사용하기
집계함수 + GROUP BY
SELECT
user_id,
-- count the streamer sessions by users - GROUP BY user_id
count(session_id) AS number_of_sessions
FROM population
...
-- TO AGGREGATE THE NUMBER OF SESSIONS
GROUP BY 1
GROUP BY 없이 집계함수 사용하기
SELECT
user_id,
-- count the streamer sessions by users - GROUP BY user_id
count(session_id) OVER (PARTITION BY user_id) AS number_of_sessions
FROM population