[Stratascratch - HARD] Best Selling Item (GROUP BY 없이 월별 합계구하기, 중복 레코드 제거하기)

Amazon Interview Question

Problem

  • Difficulty level : Hard (Hard level대비 쉬운 문제)

Interview Question Date: July 2020

  • Find the best selling item for each month where the biggest total invoice was paid. (no need to separate months by year)
  • The best selling item is calculated using the formula (unitprice * quantity).
  • Output the description of the item along with the amount paid.

Table : online_retail

invoiceno varchar
stockcode varchar
description varchar
quantity int
invoicedate datetime
unitprice float
customerid float
country varchar

My Solution

  1. 월별&상품별 total_amount 합계를 구해 CTE1을 만든다.
  2. CTE1total_amount 기준으로 월별&상품별 랭킹 rk을 포함한 CTE2 를 만든다.
    • RANK() OVER (PARTITION BY month ORDER BY total_amount DESC)
      ( RANK() vs RANK_DENSE() 선택 기준이 문제에 제공되어 있지 않았기 때문에 금액이 동일하면 동일한 순위로 보는 것으로 윈도우함수 RANK() 사용 )
    • PARTITION BY와 별도로 GROUP BY month, description를 사용해 DUPLICATES을 제거한다.
  3. CTE2에서 rk=1인 레코드를 출력한다.
WITH CTE1 AS (
    SELECT
        description,

        -- EXTRACT MONTH FROM DATE COLUMN
        EXTRACT(MONTH FROM invoicedate) AS `month`,

        -- best selling item formula : (unitprice * quantity) = `total_amount`
        -- AGGREGATE - MONTHLY 
        SUM(unitprice*quantity) OVER (PARTITION BY description, EXTRACT(MONTH FROM invoicedate)) AS `total_amount`
    FROM online_retail
)

, CTE2 AS (
    SELECT 
        -- REMOVAL OF DUPLICATES
        DISTINCT month,
        description,
        total_amount,

        -- the biggest total invoice was paid for each month 
        RANK() OVER (PARTITION BY month ORDER BY total_amount DESC) AS `rk`
    FROM CTE1
)

-- description along with the amount paid.
SELECT 
    month,
    description, 
    total_amount
FROM CTE2

-- best selling item for each month 
WHERE rk = 1

Lessons learned

REMOVAL OF DUPLICATES

Eliminating Duplicate Rows from MySQL Result Sets | Database Journal