This is part 2 of our advanced practice SQL interview questions and answers. We highly suggest that you read part 1 of our Advanced SQL interview Questions before reading this, since a lot of the concepts presented in this portion are discussed in more depth in part 1.

The problem is based on the tables presented below where salespeople have orders with certain customers that are in the Customers table.

Salesperson Customer
ID Name Age Salary
1 Abe 61 140000
2 Bob 34 44000
5 Chris 34 40000
7 Dan 41 52000
8 Ken 57 115000
11 Joe 38 38000
ID Name City Industry Type
4 Samsonic pleasant J
6 Panasung oaktown J
7 Samony jackson B
9 Orange Jackson B
Orders
Number order_date cust_id salesperson_id Amount
10 8/2/96 4 2 2400
20 1/30/99 4 8 1800
30 7/14/95 9 1 460
40 1/29/98 7 2 540
50 2/3/98 6 7 600
60 3/2/98 6 7 720
70 5/6/98 9 7 150


In the tables above, each order in the Orders table is associated with a given Customer through the cust_id foreign key column that references the ID column in the Customer table.

Here is the problem: find the largest order amount for each salesperson and the associated order number, along with the customer to whom that order belongs to. You can present your answer in any database’s SQL – MySQL, Microsoft SQL Server, Oracle, etc.

The answer to the problem and explanation


This question seems to be quite simple – but as you will soon find out it is deceptively complex. For each salesperson, all we need to retrieve is the largest order amount, and the associated order number. In order to retrieve that information we should be able to simply do a join between the Orders and Salesperson tables wherever the Salesperson.ID is equal to the Orders.salesperson_id (this would be our join predicate). Then, we could group the results of that join by the Orders.salesperson_id column and retrieve both the highest valued order (by using max(Amount)), and the associated Order number.

Let’s say that we choose to write our answer in MySQL. In MySQL we could legally write some code that looks like this:

SELECT Orders.Number, max(Amount) 
FROM Orders JOIN Salesperson 
ON Salesperson.ID = Orders.salesperson_id 
GROUP BY Orders.salesperson_id

And, if we run that code above it will return this as a result:

Number	      max(Amount)
30	        460
10	        2400
50	        720
20	        1800

The problem with the data returned in MySQL

But, there’s a problem with the results returned from running the SQL above, and it should be fairly obvious once you actually look at the data in the tables we have above. Here is the problem: Order number 50 does not have an amount of “720” – that amount actually belongs to order number 60. So, what is going on here? Why do the results return an order number that is not even in the same row as the max(Amount) of 720? And why are all of the other results correct?

Understanding the group by statement is critical

Well, we will have to explain a bit more about what’s going on with the group by. If you already read Part 1 of the advanced SQL interview questions then you should understand exactly what the problem is with the SQL above, and you can safely skip down to the section that says “New approach to the problem – start with a subquery”. Unless you want to reinforce the concepts presented in part 1, in which case we highly recommend that you read this entire explanation to this rather difficult interview question.

When we group by the salesperson ID, there will be one group created for each and every salesperson ID. So, there will be 6 groups created – 1 for ID of 1, another for ID of 2, and others for ID’s 5, 7, 8, and 11. Inside those groups will be any rows that share the same salesperson ID values.

When we select the max(Amount), MySQL will simply look for the highest value for Amount within each group and return that value. And when we select Orders.Number, MySQL is not going to return every Orders.Number value from each group – it is only going to select one value from each group.

Subscribe to our newsletter for more free interview questions.

Our SQL is not specific enough


But, the question is which order number should be returned from each group? Each group can potentially have more than just one order number as long as there are more than one rows belonging to the group. And that is the exact problem – the SQL that we wrote is not specific enough, and MySQL will just arbitrarily/randomly return one of the values of the Orders.Number within each group. In this case, because order number 50 is part of the group created by the salesperson_id’s of 7, it will return 50. MySQL could just as well have returned order numbers 60 or 70 – the point is that it just randomly chooses one order number from each group. For the group created by salesperson ID of 2, the fact that the order number 10 is chosen (order number 10 corresponds to the largest order amount of 2400) is just pure coincidence – MySQL could have returned us order number 40, which is also a part of the same group as salesperson ID of 2.

Most relational database implementations would have thrown an error if we tried to run the SQL above because the results are potentially arbitrary, as we just illustrated. MySQL is the exception, because it allows us to run the SQL above error-free, but as we illustrated the data returned could potentially not make any sense. Be sure to read Part 1 of the advanced SQL interview questions for more details on why.

Well, now we know that there is definitely an issue with the SQL above, so how can we write a good query that would give us exactly what we want – along with the correct order number?

New approach to the problem – start with a subquery

Now let’s instead just try to break the problem down into more manageable pieces – starting with a simple subquery. Here is a subquery to get the highest valued order for each salesperson:

SELECT salesperson_id, MAX(Amount) AS MaxOrder
FROM Orders
GROUP BY salesperson_id

Running the query above will return this:


salesperson_id MaxOrder
1 460
2 2400
7 720
8 1800

The query above gives us the salesperson_id and that salesperson’s associated highest order amount – but it still does not give us the order number associated with the highest order amount. So, how can we find the order number as well?

Clearly we need to do something else with the subquery we have above that will also give us the correct order number. What are our options? Try to come up with an answer on your own before reading on.

Well, we can do a join with the results of the subquery above. But, on what condition should our join be done and what exactly should we be joining the subquery above with?

What if we join our subquery above with data from the Orders table, where the join is done on the basis that the salesperson_id matches, AND that the value in the Order table’s Amount column is equal to the amount (MaxOrder) returned from the subquery? This way, we can match up the correct Order Number with the correct corresponding value for the maximum Order Amount for a given salesperson_id.

With that in mind, we can write this query:

select salesperson_id, Number as OrderNum, Amount from Orders 
JOIN (  -- this is our subquery from above:
SELECT salesperson_id, MAX(Amount) AS MaxOrder
FROM Orders
GROUP BY salesperson_id
) as TopOrderAmountsPerSalesperson
USING (salesperson_id)
 where Amount = MaxOrder

Running the query above returns us this:

salesperson_id OrderNum Amount
8 20 1800
1 30 460
2 10 2400
7 60 720

How does the query work exactly?

How does the query above work exactly? It’s actually pretty simple. First, the subquery (which is basically a derived table here, named TopOrderAmountsPerSalesperson) returns the orders with the highest dollar amounts per salesperson, and the associated salesperson ID. So, now we have each salesperson’s highest valued order and his/her ID in a derived table. That derived table (the results from the subquery) is then joined with the entire Orders table on the condition that the salesperson ID matches and that the Amount from the Orders table matches the MaxOrder amount returned from the derived table. What’s the point of this? Well, that join will give us the correct OrderNumber since it is matching on both the salesperson ID and the amount. Even if there are 2 rows with the same exact salesperson ID and amount it will not even matter because no matter which ordernumber is associated with that row, the result set will be exactly the same.

And remember that the whole reason we are doing this is to avoid the original problem with not being able to select a non-aggregated column with a group by.

Now, retrieving the salesperson name is simple. Try to figure it out on your own.

Here is how we retrieve the salesperson name – we just use another join with the Salesperson table and select the Name:

SELECT salesperson_id, Name, 
Orders.Number AS OrderNumber, Orders.Amount
FROM Orders
JOIN Salesperson 
ON Salesperson.ID = Orders.salesperson_id
JOIN (
SELECT salesperson_id, MAX( Amount ) AS MaxOrder
FROM Orders
GROUP BY salesperson_id
) AS TopOrderAmountsPerSalesperson
USING ( salesperson_id ) 
WHERE Amount = MaxOrder

Running the query above returns this:

salesperson_id Name OrderNumber Amount
1 Abe 30 460
2 Bob 10 2400
7 Dan 60 720
8 Ken 20 1800

And, finally we have our answer! But one last thing – let’s check for corner cases. What would happen if we add one more row to the table where a given salesperson has 2 or more orders that have the same value for the highest amount? For example, let’s add this row to the Orders table:


Number order_date cust_id salesperson_id Amount
80 02/19/94 7 2 2400

This now means that the salesperson with an ID of 2 has 2 orders with an amount of 2400 in the Orders table. And, if we run the SQL above again, we will get this as a result (note the extra row for Bob):


salesperson_id Name OrderNumber Amount
1 Abe 30 460
2 Bob 40 2400
7 Dan 60 720
8 Ken 20 1800
2 Bob 80 2400

Now, the question is if we only want one of Bob’s orders to show up, how can we eliminate the duplicate?
Again, try to figure this out on your own before reading our answer.

Well, we could add a GROUP BY salesperson_id, Amount to the end of the query, which would create separate groups for each unique combination of the salesperson ID and the Amount. This would give us a query that looks like this:

SELECT salesperson_id, Salesperson.Name, 
Number AS OrderNumber, Amount
FROM Orders
JOIN Salesperson 
ON Salesperson.ID = Orders.salesperson_id
JOIN (
SELECT salesperson_id, MAX( Amount ) AS MaxOrder
FROM Orders
GROUP BY salesperson_id
) AS TopOrderAmountsPerSalesperson
USING ( salesperson_id ) 
WHERE Amount = MaxOrder
GROUP BY salesperson_id, Amount

Now, running this query even with the duplicate row in the Orders table would return us this:


salesperson_id Name OrderNumber Amount
1 Abe 30 460
2 Bob 40 2400
7 Dan 60 720
8 Ken 20 1800

And that’s it – we are now good to go, and we have a final answer to this difficult interview question! This concludes our series of complex SQL interview questions – hopefully you found them challenging!


Hiring? Job Hunting? Post a JOB or your RESUME on our JOB BOARD >>

FOLLOW Varoon Sahgal, Author of ProgrammerInterviewon
  • Sashank Raj

    the best explanation ever..the corner case one was simply superb case.

  • Labhesh

    I think this can be made much simpler:

    here’s something I tried on SQLLite and it works:

    select max(orders.amount), customer.name ,sales.id, sales.name, orders.number
    from customer
    join orders on orders.cust_id = customer.id
    join salesperson sales on sales.id = orders.salesperson_id
    group by customer.name;

  • Anish Tuladhar

    select aa.salesperson_id, aa.Number, aa.Amount from orders aa inner join
    (
    select a.salesperson_id, max(Amount) amount from orders a left join salesperson b on a.salesperson_id = b.id
    group by a.salesperson_id
    ) bb
    on aa.salesperson_id = bb.salesperson_id and aa.Amount = bb.amount

  • Niranjan Sonachalam

    This can be easily achieved using Oracle Analytic functions as shown below.

    select distinct
    first_value(ord.amount) OVER (partition by slaes_person_id order by amount desc )as largest_order,
    first_value(sal.id) OVER (partition by slaes_person_id order by amount desc )as largest_sales_person_id,
    first_value(sal.name) OVER (partition by slaes_person_id order by amount desc )as largest_sales_person_name
    from
    test_orders ord,
    test_sales_person sal,
    test_customer cust
    where ord.slaes_person_id=sal.id
    and ord.customer_id=cust.id

  • Anil

    select s.id,s.name,b.numbers,c.name,B.amount from

    salesperson s,

    customer c,

    (select s.* from

    orders s,

    (select max(amount) as amount, salesperon_id from orders group by salesperon_id) A

    where s.salesperon_id=A.salesperon_id

    and s.amount=a.amount) B

    where s.id=b.salesperon_id

    and c.id=b.cust_id

  • Guest

    Why won’t this work?

    Select Salesperson.Name, Max(Orders.Amount) As MaxOrder, Customer.Name

    From Salesperson

    Inner join Order on Salesperson.id = order.salesperson_id

    Inner join Customer On Customer.id = orders.cust_id

    Group By Salesperson.Name, MaxOrder, Customer.Name

  • Vins

    SQL Query:
    select Orders.salesperson_id,Number as OrderNum, Amount from Orders

    JOIN (
    SELECT salesperson_id, MAX(Amount) AS MaxOrder
    FROM Orders
    GROUP BY salesperson_id
    )
    as TopOrderAmountsPerSalesperson
    ON orders.salesperson_id=TopOrderAmountsPerSalesperson.salesperson_id
    where Amount = MaxOrder

  • Sonlip

    select o.number, o.Amount, c.name, o.salesperson_id from [MyTest].[dbo].[order] o

    join customer c on o.cust_id = c.id

    join

    ( select max(o.amount) as maxAmount, o.salesperson_id as id from [MyTest].[dbo].[order] o

    join salesperson s on o.salesperson_id= s.id

    group by o.salesperson_id)s

    on o.Amount = s.maxAmount

    order by salesperson_id

  • swapnil p

    Hey man, u got it wrong:
    40 is not the order number against 2400, its 40. Also, customer name was also asked in question.2
    Bob

    40
    2400

    • swapnil p

      sorry, typo…
      its order number 20 thats should come against 2400

  • pri

    My version of tsql
    SELECT a.name,b.amount,b.number,c.name FROM salesperson a left join

    (SELECT Salespersonid, Amount,Number,cust_id,DENSE_RANK() Over (PARTITION BY Salespersonid ORDER BY Amount desc ) as AmountRank

    FROM Orders2) b

    on a.id=b.salespersonid

    Join customer c on c.id=b.cust_id

    WHERE b.AmountRank=1

  • sinha

    SELECT top 1 tblOrders.Amount,tblOrders.Number, dbo.tblSalesperson.Name,dbo.Customer.Name FROM
    dbo.tblOrders inner join dbo.tblSalesperson on dbo.tblSalesperson.Id= tblOrders.salesperson_id
    inner join dbo.Customer on dbo.Customer.Id=dbo.tblOrders.cust_id
    GROUP BY dbo.tblOrders.Amount,tblSalesperson.Name,tblOrders.Number ,dbo.Customer.Name
    order by amount desc

  • swarupa

    select top 1 orders.amount, salesperson.name, customers.name

    from ((Orders inner join salesperson on orders.salespersonid=salesperson.id )inner join

    customers on Customers.id=Orders.cust_id) order by orders.amount desc

  • Nilesh

    with CTEorder as

    (

    select S.NAME AS ‘Salesperson_Name’ , c.name as ‘Cust_Name’, o.numer as ‘order_no’,o.ORDER_DATE ,o.AMOUNT, ROW_NUMBER () over( partition by S.NAME, c.name order by o.AMOUNT desc) as rnk

    from SALESPERSON S inner join ORDERs o on s.ID=o.salesperson_id

    inner join CUSTOMER c on c.ID = o.CUST_ID

    )

    select Salesperson_Name, Cust_Name, order_no,ORDER_DATE, amount

    from CTEorder where rnk=1

  • Raman Bajaj

    Hi,
    The Below Query Will Work. I have tested it on oracle and it is giving desired output.

    with t (Numbers,SALEPERSON_Name,Cust_id,Amount) as
    (select Numbers,Name,Cust_id,Amount from
    (select Numbers,Name,Cust_id,Amount,row_number() over (partition by name order by amount desc) rn
    from
    (SELECT o.Numbers numbers ,s.name as name , o.cust_id cust_id ,max(Amount) amount
    FROM Orders o Inner JOIN Salesperson s
    ON s.ID = o.salesperson_id
    GROUP BY o.Numbers,s.name,o.cust_id)) where rn =1)

    Select t.Numbers as Order_Number, t.SALEPERSON_NAME as SALEPERSON_NAME,c.Name as Customer_Name,t.amount As MAX_AMOUNT
    from t,customer c
    where t.cust_id=c.Id
    order by order_NUMBER

    OUTPUT :-

    Order_Number SALEPERSON_NAME Customer_Name MAX_AMOUNT
    10 BOB Samsonic 2400
    20 KEN Samsonic 1800
    30 ABE Orange 460
    60 DAN panasung 720

  • Luke Swart

    Yes, JerryMie is right – the original question asks for the customer name and not the salesperson name, which is confusing. Just for fun, I wrote this query that gets both customer name and salesperson name. It also includes names for salesman who didn’t sell anything (via outer join):

    SELECT max_sale.amount amount, max_sale.Number number, max_sale.cust_name cust_name, s.name name
    FROM (SELECT max_order.Number Number, max_order.amount amount, max_order.sid sid, c.Name cust_name

    FROM (SELECT Number Number, o.Amount amount, salesperson_id AS sid, cust_id AS cid
    FROM Orders AS o, (SELECT MAX(Amount) amount, salesperson_id sid
    FROM Orders i
    GROUP BY salesperson_id) AS o2
    WHERE o.Amount=o2.amount AND
    o.salesperson_id=o2.sid) AS max_order JOIN
    Customer AS c ON
    c.id=max_order.cid) AS max_sale RIGHT OUTER JOIN
    Salesperson AS s ON
    s.id=max_sale.sid
    group by s.id

    Here’s the result:

    +——–+——–+———–+——-+
    | amount | number | cust_name | name |
    +——–+——–+———–+——-+
    | 460 | 30 | Orange | Abe |
    | 2400 | 40 | Samony | Bob |
    | NULL | NULL | NULL | Chris |
    | 720 | 60 | Panasung | Dan |
    | 1800 | 20 | Samsonic | Ken |
    | NULL | NULL | NULL | Joe |
    +——–+——–+———–+——-+

  • PG

    This is the working query for SQL Server

    Select s.name,c.name,o.number, o.amount

    FROM Orders o JOIN

    SalesPerson s ON

    s.ID = o.salesperson_id

    JOIN Customer c

    ON c.id = o.cust_id

    JOIN

    (Select salesperson_id, Max(amount) As MAxAmount

    From Orders

    Group by salesperson_id) AS d

    ON

    d.salesperson_id = o.salesperson_id

    WHERE Amount = MAxAmount

    Group by s.name,c.name,o.number, o.amount

  • vhf

    SELECT sp.name, MAX( amount ) , od.number, cst.name

    FROM salesperson AS sp
    INNER JOIN order AS od ON od.salesperson_id = sp.id
    INNER JOIN customer AS cst ON od.cust_id = cst.id
    GROUP BY sp.name

    this would work too, or am I wrong?

    • vhf

      never mind, it's explained in the first section why this would not work, sorry!

  • student

    Hi,
    The final query doesn't run in MS SQL Server because not all of the select fields are in the group by clause.
    How can I do the same in MS SQL Server?
    Thank for the helpful site.

    • vijay

      select distinct c.salesperson_id,a.name,b.name as cust_name,d.orderid, c.amount
      from test_salesperson a join (select salesperson_id,max(to_number(amount)) amount from test_orders group by salesperson_id) c on
      a.ID = c.salesperson_id
      join test_orders d on
      c.salesperson_id = d.salesperson_id and c.amount = d.amount
      join test_Customer b on
      d.cust_id = b.id;

  • JerryMie

    I believe the original question asked for the customer to whom the largest order belongs to and not the salesperson's name and therefore the query needs to be modified and maybe look something like this:
    SELECT salesperson_id, Customer.Name AS Customer,
    Number AS OrderNumber, Amount
    FROM Orders
    JOIN Customer
    ON Customer.ID = Orders.cust_id
    JOIN (
    SELECT salesperson_id AS sid, MAX( Amount ) AS MaxOrder
    FROM Orders
    GROUP BY salesperson_id
    ) AS TopOrders
    ON TopOrders.sid = Orders.salesperson_id
    WHERE Orders.Amount = TopOrders.MaxOrder

    • Saurabh

      good answer.. could be simplified more like the below :

      select o1.number, salesperson.name, customer.name, o1.amount from orders as o1

      INNER JOIN salesperson on o1.salesperson_id = salesperson.id

      INNER JOIN customer on o1.cust_id = customer.id

      where o1.amount =

      (select max(amount) from orders as o2 where o1.salesperson_id = o2.salesperson_id group by o2.salesperson_id);

  • Yogesh Singh

    Please correct the last result.

  • omar salem

    how is the very last result different from the one before it?
    (Bob) still appears 2 times

    • varoon10

      My mistake! I've removed the last entry for Bob in the last result. Thanks for pointing this out!