How to Join to first row

Question

I'll use a concrete, but hypothetical, example.

Each Order normally has only one line item:

Orders:

OrderGUID   OrderNumber
=========   ============
{FFB2...}   STL-7442-1      
{3EC6...}   MPT-9931-8A

LineItems:

LineItemGUID   Order ID Quantity   Description
============   ======== ========   =================================
{098FBE3...}   1        7          prefabulated amulite
{1609B09...}   2        32         spurving bearing

But occasionally there will be an order with two line items:

LineItemID   Order ID    Quantity   Description
==========   ========    ========   =================================
{A58A1...}   6,784,329   5          pentametric fan
{0E9BC...}   6,784,329   5          differential girdlespring

Normally when showing the orders to the user:

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID

I want to show the single item on the order. But with this occasional order containing two (or more) items, the orders would appear be duplicated:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         spurving bearing
KSG-0619-81   5          panametric fan
KSG-0619-81   5          differential girdlespring

What I really want is to have SQL Server just pick one, as it will be good enough:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan

If I get adventurous, I might show the user, an ellipsis to indicate that there's more than one:

OrderNumber   Quantity   Description
===========   ========   ====================
STL-7442-1    7          prefabulated amulite
MPT-9931-8A   32         differential girdlespring
KSG-0619-81   5          panametric fan, ...

So the question is how to either

eliminate "duplicate" rows
only join to one of the rows, to avoid duplication

First attempt

My first naive attempt was to only join to the "TOP 1" line items:

SELECT Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM Orders
    INNER JOIN (
       SELECT TOP 1 LineItems.Quantity, LineItems.Description
       FROM LineItems
       WHERE LineItems.OrderID = Orders.OrderID) LineItems2
    ON 1=1

But that gives the error:

The column or prefix 'Orders' does not
match with a table name or alias name
used in the query.

Presumably because the inner select doesn't see the outer table.

I think (and correct me if I'm wrong) group by would require listing all the other columns, excluding the one where you don't want duplicates. Source — jyn, Jun 1, 2018 at 12:25
Does this answer your question? Fetch the rows which have the Max value for a column for each distinct value of another column — philipxy, Jan 18, 2023 at 2:21

Alexander Schmidt · Accepted Answer · 2020-11-20 17:11:13Z

1496

+50

SELECT   Orders.OrderNumber, LineItems.Quantity, LineItems.Description
FROM     Orders
JOIN     LineItems
ON       LineItems.LineItemGUID =
         (
         SELECT  TOP 1 LineItemGUID 
         FROM    LineItems
         WHERE   OrderID = Orders.OrderID
         )

In SQL Server 2005 and above, you could just replace INNER JOIN with CROSS APPLY:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
CROSS APPLY
        (
        SELECT  TOP 1 LineItems.Quantity, LineItems.Description
        FROM    LineItems
        WHERE   LineItems.OrderID = Orders.OrderID
        ) LineItems2

Please note that TOP 1 without ORDER BY is not deterministic: this query you will get you one line item per order, but it is not defined which one will it be.

Multiple invocations of the query can give you different line items for the same order, even if the underlying did not change.

If you want deterministic order, you should add an ORDER BY clause to the innermost query.

Example sqlfiddle

edited Nov 20, 2020 at 17:11

Alexander Schmidt

5,6834 gold badges41 silver badges79 bronze badges

answered Jan 11, 2010 at 16:48

Quassnoi

420k91 gold badges624 silver badges619 bronze badges

3

Excellent, that works; moving TOP 1 from derived table clause to join clause.
– Ian Boyd
Jan 11, 2010 at 16:54
133

and the "OUTER JOIN" equivalent would be "OUTER APPLY"
– Alex from Jitbit
Dec 22, 2011 at 10:41
10

How about for LEFT OUTER JOIN?
– Alex Nolasco
Jan 31, 2012 at 23:11
8

How do you do this if the join is via a compound key/has multiple columns?
– Brett Ryan
Sep 12, 2012 at 7:20
13

CROSS APPLY instead INNER JOIN and OUTER APPLY instead LEFT JOIN (the same as LEFT OUTER JOIN).
– hastrb
Mar 15, 2018 at 12:04

| Show 16 more comments

Justin Fisher · Accepted Answer · 2012-04-06 22:34:38Z

149

I know this question was answered a while ago, but when dealing with large data sets, nested queries can be costly. Here is a different solution where the nested query will only be ran once, instead of for each row returned.

SELECT 
  Orders.OrderNumber,
  LineItems.Quantity, 
  LineItems.Description
FROM 
  Orders
  INNER JOIN (
    SELECT
      Orders.OrderNumber,
      Max(LineItem.LineItemID) AS LineItemID
    FROM
      Orders INNER JOIN LineItems
      ON Orders.OrderNumber = LineItems.OrderNumber
    GROUP BY Orders.OrderNumber
  ) AS Items ON Orders.OrderNumber = Items.OrderNumber
  INNER JOIN LineItems 
  ON Items.LineItemID = LineItems.LineItemID

edited Apr 6, 2012 at 22:34

answered Apr 6, 2012 at 21:25

Justin Fisher

1,8731 gold badge13 silver badges12 bronze badges

3

This is also much faster if your 'LineItemId' column is not indexed properly. Compared to the accepted answer.
– GER
Jan 20, 2015 at 16:43
5

But how would you do this if Max is not usable as you need to order by a column different to the one you want to return?
– NickG
Apr 24, 2015 at 16:04
2

you can order the derived table whichever way you want and use TOP 1 in SQL Server or LIMIT 1 in MySQL
– stifin
Jun 9, 2015 at 10:39
1

Found this to be much faster on larger data sets
– DotNetDublin
Jan 11, 2021 at 16:27
1

Could you please elaborate? As far as only syntax is concerned, your answer is as nested as Quassnoi's: exactly one subquery. You cannot just imply that one will run "for each row returned" and the other will not just because the syntax seems so. You have to include a plan.
– George Menoutis
Jan 21, 2021 at 18:15

| Show 1 more comment

Community · Accepted Answer · 2017-04-13 12:42:38Z

63

@Quassnoi answer is good, in some cases (especially if the outer table is big), a more efficient query might be with using windowed functions, like this:

SELECT  Orders.OrderNumber, LineItems2.Quantity, LineItems2.Description
FROM    Orders
LEFT JOIN 
        (
        SELECT  LineItems.Quantity, LineItems.Description, OrderId, ROW_NUMBER()
                OVER (PARTITION BY OrderId ORDER BY (SELECT NULL)) AS RowNum
        FROM    LineItems

        ) LineItems2 ON LineItems2.OrderId = Orders.OrderID And RowNum = 1

Sometimes you just need to test which query gives better performance.

edited Apr 13, 2017 at 12:42

CommunityBot

11 silver badge

answered Mar 3, 2016 at 16:14

BornToCode

9,8759 gold badges70 silver badges88 bronze badges

4

This is the only answer I found that does a real "Left" join, meaning it does not add any more lines then is in the "Left" table. You just need to put in subquery and add "where RowNum is not null"
– user890332
May 10, 2019 at 17:55
2

Agreed this is the best solution. This solution also does not require you to have a unique ID in the table you're joining to, and is much faster than the top voted answer. You can also add criteria for which row you prefer to return, rather than just taking a random row, by using an ORDER BY clause in the subquery.
– Geoff Griswald
Oct 3, 2019 at 14:06
This is a good solution. Please note: when using for your own situation, be very careful how you PARTION BY (usually you do probably want some ID column there) and ORDER BY (which could be done by most anything, depending on which row you want to keep, e.g. DateCreated desc would be one choice for some tables, but it would depend on a lot of things)
– JosephDoggie
Mar 23, 2020 at 20:30
What does imply the (SELECT NULL) in the OVER clause?
– Simone
Sep 8, 2021 at 9:30
Thanks. This worked for me as well as the accepted answer with left join was giving all the rows in the second table which I didn't wanted.
– Varun Sharma
Nov 15, 2022 at 19:40

Add a comment |

Tomalak · Accepted Answer · 2010-01-11 16:50:24Z

30

You could do:

SELECT 
  Orders.OrderNumber, 
  LineItems.Quantity, 
  LineItems.Description
FROM 
  Orders INNER JOIN LineItems 
  ON Orders.OrderID = LineItems.OrderID
WHERE
  LineItems.LineItemID = (
    SELECT MIN(LineItemID) 
    FROM   LineItems
    WHERE  OrderID = Orders.OrderID
  )

This requires an index (or primary key) on LineItems.LineItemID and an index on LineItems.OrderID or it will be slow.

answered Jan 11, 2010 at 16:50

Tomalak

335k67 gold badges541 silver badges634 bronze badges

2

This does not work if an Orders has no LineItems. The sub-expression then evaluates LineItems.LineItemID = null and removes the left entity orders completely from the result.
– leo
Jul 2, 2015 at 9:53
8

That's also the effect of the inner join, so... yeah.
– Tomalak
Jul 2, 2015 at 9:58
1

Solution that can be adapted for LEFT OUTER JOIN: stackoverflow.com/a/20576200/510583
– leo
Jul 2, 2015 at 10:38
5

@leo Yes, but the OP used an inner join himself, so I don't understand your objection.
– Tomalak
Jul 2, 2015 at 10:56

Add a comment |

P. Olesen · Accepted Answer · 2018-11-28 11:55:08Z

27

From SQL Server 2012 and onwards I think this will do the trick:

SELECT DISTINCT
    o.OrderNumber ,
    FIRST_VALUE(li.Quantity) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Quantity ,
    FIRST_VALUE(li.Description) OVER ( PARTITION BY o.OrderNumber ORDER BY li.Description ) AS Description
FROM    Orders AS o
    INNER JOIN LineItems AS li ON o.OrderID = li.OrderID

answered Nov 28, 2018 at 11:55

P. Olesen

3043 silver badges4 bronze badges

1

Doesn't the "INNER JOIN LineItems" cause multiple rows to be returned when an order has more than one line item?
– David Pierson
Jul 27, 2022 at 7:15
1

The DISTINCT should take care of this
– P. Olesen
Sep 16, 2022 at 13:32

Add a comment |

avb · Accepted Answer · 2017-05-10 10:18:11Z

,Another aproach using common table expression:

with firstOnly as (
    select Orders.OrderNumber, LineItems.Quantity, LineItems.Description, ROW_NUMBER() over (partiton by Orders.OrderID order by Orders.OrderID) lp
    FROM Orders
        join LineItems on Orders.OrderID = LineItems.OrderID
) select *
  from firstOnly
  where lp = 1

or, in the end maybe you would like to show all rows joined?

comma separated version here:

  select *
  from Orders o
    cross apply (
        select CAST((select l.Description + ','
        from LineItems l
        where l.OrderID = s.OrderID
        for xml path('')) as nvarchar(max)) l
    ) lines

Racil Hilan · Accepted Answer · 2016-09-17 10:38:08Z

12

Correlated sub queries are sub queries that depend on the outer query. It’s like a for loop in SQL. The sub-query will run once for each row in the outer query:

select * from users join widgets on widgets.id = (
    select id from widgets
    where widgets.user_id = users.id
    order by created_at desc
    limit 1
)

edited Sep 17, 2016 at 10:38

Racil Hilan

25k13 gold badges54 silver badges59 bronze badges

answered Sep 17, 2016 at 10:19

Abdullah Yousuf

4095 silver badges7 bronze badges

Add a comment |

Anand · Accepted Answer · 2017-05-09 18:12:59Z

My favorite way to run this query is with a not exists clause. I believe this is the most efficient way to run this sort of query:

select o.OrderNumber,
       li.Quantity,
       li.Description
from Orders as o
inner join LineItems as li
on li.OrderID = o.OrderID
where not exists (
    select 1
    from LineItems as li_later
    where li_later.OrderID = o.OrderID
    and li_later.LineItemGUID > li.LineItemGUID
    )

But I have not tested this method against other methods suggested here.

Peter Radocchia · Accepted Answer · 2010-01-11 17:23:56Z

4

EDIT: nevermind, Quassnoi has a better answer.

For SQL2K, something like this:

SELECT 
  Orders.OrderNumber
, LineItems.Quantity
, LineItems.Description
FROM (  
  SELECT 
    Orders.OrderID
  , Orders.OrderNumber
  , FirstLineItemID = (
      SELECT TOP 1 LineItemID
      FROM LineItems
      WHERE LineItems.OrderID = Orders.OrderID
      ORDER BY LineItemID -- or whatever else
      )
  FROM Orders
  ) Orders
JOIN LineItems 
  ON LineItems.OrderID = Orders.OrderID 
 AND LineItems.LineItemID = Orders.FirstLineItemID

edited Jan 11, 2010 at 17:23

answered Jan 11, 2010 at 16:59

Peter Radocchia

10.8k2 gold badges34 silver badges57 bronze badges

Add a comment |

Krease · Accepted Answer · 2013-02-14 22:07:30Z

3

Tried the cross, works nicely, but takes slightly longer. Adjusted line columns to have max and added group which kept speed and dropped the extra record.

Here's the adjusted query:

SELECT Orders.OrderNumber, max(LineItems.Quantity), max(LineItems.Description)
FROM Orders
    INNER JOIN LineItems 
    ON Orders.OrderID = LineItems.OrderID
Group by Orders.OrderNumber

edited Feb 14, 2013 at 22:07

Krease

16k8 gold badges55 silver badges87 bronze badges

answered Feb 14, 2013 at 21:47

ernst

711 bronze badge

15

But having max separately on two columns means the quantity might not be related to the description. If the order was 2 Widgets and 10 Gadgets, the query would return 10 Widgets.
– Brianorca
Aug 4, 2015 at 23:52

Add a comment |

Collectives™ on Stack Overflow

How to Join to first row

First attempt

10 Answers 10

Your Answer

Not the answer you're looking for? Browse other questions tagged
sql
sql-server
t-sql
sql-server-2000
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

First attempt

10 Answers 10

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged sqlsql-servert-sqlsql-server-2000 or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
sql
sql-server
t-sql
sql-server-2000
or ask your own question.