Merging tables with “Union” and “Union All”

I’m going to write about SQL Unions. This is a really handy function that lets you join 2 or more tables into one big “uber-table”.

The syntax is quite simple. It’s either Union or Union All. 

You would use these functions if you needed to join two (or more) tables that were the same structure but in completely separate databases. This is common in companies split into several businesses. The accounts or business package to record transactions for the different businesses is often used for each of these businesses so that records can be held separately. The management team of this company will probably want to consolidate the transactions for their monthly reporting so that they can see the results for the larger company as a whole.

Here is an example of how the script would be written:

Select

SalesOrderNumber

,OrderDate

,CustomerID

,ProductID

,OrderQty

,UnitPrice

,UnitPriceDiscount

,LineTotal

 

From AdventureWorks.Sales.SalesOrderDetail

Inner Join AdventureWorks.Sales.SalesOrderHeader OnSales.SalesOrderDetail.SalesOrderID=Sales.SalesOrderHeader.SalesOrderID

 

Union All

 

Select

SalesOrderNumber

,OrderDate

,CustomerID

,ProductID

,OrderQty

,UnitPrice

,UnitPriceDiscount

,LineTotal

 

From AdventureWorksLT.SalesLT.SalesOrderDetail

Inner Join AdventureWorksLT.SalesLT.SalesOrderHeader OnAdventureWorksLT.SalesLT.SalesOrderDetail.SalesOrderID=

AdventureWorksLT.SalesLT.SalesOrderHeader.SalesOrderID

 

This script, when run, will merge the two select queries and present the data as one continuous table. There are some rules involved though, but when you think about them they are quite reasonable.

 

  1.       The fields from each select statement must be the same type
  2.       These fields must also have the same name
  3.       …However, the table names must be different
  4.       You must use the same number of fields in each select statement i.e. if you have eight fields in the first select statement, then you must have eight fields in all subsequent select statements

 

Theoretically this means that you can only use Unions on tables that have an identical design, however in practice there is a way of manipulating the script so that two tables of different design can still be joined using the Union function. For example:

Select

Title

,FirstName

,MiddleName

,LastName

,EmailAddress

,Phone

,0 As YearlyIncome

,0 As TotalChildren

From AdventureWorks.Person.Contact

 

Union All

 

Select

Title

,FirstName

,MiddleName

,LastName

,EmailAddress

,Phone

,YearlyIncome

,TotalChildren

From AdventureWorksDW..DimCustomer

 

The AdventureWorks.Person.Contact table does not contain the “YearlyIncome” or “TotalChildren” fields but in order to merge the two tables, I have to “build them in” to the script somehow. The beauty of SQL scripts is that you can “slice and dice” tables as you need to, so in this case I simply “created” the missing fields in the AdventureWorks.Person.Contact table by adding these “fields” to the table:

,0 As YearlyIncome

,0 As TotalChildren

I used the zero’s here because the YearlyIncome and TotalChildren fields are both numeric. If these fields had been character fields, then I would have done this instead:

,” As YearlyIncome

,” As TotalChildren

 

Whenever you want to include your own text in a SQL script you always put your text  between ”.

There is nothing between the single quotes ( in the last example), so this counts as a blank field, and blank fields count as text too.

If I were to save this Union All script as a view, I could do it in another way too:

Create View CustomerListView

(Title

,FirstName

,MiddleName

,LastName

,EmailAddress

,Phone

,YearlyIncome

,TotalChildren)

 

As

 

Select

Title

,FirstName

,MiddleName

,LastName

,EmailAddress

,Phone

,0

,0

From AdventureWorks.Person.Contact

 

Union All

Select

Title

,FirstName

,MiddleName

,LastName

,EmailAddress

,Phone

,YearlyIncome

,TotalChildren

From AdventureWorksDW..DimCustomer

This is particularly useful when the tables being joined have field names that are different. It’s easier to declare the field names for your view upfront and then not have to worry about renaming each field again throughout the script.

Finally, in the beginning of this article, I mentioned two types of Unions: “Union”; and “Union All”.

The difference between the two is that “Union All” will return all data from both tables, even if joining these tables will create duplicate records. If you want to filter duplicate records out, then you should use the “Union” function. But there is something else you should know about the “Union” function. The filtering of duplicate data means that SQL must check every record of each of the tables being merged for the duplicates. This uses more of the server’s resources and takes more time to complete, so the “Union All” is definitely the quicker, more efficient function of the two (even if there turns out to be no duplicates after all – SQL still has to check for that).

The other factor to consider, is that a duplicate record is considered a duplicate if all the fields of each record match e.g. Title, FirstName,MiddleName, LastName, EmailAddress, Phone, YearlyIncome, TotalChildren must all match in both tables. So in the second example above, SQL will probably deem there to be no duplicates because the first table (AdventureWorks.Person.Contact) has the “YearlyIncome” and “TotalChildren” fields built in. So unless the “YearlyIncome” and “TotalChildren” of AdventureWorksDW..DimCustomer is also zero for both fields the check for duplicates will not work!

I would always suggest that you start with “Union All” first, and then change to just “Union” if you spot any duplicate records in your query.

Aly Chiman

Aly Chiman is a Blogger & Reporter at AlyChiTech.com which covers a wide variety of topics from local news from digital world fashion and beauty . AlyChiTech covers the top notch content from the around the world covering a wide variety of topics. Aly is currently studying BS Mass Communication at University.