nvarchar(n) vs nvarchar(max) performance in MS-SQL Server

Yesterday, I received a question about the difference between nvarchar(n) and nvarchar(max).

  • Why don’t we always use nvarchar(max)?
  • Are nvarchar(n) and nvarchar(max) different in performance ?
  • Are nvarchar(n) and nvarchar(max) different in storage size ?

NVARCHAR(N)

First of all, What is nvarchar(n)?

nvarchar [ ( n | max ) ]
Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size, in bytes, is two times the actual length of data entered + 2 bytes. The ISO synonyms for nvarchar are national char varying and national character varying.

(From:  Microsoft Doc – nchar and nvarchar)

“n” in nvarchar(n) define the string length from 1 to 4,000. If your string is longer than 4,000 characters, you can define nvarchar(max) instead.

So, I want to store a string with 8 characters: “hungdoan”. What is the difference between nvarchar(8) and nvarchar(4000) ?

The answer is that: there is no difference in performance and storage size between varchar (8) and varchar (4000).

However, do not use a large “n”. Just use a small enough. Because the actual size of your string will impact your indexing in terms of performance, and your string can’t be indexed if It’s greater than 900 bytes (450 Unicode characters)

NVARCHAR(N) vs NVARCHAR(MAX)

There is an interesting thing is that: if you change nvarchar(8) or nvarchar(4000) to nvarchar(max). There is a difference in terms of performance & storage size. 

Wow, why is this happen? It’s because of the difference between the storage method in MS-SQL.

In MS-SQL Server all data rows are stored in Pages.

Data rows are put on the page serially, starting immediately after the header. A row offset table starts at the end of the page, and each row offset table contains one entry for each row on the page. Each entry records how far the first byte of the row is from the start of the page. The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.

From: Microsoft Doc – Understanding Pages and Extents

Your table is stored as a collection of 8-KB pages.

(From Microsoft Doc  – Table and Index Organization)

In the image above you can see that there are 3 types of allocation Units

  • Data (IN_ROW_DATA) : Data or index rows that contain all data, except large object (LOB) data.
  • LOB (LOB_DATA): Large object data stored in one or more of these data types: text, ntext, image, xml, varchar(max), nvarchar(max), varbinary(max), or CLR user-defined types (CLR UDT).
  • Row overflow (ROW_OVERFLOW_DATA): Variable length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8,060 byte row size limit.

NVARCHAR(MAX) will be slower!

You can see that: If we use nvarchar(max) your data will be stored in LOB_DATA instead of IN_ROW_DATA or ROW_OVERFLOW_DATA, this is the main reason that cause the performance difference.

By default, all rows are stored in IN_ROW_DATA, which gives you a great performance.

But in some cases, if your row is larger than 8KB or your data is LOB_DATA  – nvarchar(max). Then MS-SQL will split your columns into another area (LOB_DATA or ROW_OVERFLOW_DATA) and leave a pointer in IN_ROW_DATA to point to LOB_DATA/ROW_OVERFLOW_DATA. 

That is the reason why nvarchar(max) will be slower than nvarchar(n) as it would take more steps to lookup our data.

About the storage size, there is a tiny different: extra bytes to a pointer. So, you do not need to worry about the storage impact between nvarchar(n) and nvarchar(max).

Conclusion

Why don’t we always use nvarchar(max)  ?

  • nvarchar(max) is slower than nvarchar(n).
  • you can’t index nvarchar(max) column

Are nvarchar(n) and nvarchar(max) different in performance?

  • Yes, nvarchar(max) will be slower than nvarchar(n) because its storage structure requires more steps to organize and query data.

Are nvarchar(n) and nvarchar(max) different in storage size ?

  • Yes, nvarchar(max) will add some extra bytes for pointers. But it’s not a big deal to worry about (in some cases, if your row size is larger than 8KB the nvarchar(n) will add some extra byte for pointers as well)

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.

I’m Hung

Welcome to my personal space. This is a small corner where I share ideas and things I find interesting,..

Let’s connect

Don’t hesitate to reach out if there’s anything we can discuss, or simply to connect :)