Sebastian
After a little research on DataType ntext I found the following information which might explain my problem. Any suggestions that might help?
Tom
_________________________________________________________________________________________________________
After much researching it turns out the problem is that the column the text was stored in was an NTEXT, but SQL 2005 has deprecated the NTEXT in favor of NVARCHAR(MAX). Now, you'd think that string functions on NTEXT would have the same performance on 2005 as it did on 2000, but thats not the case.
Ok, so NTEXT is old badness, and NVARCHAR(MAX) is new goodness. Then the next logical step would be to convert the column to be a NVARCHAR(MAX) data type, but here lies a little but very important gotcha.
By default NTEXT stores the text value in the LOB structure and the table structure just holds a pointer to the location in the LOB where the text lives.
Conversely, the default setting for NVARCHAR(MAX) is to store its text value in the table structure, unless the text is over 8,000 bytes at which point it behaves like an NTEXT and stores the text value in the LOB , and stores a pointer to the text in the table.
So, just to recap, the default settings for NTEXT and NVARCHAR(MAX) are completely opposite.
Now, what do you think will happen when you execute an ALTER COLUMN on a NTEXT column that changes the data type to a NVARCHAR(MAX)? Where do you think the data will be stored? In the LOB structure or the table structure?
Well, lets walk through an example. First create a table with one NTEXT column:
CREATE TABLE [dbo].[testTable](
[testText] [ntext] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Next, put 20 rows in the table:
INSERT INTO testTable SELECT 'hmmm...i wonder if this will work'
Then run a select query with IO STATISTICS:
SET STATISTICS IO ON
SELECT * FROM testTable
SET STATISTICS IO OFF
Now, looking at the IO stats, we see there was only 1 logical read, but 60 LOB logical reads. This is pretty much as expected as NTEXT stores its text value in the LOB not the table:
Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 60, lob physical reads 0, lob read-ahead reads 0.
Now, lets alter the table to be an NVARCHAR(MAX):
ALTER TABLE testTable ALTER COLUMN testText NVARCHAR(MAX) null
Now when we run the select query again with UI STATISTICS we still get a lot of LOB reads (though less than we did with NTEXT). So its obvious that when SQL Server did the alter table, it didn't use the default NVARCHAR(MAX) setting of text in row, but kept the text in the LOB and still uses pointers lookups to get the text out of the LOB.
Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 40, lob physical reads 0, lob read-ahead reads 0.
This is not as expected and can be devastating for performance if you don't catch it, since NVARCHAR(MAX) with text not in row actually performs WORSE than NTEXT when doing SUBSTRING calls.
So how do we fix this problem? Its actually fairly easy. After running your alter table, run an update statement setting the column value to itself, like so:
UPDATE testTable SET testText = testText
SQL server moves the text from the LOB structure to the table (if less than 8,000 bytes). So when we run the select again with IO STATISTICS we get 0 LOB reads.
Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
YEA! This is what we want.
Now, just for grins, what do you think happens if we change the NVARCHAR(MAX) back to NTEXT? Well it turns out that SQL Server moves the text back to the LOB structure. Completely backwards from what it did when converting NTEXT to NVARCHAR(MAX).
____________________________________________________________________________
Important:
ntext, text, and image data types will be removed in a future version of Microsoft SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use nvarchar(max), varchar(max), and varbinary(max) instead.
Fixed and variable-length data types for storing large non-Unicode and Unicode character and binary data. Unicode data uses the UNICODE UCS-2 character set.
________________________________________________________________________
As of SQL Server 2005, it is recommended that you use VARCHAR(MAX) which expands the storage capacity from 8,000 characters to 2,147,483,648 characters.