Not all special characters cause an issue especially if most XML is generated using FOR XML PATH which takes care of encoding in 99% cases and encode any special characters not permitted in an XML document. The TYPE directive also helps as it creates the XML datatype.
CDATA or validation function? Neither scenario is pretty or quick.
CDATA can only be used with FOR XML EXPLICIT. CDATA is not preserved when used with FOR XML PATH. This would mean inserting EXPLICIT modes into existing FOR XML PATH code.
The only scenarios where FOR XML PATH failed (for me) is when using CHAR(25) and CHAR(147). We could use REPLACE or CDATA to resolve the issue.
Creating some REPLACE function seems easier to implement but we’d need to decide what special characters will be replaced.
CDATA is more work to implement, but we would not have to worry about what special character we’re dealing with.
I don’t really provide any solution here… it’s just some research I’ve done at the time.
CREATE TABLE #Temp (TempId INT, specialChar varchar(500), specialCharXml XML )
INSERT INTO #Temp (TempId, specialChar) VALUES (1, '€ Euro, à A tilde, † dagger, Š S caron (hacek), œ oe, ©, ¶ , » ÿ ёйцбджюхъэ È')
INSERT INTO #Temp (TempId, specialChar) VALUES (2, '
- #x9 #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] ')
INSERT INTO #Temp (TempId, specialChar) VALUES (3, '7308098ahdjkaikdmlhasl ')
INSERT INTO #Temp (TempId, specialChar) VALUES (4, N'')
INSERT INTO #Temp (TempId, specialChar) VALUES (5, '"`1234567890-=¬!"£$%^&*()_+¦€qwertyuiop[]QWERTYUIOP{}asdfghjkl;'#ASDFGHJKL:@~\zxcvbnm,./|ZXCVBNM<>?/*-7894561230." />')
INSERT INTO #Temp (TempId, specialChar) VALUES (6, char(25))
INSERT INTO #Temp (TempId, specialChar) VALUES (7, char(147))
INSERT INTO #Temp (TempId, specialChar) VALUES (8, 'Some Value ' + CHAR(25) + 'Some OtherValue')
INSERT INTO #Temp (TempId, specialChar) VALUES (9, '♣ ◊ ♦')
INSERT INTO #Temp (TempId, specialChar) VALUES (10,'`1234567890-=¬!"£$%^&*()_+¦€qwertyuiop[]QWERTYUIOP{}asdfghjkl;'#ASDFGHJKL:@~\zxcvbnm,./|ZXCVBNM<>?/*-7894561230.')
SELECT REPLACE(REPLACE(specialChar, CHAR(25), ''), CHAR(147), '')
FROM #Temp
FOR XML PATH ('Row'), ROOT ('Data'), TYPE
DECLARE @TempId INT, @XML XML
SET @TempId = 1
WHILE (@TempId < 11)
BEGIN
SET NOCOUNT ON
SET @XML = NULL
BEGIN TRY
SET @XML = (
SELECT *
FROM #Temp
WHERE TempId = @TempId
FOR XML PATH ('Row'), ROOT ('Data'), TYPE -- FAILS char(25), char(147)
)
SELECT @TempId as TempID, @XML AS MyXML, @@ERROR AS Error
END TRY
BEGIN CATCH
SELECT @TempId as TempID, @XML AS MyXML, @@ERROR AS Error
END CATCH
SET @TempId = @TempId + 1
END
SELECT 1 as Tag,
NULL as Parent,
specialChar as [temp!1!specialChar],
specialChar as [temp!1!!CDATA], -- FAILS IF THIS COLUMN INCLUDED
'some description' as [temp!1!!CDATA]
FROM #Temp FOR XML EXPLICIT, TYPE
-- successful
SELECT 1 As Tag,
NULL as Parent,
(SELECT specialChar as 'data()'
FROM #Temp Temp1
WHERE Temp1.TempId = Temp2.TempId
FOR XML PATH('')) AS 'temp!1!specialChar!cdata'
FROM #Temp Temp2
FOR XML Explicit
-- successful
SELECT 1 As Tag,
NULL as Parent,
(SELECT specialChar as 'data()'
FROM #Temp Temp1
WHERE Temp1.TempId = Temp2.TempId
FOR XML PATH('')) AS 'temp!1!specialChar!cdata'
FROM #Temp Temp2
FOR XML Explicit, TYPE
-- fails
SELECT * FROM #Temp WHERE TempId = 8 FOR XML PATH ('Row'), ROOT ('Data') -- DOES NOT FAILS char(25)
SELECT * FROM #Temp WHERE TempId = 8 FOR XML PATH ('Row'), ROOT ('Data'), TYPE -- FAILS char(25)
-- able to reproduce if directly assigning varchar to an xml
DECLARE @xml xml;
SET @xml=N'';
select @xml
DROP TABLE #Temp
-- EXPLICIT Mode: http://msdn.microsoft.com/en-us/library/ms189068.aspx
-- Encoding: http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Character_entities_in_XML
-- Invalid characters and escape rules: http://msdn.microsoft.com/en-us/library/bb500235.aspx
-- Remove Special Characters function: http://geekswithblogs.net/Gaurav/archive/2008/04/01/120947.aspx
-- Remove Special Characters: http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL-Server-2005/Q_24583739.html