What's new in SQL Server 2016: Dynamic Data Masking
This post is part of the series 'Microsoft SQL Server 2016'. Be sure to check out the rest of the blog posts of the series!
- What's new in SQL Server 2016: JSON
- Comparing execution plans with SQL Server
- What's new in SQL Server 2016: Dynamic Data Masking (this post)
- What's new in SQL Server 2016: Row-Level Security
- What's new in SQL Server 2016: Stretch Database
- What's new in SQL Server 2016: Live Query Statistics
- What's new in SQL Server 2016: Temporal Tables
- What's new in SQL Server 2016: Always encrypted
- SQL Server 2016 SP1 - Licensing changes
The final version of SQL Server 2016 has been released recently. So it's time to take a look at the new features provided by this version. Today we will see Dynamic Data Masking. This feature already available on SQL Azure for some time is now available to SQL Server.
In some tables we may be brought to store sensitive information, or at least information that we do not want all users can access: blue card number, email address, etc. Currently to protect this data one can create views or stored procedures masking the desired columns, but this takes time and is difficult to set depending on the logged-in user.
Dynamic Data Masking tries to answer this need by allowing to hide the contents of certain columns according to the rights of the connected user. Before seeing how this is parameterized, here is the result of the same query with 2 different users not having the same permissions:
As we can see, the idea is not to remove columns, but to anonymize the content. For example, a telephone company may ask you for the last X digits of your bank card and verify that it is yours, but will not have access to the full amount. On the other hand, a user with sufficient rights will be able to see the data not masked.
#Setting up Mask
Before using this feature, activate it:
DBCC TRACEON(209,219,-1)
Let's create the table:
CREATE TABLE [dbo].[Customer](
[Id] [uniqueidentifier] NOT NULL,
[Name] [nvarchar](100) NOT NULL,
[CardNumber] [nchar](20) NULL,
[Email] [nvarchar](255) NULL
)
To enable data masking, you must change the definition of the desired columns:
ALTER TABLE Customer
ALTER COLUMN Name ADD MASKED WITH (FUNCTION = 'default()')
The replacement of the data is defined with FUNCTION = 'default()'
. 3 functions are available:
default()
: replace the value with "XXXX" for a character string or its default value (0 for a number, 1/1/0001 for a date, etc.)email()
: keeps the first letter of the email as well as the extension of the domainpartial(prefix_length, 'XXX', suffix_length)
: replace the text with the first N characters of the original string followed by the desired string ('XXX' in the example) followed by the N characters of the original string
For example to filter a credit card number one can use the following filter:
ALTER TABLE Customer
ALTER COLUMN CardNumber ADD MASKED WITH (FUNCTION = 'partial(0, "XXXX-XXXX-XXXX-", 4)')
-- XXXX-XXXX-XXXX-1234
For a phone number you can keep the first 4 digits:
ALTER TABLE Customer
ALTER COLUMN PhoneNumber ADD MASKED WITH (FUNCTION = 'partial(4, "000000", 0)')
-- 0175000000
For the email address we use the function dedicated to this use:
ALTER TABLE Customer
ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()')
-- cXXX@XXXX.com
#Set permissions
If you execute the SELECT * FROM Customer
query, you most likely do not see the masks. Indeed you surely have permission UNMASK
. We will create a new user without this permission and execute the query with his identity:
CREATE USER TestUser WITHOUT LOGIN;
GRANT SELECT ON Customer TO TestUser;
EXECUTE AS USER = 'TestUser';
SELECT * FROM Customer;
REVERT;
The data is hidden. Now add the UNMASK
permission to the user and look at the difference:
GRANT UNMASK TO TestUser;
EXECUTE AS USER = 'TestUser';
SELECT * FROM Customer;
REVERT;
This time the user can see the unmasked data.
#Conclusion
This new feature makes it easy to improve data privacy without changing the content of the database. According to the rights of the user, SQL Server will anonymize the data automatically according to the specified rules.
Do you have a question or a suggestion about this post? Contact me!