MySQL datamasking using ProxySQL

lefred
October 7, 2016
Conference, General, HA, MySQL, ProxySQL
13 Comments

Percona Live Amsterdam is just finished… there was an entire room full day about MySQL 8.0 which is wonderful but there was also another product that everybody was talking about: ProxySQL (great job René).

Ronald Bradford and the MySQL engineering team were discussing about what tool was missing in the MySQL ecosystem and Mark Leith highlighted that a datamasking utility was really missing.

Having used already ProxySQL for Group Replication and other tests, I knew that it would be possible to use it also for this purpose. After a very short discussion and a nightly chat with René, this is an example on how you can achieve this missing (and sometime heavy) task.

Description:

one CUSTOMERS table with sensitive information like Credit Card numbers (those are fake)
a developer user that doesn’t need to access those numbers

So we decided this:

the developer can only use SELECT (and EXECUTE)
the developer can access all columns
but credit card numbers (cc_num) must be masked, but the last 4 digits
the developer can’t perform SELECT * on that specific table

This is our table:

+----+-----------+-------------+------------+------------------+----------+
| id | firstname | lastname    | cc_type    | cc_num           | cc_verif |
+----+-----------+-------------+------------+------------------+----------+
|  1 | Frederic  | Descamps    | mastercard | 5275653223285289 |      456 |
|  8 | Dim0      | Vanoverbeke | mastercard | 5345654523285289 |      123 |
| 15 | Kenny     | Gryp        | visa       | 4916066793184589 |      456 |
+----+-----------+-------------+------------+------------------+----------+

So first we will create on our database server the developer’s user:

MySQL> GRANT SELECT, USE on myapp.* to 'devel'@'%' identified by 'password';

Now let’s configure ProxySQL (I assume here that you already have installed and started proxysql) from it’s admin interface:

# mysql -u admin -padmin -h 127.0.0.1 -P6032

The first step is to create a user here too:

ProxySQL> INSERT INTO mysql_users(username,password,default_hostgroup) 
          VALUES ('devel','password',1);

and now our MySQL server if it's not yet referenced:

ProxySQL> INSERT INTO mysql_servers(hostgroup_id,hostname,port) 
          VALUES (1,'mysql1',3306);

We can load those settings in runtime and then save then to disk too:

ProxySQL> LOAD MYSQL SERVERS TO RUNTIME;
ProxySQL> LOAD MYSQL USERS TO RUNTIME;
ProxySQL> SAVE SERVERS USERS TO DISK;
ProxySQL> SAVE MYSQL USERS TO DISK;

We can already try and see that our developer can still read our data. Let’s work on this now.

ProxySQL> INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,error_msg)     
          VALUES (90,1,'devel','^SELECT \*.*FROM.*CUSTOMERS',   
          'Query not allowed due to sensitive information, please contact dba@myapp.com');

Let’s load it in runtime and test:

ProxySQL> LOAD MYSQL QUERY RULES TO RUNTIME;

# mysql -u devel -ppassword -h 127.0.0.1 -P 6033

mysql> select * from myapp.CUSTOMERS;
ERROR 1148 (42000): Query not allowed due to sensitive information, please contact dba@myapp.com

Wooohooo it worked 😉

And the datamasking ? This is our next step, let’s enter the following rule:

ProxySQL> INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,replace_pattern,apply)
          VALUES (1,1,'devel','^[sS][eE][lL][eE][cC][tT] (.*)cc_num([ ,])(.*)', 
                "SELECT \1CONCAT(REPEAT('X',12),RIGHT(cc_num,4)) cc_num\2\3",1);

ProxySQL> LOAD MYSQL QUERY RULES TO RUNTIME;

We can test it again:

mysql> select firstname, cc_num from myapp.CUSTOMERS;
+-----------+------------------+
| firstname | cc_num           |
+-----------+------------------+
| Frederic  | XXXXXXXXXXXX5289 |
| Dim0      | XXXXXXXXXXXX5289 |
| Kenny     | XXXXXXXXXXXX4589 |
+-----------+------------------+

Wooohoooo it worked again 😉

Note: if you find ways to workaround the datamasking, let me know, thank you.

Don’t forget to also save your rules to disk:

ProxySQL> SAVE MYSQL QUERY RULES TO DISK;

As you can see it’s possible to have a very quick datamasking solution for MySQL using ProxySQL.

13 Comments

Francisco Miguel Biete

October 7, 2016 / 11:55 Reply

Would it work with a “select *”?

I think that a easier solution would be create a view “masked_customers” with the cc masked and rewrite select/update/delete/inserts to that view for “devel”.

Regards,
- lefred
  
  October 7, 2016 / 12:39 Reply
  
  Hi Francisco,
  
  As written in the article, select * is not allowed for that table.
  
  View can indeed be an option but it can also cause performance issue in certain circumstances.
Thomas

October 7, 2016 / 17:07 Reply

> if you find ways to workaround the datamasking, let me know, thank you.

Wouldn’t queries like these work :
* select firstname, cc_num, cc_num from myapp.CUSTOMERS;
* select firstname, `cc_num` from myapp.CUSTOMERS;
* select firstname, cc_num
from myapp.CUSTOMERS;
- lefred
  
  October 7, 2016 / 18:58 Reply
  
  Hey Thomas !
  
  Thank you for this feedback and indeed you are right, good catch, let me try to improve the regexp !
  
  Cheers
MySQL Datamasking using ProxySQL – part 2 – lefred's blog: tribulations of a MySQL Evangelist

October 7, 2016 / 21:01 Reply

[…] part 1, we saw how to mask one column for a specific […]
Dipti Joshi

November 2, 2016 / 03:57 Reply

What will happen to these queries
(1) select firstname, /* cc_num */, from myapp.CUSTOMERS;
(2) select a.* from myapp.CUSTOMERS a
(3) /* */ select firstname, cc_num from myapp.CUSTOMERS;
- lefred
  
  November 2, 2016 / 12:44 Reply
  
  Hi Dipti,
  
  Thank you for these examples, I’m currently working on new implementation. That will help me to improve it !
David Bennett

May 18, 2017 / 18:14 Reply

In addition, a user with the right permissions could create a view and access the data indirectly…

create view bleh as select * from customer;
select * from bleh;
ANKUR TRIVEDI

June 21, 2017 / 14:46 Reply

i want to implementation of data masking using proxysql in linux o/s.i have created db and tables from my root and create a db_user also and granted all privileges to this user,first one i login into admin and create table mysql_users and mysql_servers also and insert the values from proxysql> prompt.then i load and save it.then after i have inserted mysql query rules on my created table (that i was created from root ).then load mysql query rules to runtime.after that i log into my db_user and created and insert the mysql_query_rules,but when i load mysql query rules for runtime….ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘MYSQL QUERY RULES TO RUNTIME’ at line 1 this error has come.
so please suggest me what i do ??
Oliver

October 24, 2017 / 11:52 Reply

Hi Lefred,

Really good article and very well explained and formatted well done!

I’ve implemented on one environment I have and found that when you break the lines like

SELECT
firstname,
cc_num,
cc_num
FROM
myapp.CUSTOMERS;

This will show the data without masking it. It may be something simple like adding line breaks to the regex to sort this out.

Thank you.
- lefred
  
  October 24, 2017 / 14:51 Reply
  
  Thank you Olivier for this use case.
Stefan

March 20, 2018 / 17:56 Reply

Hello,
thanks for the nice writeup. That was really helpful!

Today I played around with the regular expression for blocking all SELECT * type queries. This seems to work for quite a few standard cases:

INSERT INTO mysql_query_rules (rule_id,active,username,match_pattern,error_msg)
VALUES (1,1,’proxy_sql’,’^(?i)(SELECT)(\s)*((?i)\w*(.)?\*)*(\s)*(?i)(FROM)(\s)*(?i)(\w*).*(?i)(\w*)’,
‘Query not allowed due to sensitive information, please contact security@acme.org‘)

The (?i) part is responsible for ignoring case sensitivity.
Pawan

January 13, 2022 / 12:59 Reply

There seems a typo in Save syntax should be

ProxySQL> SAVE MYSQL SERVERS TO DISK;
ProxySQL> SAVE MYSQL USERS TO DISK;

Instead of

ProxySQL> SAVE SERVERS USERS TO DISK;

Cookie	Duration	Description
cookielawinfo-checbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

MySQL datamasking using ProxySQL

follow me

Subscribe to Blog via Email

13 Comments

Leave a ReplyCancel Reply