How to import data from PostgreSQL to MySQL Database Service

MySQL Database Service (aka MDS) is very popular and many users want to benefit from the managed MySQL service and from MySQL HeatWave.

We see many users migrating their data from MySQL on-premise or from another cloud vendor to MDS… we also see people migrating from other data-stores like PostgreSQL.

In this article we will see how to migrate data from PostgreSQL to MDS using CSV files and the MySQL Shell import table utility.

When we want to copy data from one database to another one, we need to deal with the data structure and the actual data separately.

For this example, we will use the PostgreSQL sample database called DVDrental.

Table Definitions

The first task consists of getting the table definitions out of PostgreSQL.

The most popular option is to use pg_dump, so let’s have a look at its output for the first table (actor):

-bash-4.2$ pg_dump -st actor dvdrental
--
-- PostgreSQL database dump
--

-- Dumped from database version 12.8
-- Dumped by pg_dump version 12.8

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

SET default_tablespace = '';

SET default_table_access_method = heap;

--
-- Name: actor; Type: TABLE; Schema: public; Owner: postgres
--

CREATE TABLE public.actor (
    actor_id integer DEFAULT nextval('public.actor_actor_id_seq'::regclass) NOT NULL,
    first_name character varying(45) NOT NULL,
    last_name character varying(45) NOT NULL,
    last_update timestamp without time zone DEFAULT now() NOT NULL
);


ALTER TABLE public.actor OWNER TO postgres;

--
-- Name: actor actor_pkey; Type: CONSTRAINT; Schema: public; Owner: postgres
--

ALTER TABLE ONLY public.actor
    ADD CONSTRAINT actor_pkey PRIMARY KEY (actor_id);


--
-- Name: idx_actor_last_name; Type: INDEX; Schema: public; Owner: postgres
--

CREATE INDEX idx_actor_last_name ON public.actor USING btree (last_name);


--
-- Name: actor last_updated; Type: TRIGGER; Schema: public; Owner: postgres
--

CREATE TRIGGER last_updated BEFORE UPDATE ON public.actor FOR EACH ROW EXECUTE FUNCTION public.last_updated();


--
-- PostgreSQL database dump complete
--

From that output, only the CREATE TABLE statement and the primary key constraint are relevant to us to import the data (the indexes can also be created later).

If you are familiar with MySQL, you can directly identify that the SQL syntax returned won’t work in MySQL.

Let’s have a look at the table’s definition:

CREATE TABLE public.actor (
    actor_id integer DEFAULT nextval('public.actor_actor_id_seq'::regclass) NOT NULL,
    first_name character varying(45) NOT NULL,
    last_name character varying(45) NOT NULL,
    last_update timestamp without time zone DEFAULT now() NOT NULL
);

We can see that:

  • there is a public. prefix
  • the primary key is not defined here (but it’s defined later as a constraint)
  • a sequence is used to generate the default value: public.actor_actor_id_seq (via nextval())
  • character varying is actually VARCHAR
  • the timestamp field uses some extra information not known in MySQL: without time zone

That statement must be rewritten to work in MySQL; this is a manual step:

CREATE TABLE actor (
    actor_id integer auto_increment NOT NULL primary key,
    first_name varchar(45) NOT NULL,
    last_name varchar(45) NOT NULL,
    last_update timestamp DEFAULT now() NOT NULL
);

You can also use the PostgreSQL Type Mapping from the manual.

We can enter that new rewritten statement in our MySQL Database Service instance and add the secondary indexes:
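For example, something like this can be run in MDS (the schema name and the index name are taken from the PostgreSQL dump above; adapt them as needed):

CREATE DATABASE IF NOT EXISTS dvdrental;
USE dvdrental;

-- run the rewritten CREATE TABLE actor statement shown above, then:
ALTER TABLE actor ADD INDEX idx_actor_last_name (last_name);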

We need to perform the same exercise for all the tables present in the dvdrental schema:

dvdrental-# \dt
             List of relations
 Schema |     Name      | Type  |  Owner   
--------+---------------+-------+----------
 public | actor         | table | postgres
 public | address       | table | postgres
 public | category      | table | postgres
 public | city          | table | postgres
 public | country       | table | postgres
 public | customer      | table | postgres
 public | film          | table | postgres
 public | film_actor    | table | postgres
 public | film_category | table | postgres
 public | inventory     | table | postgres
 public | language      | table | postgres
 public | payment       | table | postgres
 public | rental        | table | postgres
 public | staff         | table | postgres
 public | store         | table | postgres
(15 rows)
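Instead of extracting the definition table by table, the whole schema (without data) can also be dumped in one go and then rewritten, for example:

-bash-4.2$ pg_dump -s dvdrental > dvdrental_schema.sql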

This is the most complex part, especially when USER-DEFINED data types are used, like in the table film:

rating public.mpaa_rating DEFAULT 'G'::public.mpaa_rating,

What is that mpaa_rating?

You can retrieve its definition like this:

dvdrental=# select enum_range(null::mpaa_rating);
      enum_range      
----------------------
 {G,PG,PG-13,R,NC-17}
(1 row)

We can see that this is an ENUM and we can then rewrite that column like this:

rating ENUM('G','PG','PG-13','R','NC-17') DEFAULT 'G',
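If you want to find all the user-defined ENUM types of the database in one go, a standard catalog query like the following can help (this is just a helper, not part of the sample database):

SELECT t.typname, array_agg(e.enumlabel ORDER BY e.enumsortorder) AS labels
  FROM pg_type t
  JOIN pg_enum e ON e.enumtypid = t.oid
 GROUP BY t.typname;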

Time for the Data

Now it’s time to export the data from PostgreSQL to CSV:

dvdrental=# \copy actor to '/vagrant/output/actor.csv' delimiter ',' CSV HEADER;
COPY 200
dvdrental=# \copy address to '/vagrant/output/address.csv' delimiter ',' CSV HEADER;
COPY 603
dvdrental=# \copy category to '/vagrant/output/category.csv' delimiter ',' CSV HEADER;
COPY 16
dvdrental=# \copy customer to '/vagrant/output/customer.csv' delimiter ',' CSV HEADER;
COPY 599
dvdrental=# \copy film to '/vagrant/output/film.csv' delimiter ',' CSV HEADER;
COPY 1000

When all the tables we want to import have been exported, we can use MySQL Shell with the import table utility:
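Here is a sketch of such a call in MySQL Shell (JavaScript mode), while connected to the MDS instance; the path and options match the files exported above but may need to be adapted:

util.importTable("/vagrant/output/actor.csv", {
    schema: "dvdrental",     // target schema in MDS
    table: "actor",          // target table
    dialect: "csv-unix",     // comma separated, '\n' terminated, as produced by \copy ... CSV
    skipRows: 1              // skip the CSV HEADER line
})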

importTable() can work in parallel! (not here as the data is too small)

And we repeat that operation for all tables we want to import.

importTable() can also import multiple files into one single table!
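For example (the file names below are purely hypothetical; recent MySQL Shell versions accept a list of files or wildcards as the first argument):

util.importTable(["/vagrant/output/payment_1.csv",
                  "/vagrant/output/payment_2.csv"], {
    schema: "dvdrental",
    table: "payment",
    dialect: "csv-unix",
    skipRows: 1
})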

We can verify that the data has been imported:
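For example, a quick row count in MDS should match what PostgreSQL reported for each COPY command:

SELECT COUNT(*) FROM dvdrental.actor;
-- should return 200, the number reported by COPY above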

Conclusion

Importing CSV files into an MDS instance is very easy and efficient using MySQL Shell. It can be used to import data from PostgreSQL but also from other RDBMS.

Of course, depending on the structure of the data, some initial work might be required, especially when some data types are different.

