How to import data from PostgreSQL to MySQL Database Service

MySQL Database Service (aka MDS) is very popular, and many users want to benefit from the managed MySQL service and from MySQL HeatWave.

We see many users migrating their data to MDS from MySQL on-premises or from another cloud vendor… we also see people migrating from other datastores like PostgreSQL.

In this article we will see how to migrate data from PostgreSQL to MDS using CSV files and the MySQL Shell import table utility.

When we want to copy data from one database to another, we need to handle the data structure (the DDL) and the actual data separately.

For this example, we will use the PostgreSQL sample database called DVD Rental (dvdrental).

Tables Definition

The first task consists of getting the table definitions out of PostgreSQL.

The most popular option is to use pg_dump, so let’s have a look at its output for the first table (actor):

-bash-4.2$ pg_dump -st actor dvdrental
--
-- PostgreSQL database dump
--

-- Dumped from database version 12.8
-- Dumped by pg_dump version 12.8

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

SET default_tablespace = '';

SET default_table_access_method = heap;

--
-- Name: actor; Type: TABLE; Schema: public; Owner: postgres
--

CREATE TABLE public.actor (
    actor_id integer DEFAULT nextval('public.actor_actor_id_seq'::regclass) NOT NULL,
    first_name character varying(45) NOT NULL,
    last_name character varying(45) NOT NULL,
    last_update timestamp without time zone DEFAULT now() NOT NULL
);


ALTER TABLE public.actor OWNER TO postgres;

--
-- Name: actor actor_pkey; Type: CONSTRAINT; Schema: public; Owner: postgres
--

ALTER TABLE ONLY public.actor
    ADD CONSTRAINT actor_pkey PRIMARY KEY (actor_id);


--
-- Name: idx_actor_last_name; Type: INDEX; Schema: public; Owner: postgres
--

CREATE INDEX idx_actor_last_name ON public.actor USING btree (last_name);


--
-- Name: actor last_updated; Type: TRIGGER; Schema: public; Owner: postgres
--

CREATE TRIGGER last_updated BEFORE UPDATE ON public.actor FOR EACH ROW EXECUTE FUNCTION public.last_updated();


--
-- PostgreSQL database dump complete
--

From that output, only the CREATE TABLE statement and its related constraints are relevant for the import (the indexes can also be created later).

If you are familiar with MySQL, you will immediately notice that the SQL syntax returned won’t work in MySQL.

Let’s have a look at the table’s definition:

CREATE TABLE public.actor (
    actor_id integer DEFAULT nextval('public.actor_actor_id_seq'::regclass) NOT NULL,
    first_name character varying(45) NOT NULL,
    last_name character varying(45) NOT NULL,
    last_update timestamp without time zone DEFAULT now() NOT NULL
);

We can see that:

  • there is a public. schema prefix
  • the primary key is not defined here (it’s defined later as a constraint)
  • a sequence, public.actor_actor_id_seq, is used to generate the actor_id values
  • character varying is actually VARCHAR
  • the timestamp column uses a qualifier unknown to MySQL: without time zone

That statement must be rewritten to work in MySQL; this is a manual step:

CREATE TABLE actor (
    actor_id INT AUTO_INCREMENT NOT NULL PRIMARY KEY,
    first_name VARCHAR(45) NOT NULL,
    last_name VARCHAR(45) NOT NULL,
    last_update TIMESTAMP DEFAULT NOW() NOT NULL
);

You can also use the PostgreSQL Type Mapping from the manual.
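For the types used in the dvdrental schema, the mapping boils down to the following (choosing TIMESTAMP or DATETIME depends on whether you want time zone conversion):

  • integer → INT
  • character varying(n) → VARCHAR(n)
  • timestamp without time zone → TIMESTAMP (or DATETIME)
  • sequence defaults (nextval(...)) → AUTO_INCREMENT
  • USER-DEFINED enumerated types → ENUM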

We can enter that rewritten statement in our MySQL Database Service instance and add the secondary indexes.
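For example, the index idx_actor_last_name we saw in the pg_dump output can be recreated with plain MySQL syntax:

CREATE INDEX idx_actor_last_name ON actor (last_name);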

We need to perform the same exercise for all tables present in the dvdrental schema:

dvdrental-# \dt
             List of relations
 Schema |     Name      | Type  |  Owner   
--------+---------------+-------+----------
 public | actor         | table | postgres
 public | address       | table | postgres
 public | category      | table | postgres
 public | city          | table | postgres
 public | country       | table | postgres
 public | customer      | table | postgres
 public | film          | table | postgres
 public | film_actor    | table | postgres
 public | film_category | table | postgres
 public | inventory     | table | postgres
 public | language      | table | postgres
 public | payment       | table | postgres
 public | rental        | table | postgres
 public | staff         | table | postgres
 public | store         | table | postgres
(15 rows)
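To avoid running pg_dump once per table, you can also dump all the definitions of the database in a single file and rewrite the statements from there (the output path below is just an example):

-bash-4.2$ pg_dump -s dvdrental > /vagrant/output/dvdrental_schema.sql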

This is the most complex part, especially when USER-DEFINED data types are used, like in the table film:

rating public.mpaa_rating DEFAULT 'G'::public.mpaa_rating,

What is that mpaa_rating?

You can retrieve its definition like this:

dvdrental=# select enum_range(null::mpaa_rating);
      enum_range      
----------------------
 {G,PG,PG-13,R,NC-17}
(1 row)

We can see that this is an ENUM, and we can then rewrite that column definition like this:

rating ENUM('G','PG','PG-13','R','NC-17') DEFAULT 'G',

Time for the Data

Now it’s time to export the data from PostgreSQL to CSV:

dvdrental=# \copy actor to '/vagrant/output/actor.csv' delimiter ',' CSV HEADER;
COPY 200
dvdrental=# \copy address to '/vagrant/output/address.csv' delimiter ',' CSV HEADER;
COPY 603
dvdrental=# \copy category to '/vagrant/output/category.csv' delimiter ',' CSV HEADER;
COPY 16
dvdrental=# \copy customer to '/vagrant/output/customer.csv' delimiter ',' CSV HEADER;
COPY 599
dvdrental=# \copy film to '/vagrant/output/film.csv' delimiter ',' CSV HEADER;
COPY 1000

When all the tables we want to import are exported, we can use MySQL Shell with the import table utility:
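Here is a minimal sketch in MySQL Shell’s JavaScript mode, assuming we are connected to the MDS instance (the dialect and options may need adjusting to match your CSV files):

util.importTable("/vagrant/output/actor.csv", {
    schema: "dvdrental",   // target schema
    table: "actor",        // target table
    dialect: "csv-unix",   // comma-separated fields, \n line endings
    skipRows: 1            // skip the header line written by \copy ... CSV HEADER
})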

importTable() can work in parallel! (not here, as the data is too small)

And we repeat that operation for all tables we want to import.

importTable() can also import multiple files into one single table!
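A sketch of that, assuming a table was exported to several files (the payment file names below are hypothetical; recent versions of MySQL Shell accept a list of files and even wildcards):

util.importTable([
    "/vagrant/output/payment_1.csv",   // hypothetical first chunk
    "/vagrant/output/payment_2.csv"    // hypothetical second chunk
], {schema: "dvdrental", table: "payment", dialect: "csv-unix", skipRows: 1})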

We can verify that the data has been imported:
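For example, in SQL mode, the row count should match what \copy reported on the PostgreSQL side (COPY 200 for actor):

SELECT COUNT(*) FROM actor;
+----------+
| COUNT(*) |
+----------+
|      200 |
+----------+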

Conclusion

Importing CSV files into an MDS instance is very easy and efficient using MySQL Shell. It can be used to import data from PostgreSQL but also from many other RDBMSes.

Of course, depending on the structure of the data, some initial work might be required, especially when some data types are different.

