Oracle to PostgreSQL: Lessons from a 4TB Migration

Kloudpath Team

The client was paying $1.2M per year in Oracle licensing for a 4TB database running on a two-node RAC cluster. The database powered a claims processing system with 340 tables, 1,200 stored procedures, 180 database triggers, and 45 materialized views. The goal was to migrate to Aurora PostgreSQL without disrupting the 24/7 claims processing operation. This is the story of how we did it in six months, and the specific technical challenges we encountered along the way.

Assessment with ora2pg

We started with ora2pg, the open-source Oracle-to-PostgreSQL migration tool. ora2pg analyzes an Oracle schema and produces a migration complexity report that scores each object type on a difficulty scale. For our database, the report estimated a total migration effort of approximately 15 person-months, which aligned with our own assessment.

The report highlighted several high-complexity areas: 87 stored procedures using Oracle-specific features (CONNECT BY hierarchical queries, BULK COLLECT, FORALL), 23 procedures using Oracle's DBMS_ packages (DBMS_OUTPUT, DBMS_LOB, DBMS_SCHEDULER), and 12 procedures that used dynamic SQL with EXECUTE IMMEDIATE. We used the ora2pg report to prioritize the migration: simple objects first to build momentum, complex objects last with dedicated engineering time.

# Run ora2pg assessment
ora2pg -t SHOW_REPORT -c ora2pg.conf

# Migration complexity summary:
#   Tables:           340  (estimated: 2 person-days)
#   Views:            89   (estimated: 5 person-days)
#   Materialized Views: 45 (estimated: 8 person-days)
#   Stored Procedures: 1200 (estimated: 120 person-days)
#   Triggers:         180  (estimated: 15 person-days)
#   Sequences:        210  (estimated: 1 person-day)

Schema Conversion: The Tricky Parts

ora2pg handles straightforward schema conversion automatically: Oracle's NUMBER becomes PostgreSQL's numeric, VARCHAR2 becomes varchar, DATE (which in Oracle includes a time component) becomes timestamp. But several patterns required manual intervention.

Sequences and triggers. Oracle uses sequences with BEFORE INSERT triggers to generate auto-incrementing primary keys. PostgreSQL has SERIAL and IDENTITY columns that handle this natively. We converted all sequence-trigger pairs to GENERATED ALWAYS AS IDENTITY columns, which is the modern PostgreSQL approach and eliminates the need for the trigger entirely.
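
As a sketch (table and column names are illustrative), the converted pattern looks like this:

```sql
-- Oracle pattern being replaced: CREATE SEQUENCE claims_seq plus a
-- BEFORE INSERT trigger assigning claims_seq.NEXTVAL to the key.
-- PostgreSQL replacement: an identity column, with no separate
-- sequence object or trigger to maintain.
CREATE TABLE claims (
  claim_id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  status     text        NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

-- During data migration, OVERRIDING SYSTEM VALUE lets the loader
-- insert the original Oracle-generated keys explicitly
INSERT INTO claims (claim_id, status)
  OVERRIDING SYSTEM VALUE VALUES (1001, 'PENDING');
```

After the bulk load, the identity sequence must be advanced past the highest migrated key (for example with setval on pg_get_serial_sequence), or the first application insert will collide with an existing key.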

Global temporary tables. Oracle's global temporary tables have a permanent definition but hold data only for the duration of a session or transaction. PostgreSQL's temporary tables are session-local in both definition and data: there is no shared, persistent definition, so each session must create the table itself, and the behavior differs accordingly. We replaced most global temporary tables with regular tables partitioned by session ID, with a cleanup job that purges expired sessions. For the few cases where transaction-scoped temporary data was needed, we used PostgreSQL's CREATE TEMPORARY TABLE within the stored procedure.
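
For the transaction-scoped cases, a minimal sketch (names illustrative):

```sql
-- Created inside the procedure that needs it; both the rows and the
-- table definition disappear when the transaction ends
CREATE TEMPORARY TABLE scratch_claims (
  claim_id bigint,
  amount   numeric(12,2)
) ON COMMIT DROP;
```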

Synonyms. Oracle synonyms provide an alias for a schema object, often used to allow applications to reference objects without specifying the schema. PostgreSQL does not have synonyms. We replaced them with search_path configuration at the role level, so each application role has a search path that includes the schemas it needs.
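
The replacement is a one-time role setting rather than per-object aliases; schema and role names here are illustrative:

```sql
-- Instead of synonyms, resolve unqualified names via the role's search path
ALTER ROLE claims_app SET search_path = claims, reference_data, public;
```

New sessions for that role then resolve an unqualified table name against the listed schemas in order, so application SQL needs no schema prefixes.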

PL/SQL to PL/pgSQL

The 1,200 stored procedures were the bulk of the migration effort. PL/SQL and PL/pgSQL are superficially similar but differ in important ways. Oracle's CONNECT BY hierarchical queries, used in 34 procedures, were rewritten using PostgreSQL's recursive CTEs. The conversion is mechanical but verbose.

-- Oracle: CONNECT BY hierarchical query
SELECT employee_id, manager_id, LEVEL
FROM employees
START WITH manager_id IS NULL
CONNECT BY PRIOR employee_id = manager_id;

-- PostgreSQL: Recursive CTE equivalent
WITH RECURSIVE org_tree AS (
  SELECT employee_id, manager_id, 1 AS level
  FROM employees
  WHERE manager_id IS NULL
  UNION ALL
  SELECT e.employee_id, e.manager_id, ot.level + 1
  FROM employees e
  JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT employee_id, manager_id, level FROM org_tree;

Oracle's BULK COLLECT and FORALL for batch processing have no direct equivalent in PL/pgSQL. PostgreSQL processes set-based operations more efficiently than Oracle in many cases, so we rewrote batch procedures to use set-based SQL statements instead of row-by-row processing. In several cases, a 50-line PL/SQL procedure with BULK COLLECT was replaced by a single INSERT ... SELECT statement that ran faster.
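
A sketch of the rewrite pattern (table and column names are illustrative): the cursor open, BULK COLLECT loop, and FORALL insert collapse into one statement that the planner can optimize as a whole.

```sql
-- Replaces: open cursor, BULK COLLECT into a collection in batches,
-- FORALL insert, close cursor (~50 lines of PL/SQL)
INSERT INTO claims_archive (claim_id, status, closed_at)
SELECT claim_id, status, closed_at
FROM claims
WHERE status = 'CLOSED'
  AND closed_at < now() - INTERVAL '90 days';
```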

Exception handling also differs. Oracle's WHEN OTHERS THEN catch-all carries over directly, and SQLERRM exposes the message text in both languages, but Oracle's numeric SQLCODE has no PL/pgSQL counterpart: PostgreSQL reports errors as five-character SQLSTATE codes. We built a mapping table of Oracle error codes to PostgreSQL SQLSTATE codes and used it to validate that error handling logic was preserved.
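
A minimal PL/pgSQL handler showing the variables involved (the notices are illustrative):

```sql
DO $$
BEGIN
  PERFORM 1 / 0;  -- force an error
EXCEPTION
  WHEN division_by_zero THEN          -- named condition, like Oracle's
    RAISE NOTICE 'handled: %', SQLERRM;
  WHEN OTHERS THEN                    -- catch-all, as in Oracle
    RAISE NOTICE 'error %: %', SQLSTATE, SQLERRM;  -- SQLSTATE, not SQLCODE
END;
$$;
```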

Data Migration with AWS DMS

For the actual data migration, we used AWS Database Migration Service (DMS). DMS supports full-load migration from Oracle to PostgreSQL and can also replicate ongoing changes (CDC) during the transition period. We ran DMS in full-load-plus-CDC mode: it performed an initial bulk copy of all 4TB, then continuously replicated changes from Oracle to PostgreSQL while we validated the target database.

DMS is not without quirks. Large object columns (CLOBs and BLOBs) required special handling because DMS has a LOB size threshold. Objects smaller than the threshold are migrated inline; larger ones require a separate lookup that significantly slows the migration. We set the LOB threshold to 32KB, which covered 98% of our LOB data, and handled the remaining 2% with a custom migration script that ran after the DMS full load.
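
In DMS task settings, the inline-versus-lookup behavior described above corresponds to full LOB mode with an inline threshold. A fragment along these lines expresses it (sizes are in KB; values match our 32KB threshold, and the chunk size is illustrative):

```json
{
  "TargetMetadata": {
    "FullLobMode": true,
    "InlineLobMaxSize": 32,
    "LobChunkSize": 64
  }
}
```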

The CDC replication ran for three weeks while we validated the target database. During this period, both databases received writes (Oracle from production traffic, PostgreSQL from DMS replication), and we compared row counts, checksums, and sample data daily. When we were satisfied that the target was accurate, we scheduled a 30-minute maintenance window for the cutover.

Performance Tuning on PostgreSQL

After migrating the data, several queries that performed well on Oracle ran poorly on PostgreSQL. This is expected: the query optimizers are fundamentally different, and Oracle-specific hints do not translate. We spent two weeks on performance tuning, focusing on the top 20 queries by execution frequency.

The most common issue was missing indexes. Oracle's optimizer is more aggressive about using indexes for range scans, while PostgreSQL's optimizer sometimes prefers sequential scans on smaller tables. We added partial indexes for several queries that filtered on status columns (for example, CREATE INDEX idx_claims_pending ON claims(created_at) WHERE status = 'PENDING'), which dramatically improved queries that filter on a small subset of rows.

We also tuned PostgreSQL's configuration parameters. The defaults are conservative, and for a 4TB database on an r6g.4xlarge instance (128GB RAM), we adjusted shared_buffers to 32GB, effective_cache_size to 96GB, work_mem to 256MB, and maintenance_work_mem to 2GB. We enabled huge_pages and set random_page_cost to 1.1 (appropriate for SSD storage) to encourage the optimizer to use index scans.
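
Collected as a parameter fragment, the values described above:

```
shared_buffers = 32GB
effective_cache_size = 96GB
work_mem = 256MB
maintenance_work_mem = 2GB
huge_pages = on
random_page_cost = 1.1
```

On Aurora these are applied through a DB cluster parameter group rather than an editable postgresql.conf file.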

Testing Methodology

We developed a three-tier testing strategy. The first tier was automated schema comparison: a script that compared every table, column, index, constraint, and stored procedure between Oracle and PostgreSQL to ensure completeness. The second tier was data validation: row counts, column checksums (using MD5 aggregation), and boundary value checks for every table. The third tier was functional testing: we replayed two weeks of production traffic (captured as SQL statements) against the PostgreSQL database and compared the results with Oracle's output.
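
The column-checksum tier can be sketched as a per-table query on the PostgreSQL side (table and key names are illustrative; the Oracle side needs an equivalent query built on its own hash functions):

```sql
-- Order-stable checksum: hash each row's text form, concatenate in
-- primary-key order, hash the result
SELECT count(*)                                            AS row_count,
       md5(string_agg(md5(t::text), '' ORDER BY claim_id)) AS table_checksum
FROM claims t;
```

A mismatch only says that a table differs somewhere; narrowing down to the divergent rows by key range is a follow-up step.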

The functional replay caught a subtle bug in date arithmetic. Oracle's SYSDATE + 30 adds 30 days, but PostgreSQL has no timestamp-plus-integer operator: NOW() + 30 fails outright, and the seemingly equivalent CURRENT_DATE + 30 silently truncates the time-of-day component. The correct PostgreSQL expression is NOW() + INTERVAL '30 days'. This affected 14 stored procedures and would not have been caught by schema or data validation alone.
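
The equivalents side by side:

```sql
-- Oracle: SYSDATE + 30              → 30 days later, DATE (with time)
SELECT now() + INTERVAL '30 days';   -- correct PostgreSQL rewrite
SELECT CURRENT_DATE + 30;            -- also 30 days, but time-of-day is lost
-- SELECT now() + 30;                -- error: no timestamptz + integer operator
```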

Results

The migration was completed in six months. The cutover window was 28 minutes, during which we stopped writes to Oracle, verified the final CDC sync, updated the application connection strings, and started accepting traffic on PostgreSQL. No data was lost, and the application experienced no errors during the transition.

The annual database cost dropped from $1.2M (Oracle RAC licensing plus support) to approximately $180K (Aurora PostgreSQL compute and storage), a savings of 85%. Query performance on the top 20 queries improved by an average of 30%, partly due to PostgreSQL's more efficient handling of the rewritten set-based operations and partly due to Aurora's storage layer. The operational burden also decreased: Aurora handles replication, backups, and patching automatically, whereas the Oracle RAC cluster required dedicated DBA time for maintenance.
