Load data from S3 to Oracle on Amazon RDS
2017/Aug/14If you have text files stored in S3 buckets and want to load them to say Amazon Aurora or Redshift, you can just issue commands such as LOAD DATA FROM S3 or COPY TABLE FROM S3. For an Oracle instance on Amazon RDS though, the process isn't that straight forward. AWS documentation provides some high level of details here. In this post, I am going to show the approach to use SQL*Loader to load the data from S3 to Oracle on Amazon RDS.
Here are the steps:
- Download and install Oracle Instant Client tools on a standalone EC2 instance.
- Copy a text file from S3 to local directory using AWS CLI.
- Create a Control file, use SQL*Loader to load the text file to Oracle on Amazon RDS.
First step is to download and install Oracle Instant Client tools on to the EC2 instance. Oracle doesn't provide a direct URL for download, such as you can use with wget or curl, you need to accept the license agreement and then get the download URL. This is a nice trick to achieve this, I followed the Method 1 described to download the Instant client tools on to the EC2 instance. The EC2 instance I provisioned is an Amazon Linux AMI, so I am downloading and installing two packages (Instant Client Package - Basic and Tools) from here.
Follow the below steps:
The next step is to download the file from S3 bucket to a local folder using AWS CLI. Create a control file to load the data from this local file on to the table on Oracle RDS.
If you don't run the below command, you will get a host name not set error. More details here.
That's all. SQL*Loader is successfully installed, you can use the Oracle EZConnect method to run the command.
.