This pattern describes how to move data from an on-premises Oracle database to Amazon OpenSearch Service using Logstash. It covers architectural considerations, required skill sets, and recommendations. The data can come from a single table or from multiple tables that need to support full-text search.
OpenSearch Service can be configured within a virtual private cloud (VPC), or it can be made publicly accessible with IP-based restrictions. This pattern describes a scenario where OpenSearch Service is configured within a VPC. Logstash collects the data from the Oracle database, converts it to JSON, and then feeds it into OpenSearch Service.
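To make the "collect, convert to JSON, feed" flow concrete, the following is a minimal Python sketch of the data shapes involved. It is not how Logstash is implemented, and it does not connect to a database or a domain. It only shows how relational rows could become JSON documents and then a newline-delimited body of the kind accepted by the `_bulk` API. The column names, the `products` index, the `_doc` type, and both helper functions are hypothetical and chosen for illustration.

```python
import json
from datetime import datetime


def row_to_document(columns, row):
    """Pair column names with values; keys are lowercased for consistent JSON fields."""
    return {name.lower(): value for name, value in zip(columns, row)}


def to_bulk_body(docs, index, id_field, doc_type="_doc"):
    """Build a newline-delimited bulk body: one action line, then one source line per document."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc[id_field])}}
        lines.append(json.dumps(action))
        # default=str serializes values such as datetimes that json cannot encode natively
        lines.append(json.dumps(doc, default=str))
    # A bulk body must end with a newline
    return "\n".join(lines) + "\n"


# Hypothetical rows, shaped like a database cursor result
columns = ["PRODUCT_ID", "NAME", "UPDATED_AT"]
rows = [
    (101, "Widget", datetime(2020, 1, 15, 9, 30)),
    (102, "Gadget", datetime(2020, 1, 16, 14, 0)),
]

docs = [row_to_document(columns, r) for r in rows]
bulk_body = to_bulk_body(docs, index="products", id_field="product_id")
print(bulk_body)
```

Using the primary key as the document ID means that re-sending the same row overwrites the existing document instead of creating a duplicate.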
Prerequisites
An active AWS account
Java 8 (required by Logstash 6.4.3)
Connectivity between the on-premises database servers and Amazon Elastic Compute Cloud (Amazon EC2) instances in a VPC, established using AWS Virtual Private Network (AWS VPN)
A query that retrieves from the database the data to be pushed to OpenSearch Service
Oracle Java Database Connectivity (JDBC) drivers
Limitations
Logstash cannot identify records that are hard-deleted from the database
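The reason for this limitation can be illustrated with a small Python simulation. It is a sketch under simplifying assumptions, not Logstash code: the source table is a list of dictionaries, the index is a dictionary keyed by document ID, and `sync` and `stale_ids` are hypothetical helpers. A query-based sync only sees rows that the query returns, so a row that no longer exists never produces a delete. One possible way to find such leftovers, shown here as an assumption and not as part of this pattern, is to compare the IDs in the source with the IDs in the index.

```python
def sync(source_rows, index):
    """Upsert every row the query returns; rows absent from the result are never seen."""
    for row in source_rows:
        index[row["id"]] = row


def stale_ids(source_rows, index):
    """IDs that are still indexed but no longer present in the source."""
    return set(index) - {row["id"] for row in source_rows}


source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
index = {}
sync(source, index)

# Simulate a hard delete of row 2 in the database, then sync again
source = [row for row in source if row["id"] != 2]
sync(source, index)

leftover = stale_ids(source, index)
print(sorted(index))  # document 2 is still in the index
print(leftover)       # IDs that a reconciliation step would need to delete
```

If the application marks rows as deleted with a flag column instead of removing them, the change is visible to the query like any other update.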
Product versions
Oracle Database 12c
OpenSearch Service 6.3
Logstash 6.4.3
Source technology stack
On-premises Oracle database
On-premises AWS VPN
Target technology stack
VPC
EC2 instance
OpenSearch Service
Logstash
NAT gateway (for operating system updates on EC2 instances and for installing Java 8, Logstash, and plugins)
Data migration architecture