🧭 Migrating from Digital Ocean to AWS: A Simple Guide for Research Projects
Client Sector: Marine Conservation & Research
Client: University of Guam (UOG) Marine Lab and Micronesia Coral Reef Monitoring Network
Service Type: Full-Stack System Modernization
Technologies Used: React, Django, R Shiny, Angular, Python, Pandas, NumPy, SciPy
🌐 Project Overview
To support coral reef conservation across Micronesia, we modernized a legacy data platform used by marine researchers and regional policy makers. The system aggregates data from field surveys and oceanographic sensors into a unified web interface. Our goal was to create a cloud-native, modular, and maintainable architecture that scales with growing data needs and offers operational resilience.
🌊 Why We Migrated from Digital Ocean
Digital Ocean served well during early development but posed growing limitations:
- Lack of fully managed services for databases and load balancing
- Limited options for fine-grained IAM and networking controls
- Difficulty in orchestrating multi-service deployments
- Manual setup hampering reproducibility
AWS addressed these pain points and gave us access to scalable compute, tighter security boundaries, and a rich ecosystem of managed services. Paired with Terraform and GitHub Actions, the migration was automated, version-controlled, and safe for production workflows.
📐 Why Terraform Was Our Tool of Choice
Terraform helped us manage infrastructure with:
- Consistency across dev/prod environments
- Transparency through version-controlled .tf files
- Safety via terraform plan
- Modularity, enabling code reuse and isolated testing
The setup allowed our infrastructure to evolve alongside the app codebase, while minimizing risk.
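As a concrete illustration of that layout, each environment has a thin root configuration that calls the same shared modules with its own values. The directory structure and module path below are illustrative, not the project's exact tree:

# infra/dev/main.tf (hypothetical layout; infra/prd mirrors it with its own values)
locals {
  environment = "dev"
}

module "ecr" {
  source = "../modules/ecr" # shared, version-controlled module

  repository_name = "mrm-api-${local.environment}"
  tags            = { Project = "mrm", Environment = local.environment }
}

Running terraform plan inside each directory previews only that environment's changes, which is what keeps experimental work in dev from touching prod.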
🧱 Provisioning Infrastructure with Terraform
🐳 Provisioning ECR for Image Storage
We used this module to create a private, secure, and policy-driven ECR repository:
resource "aws_ecr_repository" "this" {
name = var.repository_name
image_tag_mutability = var.repository_image_tag_mutability
encryption_configuration {
encryption_type = var.repository_encryption_type
kms_key = var.repository_kms_key
}
image_scanning_configuration {
scan_on_push = var.repository_image_scan_on_push
}
force_delete = var.repository_force_delete
tags = var.tags
}
🔎 Highlights:
- Conditional creation for flexible environment-specific provisioning (sketched just below this list).
- KMS encryption and on-push image scanning provide additional security.
- All values are driven by variables (var.*) to keep the logic reusable.
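The conditional creation mentioned in the first highlight is not visible in the trimmed snippet above; it follows the usual count-on-a-flag idiom. A rough variant of the resource, where the var.create flag is an assumed name:

variable "create" {
  description = "Whether this environment should create the repository (assumed flag name)"
  type        = bool
  default     = true
}

resource "aws_ecr_repository" "this" {
  count = var.create ? 1 : 0 # skip creation entirely when the flag is false

  name                 = var.repository_name
  image_tag_mutability = var.repository_image_tag_mutability
}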
🚀 ECS Cluster and Fargate Services
We deployed our backend as a containerized service on AWS Fargate, using the official ECS Terraform modules.
ECS Cluster Definition
module "ecs_cluster" {
source = "terraform-aws-modules/ecs/aws//modules/cluster"
cluster_name = "mrm-ecs-cluster-${local.environment}"
create_cloudwatch_log_group = false
}
This sets up the base cluster, scoped by environment (e.g., mrm-ecs-cluster-prd).
ECS Service with Networking and Load Balancing
module "ecs_service" {
source = "terraform-aws-modules/ecs/aws//modules/service"
name = "mrm-ecs-service-${local.environment}"
cluster_arn = module.ecs_cluster.arn
cpu = 1024
memory = 2048
container_definitions = {
"mrm-api-container-${local.environment}" = {
image = "${module.ecr.repository_url}:${var.image_version}"
cpu = 1024
memory = 2048
memory_reservation = 1024
essential = true
environment = [
{ name = "DB_HOST", value = module.rds.db_instance_address },
{ name = "DB_PORT", value = module.rds.db_instance_port },
{ name = "DB_NAME", value = module.rds.db_instance_name },
{ name = "DB_USER", value = module.rds.db_instance_username },
{ name = "DB_PASSWORD", value = var.db_password },
{ name = "ALLOWED_HOSTS", value = module.alb.dns_name }
]
port_mappings = [{
name = "mrm-api"
containerPort = 8000
hostPort = 8000
protocol = "tcp"
}]
}
}
load_balancer = {
service = {
target_group_arn = module.alb.target_groups["mrm-alb-tg-${local.environment}"].arn
container_name = "mrm-api-container-${local.environment}"
container_port = 8000
}
}
subnet_ids = module.vpc.private_subnets
security_group_rules = {
"ingress-db" = {
type = "ingress"
from_port = 5432
to_port = 5432
protocol = "tcp"
cidr_blocks = [module.vpc.vpc_cidr_block]
}
"ingress-alb" = {
type = "ingress"
from_port = 8000
to_port = 8000
protocol = "tcp"
source_security_group_id = module.alb.security_group_id
}
"egress-all" = {
type = "egress"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
enable_execute_command = true
}
🔐 Notes:
- ECS tasks are deployed in private subnets with outbound-only internet access.
- Security group rules tightly control ingress from the ALB and internal database access.
- enable_execute_command = true enables live shell access (ECS Exec) for debugging.
To expose the service, we connect it to an Application Load Balancer (ALB). This ensures traffic is routed properly and enables health checks.
load_balancer = {
  service = {
    target_group_arn = module.alb.target_groups["mrm-alb-tg-${local.environment}"].arn
    container_name   = "mrm-api-container-${local.environment}"
    container_port   = 8000
  }
}
This config:
- Binds the ECS service to the ALB target group
- Routes incoming traffic to container port 8000
- Automatically registers/deregisters tasks based on ALB health checks
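For context, here is a minimal sketch of the ALB side of that wiring, assuming the terraform-aws-modules/alb/aws module with v9-style inputs. Argument names vary by module version, and the certificate variable and health-check path are assumptions, not values from the project:

module "alb" {
  source = "terraform-aws-modules/alb/aws"

  name    = "mrm-alb-${local.environment}"
  vpc_id  = module.vpc.vpc_id
  subnets = module.vpc.public_subnets

  listeners = {
    https = {
      port            = 443
      protocol        = "HTTPS"
      certificate_arn = var.acm_certificate_arn # assumed variable holding the ACM certificate
      forward         = { target_group_key = "mrm-alb-tg-${local.environment}" }
    }
  }

  target_groups = {
    "mrm-alb-tg-${local.environment}" = {
      protocol          = "HTTP"
      port              = 8000
      target_type       = "ip"   # required for Fargate tasks
      create_attachment = false  # the ECS service registers tasks itself
      health_check = {
        path    = "/api/health/" # hypothetical health endpoint
        matcher = "200"
      }
    }
  }
}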
We run our ECS tasks in private subnets and apply strict security group rules to isolate traffic.
subnet_ids = module.vpc.private_subnets

security_group_rules = {
  "ingress-db" = {
    type        = "ingress"
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block]
  }
  "ingress-alb" = {
    type                     = "ingress"
    from_port                = 8000
    to_port                  = 8000
    protocol                 = "tcp"
    source_security_group_id = module.alb.security_group_id
  }
  "egress-all" = {
    type        = "egress"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
We also use Amazon RDS and Lightsail to support backend deployments on AWS. You can learn more about how to provision these services using Terraform in the official Terraform AWS Provider documentation.
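As an illustration, a trimmed sketch of how the PostgreSQL instance referenced in the ECS task environment above could be provisioned with the terraform-aws-modules/rds/aws module. Sizing, engine version, and the security group reference are assumptions, and exact inputs vary by module version:

module "rds" {
  source = "terraform-aws-modules/rds/aws"

  identifier = "mrm-rds-${local.environment}"

  engine            = "postgres"
  engine_version    = "15"
  family            = "postgres15"   # DB parameter group family
  instance_class    = "db.t4g.micro" # illustrative sizing
  allocated_storage = 20

  db_name  = "mrm"
  username = "mrm_admin" # illustrative
  password = var.db_password
  port     = 5432

  manage_master_user_password = false # keep using var.db_password, as the task env above expects

  create_db_subnet_group = true
  subnet_ids             = module.vpc.private_subnets
  vpc_security_group_ids = [module.rds_sg.security_group_id] # assumed security group module

  skip_final_snapshot = local.environment != "prd"
}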
🌍 SPA Hosting via S3, CloudFront & Route 53
Hosting Bucket with OAC
module "s3_mrm_front" {
source = "terraform-aws-modules/s3-bucket/aws"
bucket = "mrm-s3-front-${var.environment}"
force_destroy = true
}
This block provisions an S3 bucket that hosts the static frontend assets (HTML, CSS, JS) of the SPA (Single-Page Application). Here’s what the config does:
- bucket: Dynamically names the bucket per environment (e.g., mrm-s3-front-dev, mrm-s3-front-prd).
- force_destroy = true: Allows Terraform to delete the bucket even if it contains objects, which is useful during infrastructure resets or environment teardowns.
- The bucket is later connected to CloudFront using Origin Access Control (OAC), which restricts access so only CloudFront can fetch assets, ensuring public traffic cannot hit the bucket directly.
📌 Why this matters:
This provides a secure, scalable, and cost-effective way to serve frontend content globally.
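For reference, the OAC wiring looks roughly like the following. The CloudFront module can also create the OAC itself, so this standalone form is just a sketch and the resource names are illustrative:

resource "aws_cloudfront_origin_access_control" "s3_oac" {
  name                              = "mrm-oac-${var.environment}"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

# Bucket policy that only lets CloudFront (scoped to this distribution) read objects.
data "aws_iam_policy_document" "front_bucket" {
  statement {
    actions   = ["s3:GetObject"]
    resources = ["${module.s3_mrm_front.s3_bucket_arn}/*"]

    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [module.cdn.cloudfront_distribution_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "front" {
  bucket = module.s3_mrm_front.s3_bucket_id
  policy = data.aws_iam_policy_document.front_bucket.json
}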
CloudFront Function for SPA Routing
resource "aws_cloudfront_function" "spa_redirect_function" {
name = "mrm-cf-function-spa-redirect-${var.environment}"
runtime = "cloudfront-js-1.0"
code = <<EOF
function handler(event) {
var request = event.request;
if (request.uri.endsWith('/') || !request.uri.includes('.')) {
request.uri = '/index.html';
}
return request;
}
EOF
}
This block defines a CloudFront Function written in lightweight JavaScript to support proper routing behavior for SPAs.
🔍 What it does:
- Intercepts incoming requests to CloudFront.
- If the request path ends with / or doesn’t contain a file extension (i.e., looks like a route, not a file), it rewrites the URI to /index.html.
- This ensures that client-side routes (like /dashboard or /data-entry/fish) are handled correctly by the SPA framework rather than resulting in a 404.
📌 Why this matters:
Static SPA frameworks (like React or Vue) rely on the browser to interpret routes.
Without this redirect, navigating directly to a non-root route would result in a "Not Found" error unless the server (CloudFront in this case) serves index.html for unknown paths.
CloudFront Distribution
module "cdn" {
source = "terraform-aws-modules/cloudfront/aws"
origin = {
s3_mrm_front_oac = { domain_name = module.s3_mrm_front.s3_bucket_bucket_regional_domain_name, origin_access_control = "s3_oac" }
mrm_api = { domain_name = var.mrm_api_domain_name }
dropbox_api = { domain_name = "dl.dropboxusercontent.com" }
}
default_cache_behavior = {
target_origin_id = "s3_mrm_front_oac"
viewer_protocol_policy = "redirect-to-https"
function_association = {
viewer-request = { function_arn = aws_cloudfront_function.spa_redirect_function.arn }
}
}
ordered_cache_behavior = [
{ path_pattern = "/api/*", target_origin_id = "mrm_api" },
{ path_pattern = "/dropbox/*", target_origin_id = "dropbox_api" }
]
}
This configuration:
- Serves the frontend from S3
- Routes /api/* to our backend ECS service
- Routes /dropbox/* to an external Dropbox API
- Handles HTTPS enforcement and caching behavior
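DNS is the last hop: a Route 53 alias record points the site's domain at the distribution. A minimal sketch, where the hosted zone and domain names are placeholders rather than the project's real ones:

data "aws_route53_zone" "main" {
  name = "example.org" # hypothetical hosted zone
}

resource "aws_route53_record" "front" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = "reefmonitoring.example.org" # hypothetical site domain
  type    = "A"

  alias {
    name                   = module.cdn.cloudfront_distribution_domain_name
    zone_id                = module.cdn.cloudfront_distribution_hosted_zone_id
    evaluate_target_health = false
  }
}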
🚀 CI/CD Pipeline with GitHub Actions
We built a GitHub Actions workflow with separate jobs for:
1️⃣ Backend Deployment
- name: Build & Push Docker Image
run: |
docker build -t $ECR_URL:$GIT_SHA .
docker push $ECR_URL:$GIT_SHA
What it does:
- Builds the container using the current Git commit as a version tag ($GIT_SHA) for traceability.
- Pushes the image to AWS Elastic Container Registry (ECR), scoped to the appropriate environment (dev, prd, etc.).
- OIDC-based authentication allows GitHub to assume a secure IAM role without storing long-lived AWS credentials.
🧠 Why this matters: It ensures every deployment is deterministic, reproducible, and versioned. If a bug is discovered, our team can redeploy any specific image by referring to its commit hash.
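The OIDC trust itself can be defined in Terraform. A sketch of the pattern, where the repository path is a placeholder and the thumbprint is one of GitHub's published values (newer AWS setups may not require it):

resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

# Trust policy: only workflows from the named repository may assume the role.
data "aws_iam_policy_document" "github_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/mrm:*"] # placeholder org/repo
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name               = "mrm-role-github-actions-read-write"
  assume_role_policy = data.aws_iam_policy_document.github_assume.json
}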
2️⃣ Terraform Infrastructure Apply
- name: Terraform Apply
  run: |
    terraform -chdir=infra/${{ env }} init
    terraform -chdir=infra/${{ env }} apply -auto-approve
What it does:
- Initializes Terraform in a per-environment directory (e.g., infra/dev).
- Applies infrastructure changes automatically, using the latest variables (like image tag, DB password, etc.).
- Leverages the same OIDC-authenticated IAM role from the previous job to assume least-privilege permissions in AWS.
🧠 Why this matters:
This enforces infrastructure consistency and ensures that backend changes (e.g., new environment variables, resource adjustments) are always paired with infrastructure updates—without requiring manual intervention.
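State is what makes this safe to automate: each environment directory keeps its own remote state, so runs against dev and prd never clobber each other. A sketch of the kind of backend block we would expect in infra/dev, with bucket, table, and region names as illustrative placeholders:

terraform {
  backend "s3" {
    bucket         = "mrm-terraform-state"   # illustrative state bucket
    key            = "dev/terraform.tfstate" # one key per environment
    region         = "us-west-2"             # illustrative region
    dynamodb_table = "mrm-terraform-locks"   # state locking
    encrypt        = true
  }
}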
3️⃣ Frontend Build & Sync to S3
- run: yarn build:${{ env }}

- name: Sync to S3
  uses: ./.github/actions/s3-sync-action
  with:
    s3-bucket: mrm-s3-front-${{ env }}
    role-to-assume: arn:aws:iam::{id}:role/mrm-role-github-actions-read-write
    source-directory: ./dist
What it does:
- Runs the environment-specific frontend build command (e.g., yarn build:prd) to produce optimized static assets.
- Syncs the contents of ./dist to the corresponding S3 bucket (mrm-s3-front-prd, mrm-s3-front-dev, etc.).
- Uses a reusable GitHub Action (s3-sync-action) that assumes a write-permission IAM role securely.
🧠 Why this matters:
This ensures the frontend is always deployed alongside backend and infrastructure changes. By using per-environment buckets, we isolate changes safely and reduce the risk of environment cross-contamination.
🔐 Networking & Security Best Practices
- All backend components run in private subnets.
- ECS task IAM roles follow least-privilege principles.
- Security Groups tightly control traffic between services.
- TLS is terminated at the Application Load Balancer.
📈 Monitoring and Cost Visibility
- ECS services autoscale based on CPU utilization (see the sketch after this list).
- CloudWatch Dashboards give insights into performance bottlenecks.
- Resources are tagged by environment, team, and service to improve visibility in AWS Cost Explorer.
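A sketch of the CPU-based target tracking referenced above, written as standalone Application Auto Scaling resources. Thresholds and capacity bounds are illustrative, and the ECS service module can also manage this through its own autoscaling inputs:

resource "aws_appautoscaling_target" "api" {
  service_namespace  = "ecs"
  resource_id        = "service/mrm-ecs-cluster-${local.environment}/mrm-ecs-service-${local.environment}"
  scalable_dimension = "ecs:service:DesiredCount"
  min_capacity       = 1
  max_capacity       = 4
}

resource "aws_appautoscaling_policy" "api_cpu" {
  name               = "mrm-cpu-target-tracking-${local.environment}"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.api.service_namespace
  resource_id        = aws_appautoscaling_target.api.resource_id
  scalable_dimension = aws_appautoscaling_target.api.scalable_dimension

  target_tracking_scaling_policy_configuration {
    target_value = 70 # scale out when average service CPU stays above ~70%

    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}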
✅ Results
- Stable deploys via GitHub Actions and Terraform.
- Infrastructure parity across dev, staging, and production.
- Operational visibility through metrics and alerts.
- Fewer manual changes.
🧭 Final Thoughts
This migration to AWS has fundamentally strengthened the technical foundation of the Micronesia Reef Monitoring (MRM) platform. By combining modular Terraform infrastructure, secure managed services, and automated deployments, we’ve achieved:
- Reliable, versioned container delivery for the backend
- Reproducible infrastructure across dev and prod
- Scalable database operations without manual patching
- Global frontend delivery with CloudFront and S3
- Streamlined deployment via GitHub Actions
What began as a small-scale data portal is now a modern, cloud-native platform that can scale with the needs of the region’s environmental monitoring programs.
Ready to transform your systems with intention? Let’s build what’s next.