Migrating from Digital Ocean to AWS: A Simple Guide for Research Projects

Judy
June 18, 2025

🧭 Migrating from Digital Ocean to AWS: A Simple Guide for Research Projects

Client Sector: Marine Conservation & Research

Client: University of Guam (UOG) Marine Lab and Micronesia Coral Reef Monitoring Network

Service Type: Full-Stack System Modernization

Technologies Used: React, Django, R Shiny, Angular, Python, Pandas, NumPy, SciPy, Terraform, Docker, GitHub Actions, AWS (ECR, ECS on Fargate, RDS, S3, CloudFront, Route 53)


🌐 Project Overview

To support coral reef conservation across Micronesia, we modernized a legacy data platform used by marine researchers and regional policy makers. The system aggregates data from field surveys and oceanographic sensors into a unified web interface. Our goal was to create a cloud-native, modular, and maintainable architecture that scales with growing data needs and offers operational resilience.


🌊 Why We Migrated from Digital Ocean

Digital Ocean served us well during early development, but it posed growing limitations:

  • Lack of fully managed services for databases and load balancing
  • Limited options for fine-grained IAM and networking controls
  • Difficulty in orchestrating multi-service deployments
  • Manual setup hampering reproducibility

AWS addressed these pain points and gave us access to scalable compute, tighter security boundaries, and a rich ecosystem of managed services. Paired with Terraform and GitHub Actions, the migration was automated, version-controlled, and safe for production workflows.


📐 Why Terraform Was Our Tool of Choice

Terraform helped us manage infrastructure with:

  • Consistency across dev/prod environments
  • Transparency through version-controlled .tf files
  • Safety via terraform plan
  • Modularity, enabling code reuse and isolated testing

The setup allowed our infrastructure to evolve alongside the app codebase, while minimizing risk.


🧱 Provisioning Infrastructure with Terraform

🐳 Provisioning ECR for Image Storage

We used this module to create a private, secure, and policy-driven ECR repository:

resource "aws_ecr_repository" "this" {
  name                  = var.repository_name
  image_tag_mutability  = var.repository_image_tag_mutability

  encryption_configuration {
    encryption_type = var.repository_encryption_type
    kms_key         = var.repository_kms_key
  }

  image_scanning_configuration {
    scan_on_push = var.repository_image_scan_on_push
  }

  force_delete = var.repository_force_delete
  tags         = var.tags
}

🔎 Highlights:

  • Conditional creation for flexible environment-specific provisioning.
  • KMS encryption and image scanning provide additional security.
  • All values are driven by variables (var.*) to keep logic reusable.
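
For context, here is roughly how the module is invoked per environment. The module path and the values below are illustrative placeholders rather than our exact configuration:

# Illustrative invocation of the ECR module above; path and values are placeholders.
module "ecr" {
  source = "./modules/ecr"

  repository_name                 = "mrm-ecr-${local.environment}"
  repository_image_tag_mutability = "IMMUTABLE"
  repository_encryption_type      = "KMS"
  repository_kms_key              = null # null falls back to the AWS-managed KMS key
  repository_image_scan_on_push   = true
  repository_force_delete         = local.environment != "prd"

  tags = {
    Project     = "mrm"
    Environment = local.environment
  }
}

The repository URL exposed by this module is what the ECS task definition below pulls images from.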

🚀 ECS Cluster and Fargate Services

We deployed our backend as a containerized service on AWS Fargate, using the official ECS Terraform modules.

ECS Cluster Definition

module "ecs_cluster" {
  source                       = "terraform-aws-modules/ecs/aws//modules/cluster"
  cluster_name                 = "mrm-ecs-cluster-${local.environment}"
  create_cloudwatch_log_group = false
}

This sets up the base cluster, scoped by environment (e.g., mrm-ecs-cluster-prd).
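
We haven't shown where local.environment comes from in these excerpts; one common pattern (purely an assumption here) is to derive it from an input variable passed per environment:

# Hypothetical definition of local.environment; not part of the excerpts above.
locals {
  environment = var.environment # e.g. "dev" or "prd"
}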

ECS Service with Networking and Load Balancing

module "ecs_service" {
  source      = "terraform-aws-modules/ecs/aws//modules/service"
  name        = "mrm-ecs-service-${local.environment}"
  cluster_arn = module.ecs_cluster.arn

  cpu    = 1024
  memory = 2048

  container_definitions = {
    "mrm-api-container-${local.environment}" = {
      image              = "${module.ecr.repository_url}:${var.image_version}"
      cpu                = 1024
      memory             = 2048
      memory_reservation = 1024
      essential          = true

      environment = [
        { name = "DB_HOST",     value = module.rds.db_instance_address },
        { name = "DB_PORT",     value = module.rds.db_instance_port },
        { name = "DB_NAME",     value = module.rds.db_instance_name },
        { name = "DB_USER",     value = module.rds.db_instance_username },
        { name = "DB_PASSWORD", value = var.db_password },
        { name = "ALLOWED_HOSTS", value = module.alb.dns_name }
      ]

      port_mappings = [{
        name          = "mrm-api"
        containerPort = 8000
        hostPort      = 8000
        protocol      = "tcp"
      }]
    }
  }

  load_balancer = {
    service = {
      target_group_arn = module.alb.target_groups["mrm-alb-tg-${local.environment}"].arn
      container_name   = "mrm-api-container-${local.environment}"
      container_port   = 8000
    }
  }

  subnet_ids = module.vpc.private_subnets

  security_group_rules = {
    "ingress-db" = {
      type        = "ingress"
      from_port   = 5432
      to_port     = 5432
      protocol    = "tcp"
      cidr_blocks = [module.vpc.vpc_cidr_block]
    }
    "ingress-alb" = {
      type                     = "ingress"
      from_port                = 8000
      to_port                  = 8000
      protocol                 = "tcp"
      source_security_group_id = module.alb.security_group_id
    }
    "egress-all" = {
      type        = "egress"
      from_port   = 0
      to_port     = 0
      protocol    = "-1"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }

  enable_execute_command = true
}

🔐 Notes:

  • ECS tasks are deployed in private subnets with outbound-only internet access.
  • Security groups tightly control ingress from ALB and internal DB access.
  • enable_execute_command = true turns on ECS Exec, giving us live shell access into running tasks for debugging.

To expose the service, we connect it to an Application Load Balancer (ALB), which routes incoming requests to healthy tasks and performs health checks.

  load_balancer = {
    service = {
      target_group_arn = module.alb.target_groups["mrm-alb-tg-${local.environment}"].arn
      container_name   = "mrm-api-container-${local.environment}"
      container_port   = 8000
    }
  }

This config:

  • Binds the ECS service to the ALB target group
  • Routes incoming traffic to container port 8000
  • Automatically registers/deregisters tasks based on ALB health checks
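
The target group the service attaches to is declared in the ALB module. A minimal sketch of that entry is below; the /api/health path is a hypothetical example, and exact attribute names depend on the ALB module version:

# Sketch of the ALB target group referenced by the ECS service (illustrative values).
target_groups = {
  "mrm-alb-tg-${local.environment}" = {
    protocol    = "HTTP"
    port        = 8000
    target_type = "ip" # required for Fargate tasks using awsvpc networking

    health_check = {
      path    = "/api/health" # hypothetical health endpoint
      matcher = "200"
    }

    # ECS registers and deregisters task IPs itself, so no static attachment is created.
    create_attachment = false
  }
}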

We run our ECS tasks in private subnets and apply strict security group rules to isolate traffic.

  subnet_ids = module.vpc.private_subnets

  security_group_rules = {
    "ingress-db" = {
      type        = "ingress"
      from_port   = 5432
      to_port     = 5432
      protocol    = "tcp"
      cidr_blocks = [module.vpc.vpc_cidr_block]
    }
    "ingress-alb" = {
      type                     = "ingress"
      from_port                = 8000
      to_port                  = 8000
      protocol                 = "tcp"
      source_security_group_id = module.alb.security_group_id
    }
    "egress-all" = {
      type        = "egress"
      from_port   = 0
      to_port     = 0
      protocol    = "-1"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }

We also use Amazon RDS and Lightsail to support backend deployments on AWS. You can learn more about how to provision these services using Terraform in the official Terraform AWS Provider documentation.
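
For illustration, here is a simplified sketch of how the PostgreSQL instance could be declared with the community RDS module. Instance sizing, engine version, and the security-group reference are placeholders, not our exact values:

# Simplified sketch of the RDS PostgreSQL instance (illustrative values only).
module "rds" {
  source = "terraform-aws-modules/rds/aws"

  identifier = "mrm-rds-${local.environment}"

  engine               = "postgres"
  engine_version       = "16"
  family               = "postgres16"
  major_engine_version = "16"
  instance_class       = "db.t4g.small"
  allocated_storage    = 20

  db_name  = "mrm"
  username = "mrm_admin"
  password = var.db_password
  port     = 5432

  # We pass the password explicitly so the ECS task can receive it as an env var.
  manage_master_user_password = false

  create_db_subnet_group = true
  subnet_ids             = module.vpc.private_subnets
  vpc_security_group_ids = [module.db_security_group.security_group_id] # hypothetical SG module
}

Its outputs (db_instance_address, db_instance_port, and so on) are what feed the container environment variables shown earlier.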

🌍 SPA Hosting via S3, CloudFront & Route 53

Hosting Bucket with OAC

module "s3_mrm_front" {
  source  = "terraform-aws-modules/s3-bucket/aws"
  bucket  = "mrm-s3-front-${var.environment}"
  force_destroy = true
}

This block provisions an S3 bucket that hosts the static frontend assets (HTML, CSS, JS) of the SPA (Single-Page Application). Here’s what the config does:

  • bucket: Dynamically names the bucket per environment (e.g., mrm-s3-front-dev, mrm-s3-front-prd).
  • force_destroy = true: Allows Terraform to delete the bucket even if it contains objects—useful during infrastructure resets or environment teardowns.
  • The bucket is later connected to CloudFront using Origin Access Control (OAC), which restricts access so only CloudFront can fetch assets, ensuring public traffic cannot hit the bucket directly.

📌 Why this matters:

This provides a secure, scalable, and cost-effective way to serve frontend content globally.
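
The OAC linkage also requires a bucket policy that only admits CloudFront. A minimal sketch, using the outputs of the community S3 and CloudFront modules (the distribution, module.cdn, is defined further below):

# Sketch: allow only the CloudFront distribution (via OAC) to read the bucket.
data "aws_iam_policy_document" "s3_mrm_front_oac" {
  statement {
    sid       = "AllowCloudFrontOAC"
    actions   = ["s3:GetObject"]
    resources = ["${module.s3_mrm_front.s3_bucket_arn}/*"]

    principals {
      type        = "Service"
      identifiers = ["cloudfront.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "AWS:SourceArn"
      values   = [module.cdn.cloudfront_distribution_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "s3_mrm_front" {
  bucket = module.s3_mrm_front.s3_bucket_id
  policy = data.aws_iam_policy_document.s3_mrm_front_oac.json
}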

CloudFront Function for SPA Routing

resource "aws_cloudfront_function" "spa_redirect_function" {
  name    = "mrm-cf-function-spa-redirect-${var.environment}"
  runtime = "cloudfront-js-1.0"
  code    = <<EOF
    function handler(event) {
      var request = event.request;
      if (request.uri.endsWith('/') || !request.uri.includes('.')) {
        request.uri = '/index.html';
      }
      return request;
    }
  EOF
}

This block defines a CloudFront Function written in lightweight JavaScript to support proper routing behavior for SPAs.

🔍 What it does:

  • Intercepts incoming requests to CloudFront.
  • If the request path ends with / or doesn’t contain a file extension (i.e., looks like a route, not a file), it rewrites the URI to /index.html.
  • This ensures that client-side routes (like /dashboard or /data-entry/fish) are handled correctly by the SPA framework rather than resulting in a 404.

📌 Why this matters: Static SPA frameworks (like React or Vue) rely on the browser to interpret routes. Without this redirect, navigating directly to a non-root route would result in a “Not Found” error unless the server (CloudFront in this case) serves index.html for unknown paths.

CloudFront Distribution

module "cdn" {
  source = "terraform-aws-modules/cloudfront/aws"

  origin = {
    s3_mrm_front_oac = { domain_name = module.s3_mrm_front.s3_bucket_bucket_regional_domain_name, origin_access_control = "s3_oac" }
    mrm_api          = { domain_name = var.mrm_api_domain_name }
    dropbox_api      = { domain_name = "dl.dropboxusercontent.com" }
  }

  default_cache_behavior = {
    target_origin_id       = "s3_mrm_front_oac"
    viewer_protocol_policy = "redirect-to-https"
    function_association = {
      viewer-request = { function_arn = aws_cloudfront_function.spa_redirect_function.arn }
    }
  }

  ordered_cache_behavior = [
    { path_pattern = "/api/*",     target_origin_id = "mrm_api" },
    { path_pattern = "/dropbox/*", target_origin_id = "dropbox_api" }
  ]
}

This configuration:

  • Serves the frontend from S3
  • Routes /api/* to our backend ECS service
  • Routes /dropbox/* to an external Dropbox API
  • Handles HTTPS enforcement and caching behavior
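
Route 53 rounds out the setup by pointing our public domain at the distribution. A sketch, assuming an existing hosted zone; the domain names here are placeholders:

# Sketch: alias a placeholder domain to the CloudFront distribution.
data "aws_route53_zone" "main" {
  name = "example.org" # placeholder hosted zone
}

resource "aws_route53_record" "mrm_front" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = "reefmonitoring.example.org" # placeholder record name
  type    = "A"

  alias {
    name                   = module.cdn.cloudfront_distribution_domain_name
    zone_id                = "Z2FDTNDATAQYW2" # CloudFront's fixed hosted zone ID
    evaluate_target_health = false
  }
}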

🚀 CI/CD Pipeline with GitHub Actions

We built a GitHub Actions workflow with separate jobs for:

1️⃣ Backend Deployment

- name: Build & Push Docker Image
  run: |
    docker build -t $ECR_URL:$GIT_SHA .
    docker push $ECR_URL:$GIT_SHA

What it does:

  • Builds the container using the current Git commit as a version tag ($GIT_SHA) for traceability.
  • Pushes the image to AWS Elastic Container Registry (ECR), scoped to the appropriate environment (dev, prd, etc.).
  • OIDC-based authentication allows GitHub to assume a secure IAM role without storing long-lived AWS credentials.

🧠 Why this matters: It ensures every deployment is deterministic, reproducible, and versioned. If a bug is discovered, our team can redeploy any specific image by referring to its commit hash.
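
The IAM role that GitHub assumes is itself managed in Terraform. A simplified sketch of the trust policy; the repository path is a placeholder:

# Sketch: IAM role assumed by GitHub Actions via OIDC (no long-lived AWS keys).
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "github_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/mrm:*"] # placeholder repository
    }
  }
}

resource "aws_iam_role" "github_actions" {
  name               = "mrm-role-github-actions-read-write"
  assume_role_policy = data.aws_iam_policy_document.github_trust.json
}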

2️⃣ Terraform Infrastructure Apply

- name: Terraform Apply
  run: |
    terraform -chdir=infra/${{ env }} init
    terraform -chdir=infra/${{ env }} apply -auto-approve

What it does:

  • Initializes Terraform in a per-environment directory (e.g., infra/dev).
  • Applies infrastructure changes automatically, using the latest variables (like image tag, DB password, etc.).
  • Leverages the same OIDC-authenticated IAM role from the previous job to assume least-privilege permissions in AWS.

🧠 Why this matters:

This enforces infrastructure consistency and ensures that backend changes (e.g., new environment variables, resource adjustments) are always paired with infrastructure updates—without requiring manual intervention.
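
Each infra/<env> directory also keeps its own remote state. The sketch below is an illustration only; the bucket, key, and lock-table names are assumptions:

# Sketch: per-environment remote state, e.g. in infra/prd/backend.tf (names are placeholders).
terraform {
  backend "s3" {
    bucket         = "mrm-terraform-state"
    key            = "prd/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "mrm-terraform-locks"
    encrypt        = true
  }
}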

3️⃣ Frontend Build & Sync to S3

- run: yarn build:${{ env }}
- name: Sync to S3
  uses: ./.github/actions/s3-sync-action
  with:
    s3-bucket: mrm-s3-front-${{ env }}
    role-to-assume: arn:aws:iam::{id}:role/mrm-role-github-actions-read-write
    source-directory: ./dist

What it does:

  • Runs the environment-specific frontend build command (e.g., yarn build:prd) to produce optimized static assets.
  • Syncs the contents of ./dist to the corresponding S3 bucket (mrm-s3-front-prd, mrm-s3-front-dev, etc.).
  • Uses a reusable GitHub Action (s3-sync-action) that assumes a write-permission IAM role securely.

🧠 Why this matters:

This ensures the frontend is always deployed alongside backend and infrastructure changes. By using per-environment buckets, we isolate changes safely and reduce the risk of environment cross-contamination.


🔐 Networking & Security Best Practices

  • All backend components run in private subnets.
  • ECS task IAM roles follow least-privilege principles.
  • Security Groups tightly control traffic between services.
  • TLS is terminated at the Application Load Balancer (see the listener sketch below).
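
Here is a sketch of how the HTTPS listener on the ALB can be declared with the community module; the certificate reference is a placeholder, and exact attribute names vary by module version:

# Sketch: HTTPS termination on the ALB, with HTTP redirected to HTTPS (illustrative).
listeners = {
  https = {
    port            = 443
    protocol        = "HTTPS"
    certificate_arn = aws_acm_certificate.mrm.arn # hypothetical ACM certificate

    forward = {
      target_group_key = "mrm-alb-tg-${local.environment}"
    }
  }

  http-redirect = {
    port     = 80
    protocol = "HTTP"

    redirect = {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}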

📈 Monitoring and Cost Visibility

  • ECS services autoscale based on CPU utilization (see the sketch below).
  • CloudWatch Dashboards give insights into performance bottlenecks.
  • Resources are tagged by environment, team, and service to improve visibility in AWS Cost Explorer.
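
The CPU-based autoscaling above can be expressed with Application Auto Scaling. The community ECS service module can manage this for you as well; the raw resources below show the equivalent wiring, with illustrative capacity limits and a 60% CPU target:

# Sketch: target-tracking autoscaling for the ECS service, keyed on average CPU (illustrative values).
resource "aws_appautoscaling_target" "api" {
  service_namespace  = "ecs"
  resource_id        = "service/${module.ecs_cluster.name}/${module.ecs_service.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  min_capacity       = 1
  max_capacity       = 4
}

resource "aws_appautoscaling_policy" "api_cpu" {
  name               = "mrm-cpu-target-tracking"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.api.service_namespace
  resource_id        = aws_appautoscaling_target.api.resource_id
  scalable_dimension = aws_appautoscaling_target.api.scalable_dimension

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 60
  }
}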

✅ Results

  • Stable deploys via GitHub Actions and Terraform.
  • Infrastructure parity across dev, staging, and production.
  • Operational visibility through metrics and alerts.
  • Fewer manual changes.

🧭 Final Thoughts

This migration to AWS has fundamentally strengthened the technical foundation of the Micronesia Reef Monitoring (MRM) platform. By combining modular Terraform infrastructure, secure managed services, and automated deployments, we’ve achieved:

  • Reliable, versioned container delivery for the backend
  • Reproducible infrastructure across dev and prod
  • Scalable database operations without manual patching
  • Global frontend delivery with CloudFront and S3
  • Streamlined deployment via GitHub Actions

What began as a small-scale data portal is now a modern, cloud-native platform that can scale with the needs of the region’s environmental monitoring programs.

Ready to transform your systems with intention? Let’s build what’s next.

📚 Explore more MRM case studies