    Building a Cost-Effective Agentic AI Sandbox on AWS EC2 (Under $3/Month)

    Building “Agentic AI” applications usually comes with a steep price tag. Between Managed Vector Databases (like OpenSearch Serverless) and always-on containers, a simple experiment can easily cost $200+ per month.

    But it doesn’t have to.

    In this guide, I will share how to architect a robust, persistent AI sandbox on AWS EC2 that costs roughly $2.00 – $3.00 per month. We will cover the infrastructure setup, how to prevent crashes on small servers, and why—for this specific use case—EC2 beats Serverless.


    1. The Hardware Choice: T3 Instances

    To keep costs low, we look at the t3 family. These are burstable instances perfect for development workloads.

    | Instance  | Specs           | Monthly Cost (24/7) | Monthly Cost (1 hr/day) | Verdict                             |
    |-----------|-----------------|---------------------|-------------------------|-------------------------------------|
    | t3.micro  | 2 vCPU, 1GB RAM | ~$8.60              | ~$0.36                  | Too weak. Will crash Docker/Python. |
    | t3.small  | 2 vCPU, 2GB RAM | ~$17.20             | ~$0.72                  | Risky. Requires heavy Swap usage.   |
    | t3.medium | 2 vCPU, 4GB RAM | ~$34.40             | ~$1.43                  | Perfect. Stable for AI/Docker work. |
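
    The table's figures follow directly from the hourly rate. As a quick sanity check, assuming an on-demand rate of roughly $0.0478/hour for a t3.medium in eu-west-2 (verify against current AWS pricing, which changes by region):

```shell
# Rough monthly cost from an assumed hourly rate (check AWS pricing for your region)
awk -v rate=0.0478 'BEGIN {
    printf "24/7:     $%.2f/month\n", rate * 24 * 30   # always on
    printf "1 hr/day: $%.2f/month\n", rate * 30        # stop when idle
}'
```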

    The Strategy: We use a t3.medium, but we only run it when we are working. By using AWS EventBridge to auto-stop the instance, or manually stopping it, we pay pennies for compute while keeping our data safe.
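
    As a sketch of the auto-stop piece: EventBridge Scheduler can call the EC2 StopInstances API directly, with no Lambda in between. The schedule name, instance ID, and role ARN below are placeholders; the role must trust scheduler.amazonaws.com and be allowed ec2:StopInstances.

```shell
# Stop the sandbox every day at 19:00 UTC (placeholder ARNs/IDs -- substitute your own)
aws scheduler create-schedule \
    --name stop-ai-sandbox \
    --schedule-expression "cron(0 19 * * ? *)" \
    --flexible-time-window Mode=OFF \
    --target '{
        "Arn": "arn:aws:scheduler:::aws-sdk:ec2:stopInstances",
        "RoleArn": "arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_SCHEDULER_ROLE",
        "Input": "{\"InstanceIds\": [\"i-0123456789abcdef0\"]}"
    }' \
    --region eu-west-2
```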


    2. The “One-Click” Launch Script

    Manually clicking through the AWS Console is error-prone. Instead, I created a robust Bash script that handles:

    1. Idempotency: It checks if the server is already running so you don’t accidentally launch duplicates.
    2. Permissions: It automatically “Supercharges” the IAM Role to allow CDK deployments.
    3. Networking: It opens ports 22 (SSH) and 8080 (Web apps) automatically.

    Save this as launch_ec2.sh on your local machine:

    Bash

    #!/bin/bash
    set -e
    
    # --- CONFIGURATION ---
    REGION="eu-west-2" # London
    ROLE_ARN="arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_SAGEMAKER_ROLE"
    INSTANCE_TYPE="t3.medium"
    KEY_NAME="my-ec2-key"      
    TAG_NAME="My-AI-Sandbox"
    
    echo "🚀 Starting Launch in $REGION..."
    
    # 1. CHECK FOR EXISTING SERVER
    EXISTING_ID=$(aws ec2 describe-instances \
        --filters "Name=tag:Name,Values=$TAG_NAME" "Name=instance-state-name,Values=running" \
        --region $REGION --query "Reservations[0].Instances[0].InstanceId" --output text)
    
    if [ "$EXISTING_ID" != "None" ]; then
        PUBLIC_IP=$(aws ec2 describe-instances --instance-ids $EXISTING_ID --region $REGION --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)
        echo "✅ Instance already running ($EXISTING_ID) at $PUBLIC_IP"
        exit 0
    fi
    
    # 1b. RESTART A STOPPED SERVER (reuses the EBS volume, so your data survives)
    STOPPED_ID=$(aws ec2 describe-instances \
        --filters "Name=tag:Name,Values=$TAG_NAME" "Name=instance-state-name,Values=stopped" \
        --region $REGION --query "Reservations[0].Instances[0].InstanceId" --output text)
    
    if [ "$STOPPED_ID" != "None" ]; then
        echo "▶️  Starting stopped instance $STOPPED_ID..."
        aws ec2 start-instances --instance-ids "$STOPPED_ID" --region $REGION
        exit 0
    fi
    
    # 2. LAUNCH INSTANCE
    # (Assuming Role and Security Group logic is handled within your specific setup)
    AMI_ID=$(aws ssm get-parameters --names /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64 \
        --region $REGION --query "Parameters[0].Value" --output text)
    
    echo "🚀 Launching $INSTANCE_TYPE..."
    aws ec2 run-instances \
        --image-id "$AMI_ID" \
        --count 1 \
        --instance-type "$INSTANCE_TYPE" \
        --key-name "$KEY_NAME" \
        --iam-instance-profile Name="Your-Profile-Name" \
        --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=$TAG_NAME}]" \
        --region $REGION
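
    A matching stop script is worth keeping next to the launch script. This is a sketch that assumes the same TAG_NAME and REGION as launch_ec2.sh; save it as stop_ec2.sh:

```shell
#!/bin/bash
set -e
REGION="eu-west-2"
TAG_NAME="My-AI-Sandbox"

# Find the running instance by its Name tag
INSTANCE_ID=$(aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=$TAG_NAME" "Name=instance-state-name,Values=running" \
    --region $REGION --query "Reservations[0].Instances[0].InstanceId" --output text)

if [ "$INSTANCE_ID" = "None" ] || [ -z "$INSTANCE_ID" ]; then
    echo "Nothing to stop."
    exit 0
fi

# Stop (not terminate!) so the EBS volume and your data persist
aws ec2 stop-instances --instance-ids "$INSTANCE_ID" --region $REGION
echo "🛑 Stopping $INSTANCE_ID. Compute billing pauses; storage persists."
```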
    

    3. The Secret Weapon: Swap Space

    When you install tools like AWS CDK or PyTorch on a server with 2GB or 4GB of RAM, the installation process can consume all available memory, and the kernel's Out-of-Memory (OOM) killer will start terminating processes, often freezing the server or dropping your SSH session.

    To fix this, we turn a portion of the Hard Drive into “Fake RAM” (Swap). Run this immediately after logging into your new server:

    Bash

    # Create a 4GB Swap File
    sudo dd if=/dev/zero of=/swapfile bs=128M count=32
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    
    # Make it permanent
    echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
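
    Once the swap file is live, verify it and, optionally, tell the kernel to prefer real RAM. The swappiness value of 10 below is a common choice for this kind of setup, not a measured optimum:

```shell
# Confirm the swap is active
swapon --show
free -h

# Optional: only swap under real memory pressure (the default is usually 60)
echo 'vm.swappiness=10' | sudo tee /etc/sysctl.d/99-swap.conf
sudo sysctl -p /etc/sysctl.d/99-swap.conf
```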
    

    4. Environment Setup: Podman & Virtual Envs

    Docker can be awkward to run on Amazon Linux 2023 in some configurations, and installing Python packages globally with pip can break OS-managed packages (the PEP 668 “externally-managed-environment” error exists precisely to stop this).

    We solve this using Podman (via Homebrew) and Python Virtual Environments.

    The Setup Script (setup_env.sh):

    Bash

    #!/bin/bash
    # 1. Install Homebrew & Podman (NONINTERACTIVE skips the confirmation prompt)
    NONINTERACTIVE=1 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"  # put brew on PATH for this session
    brew install podman
    
    # 2. Configure Podman to look like Docker (for AWS CDK)
    mkdir -p ~/.config/systemd/user
    ln -sf $(brew --prefix podman)/lib/systemd/user/podman.socket ~/.config/systemd/user/podman.socket
    systemctl --user enable --now podman.socket
    echo 'export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock' >> ~/.bashrc
    echo 'alias docker=podman' >> ~/.bashrc
    
    # 3. Setup Python Virtual Environment
    python3 -m venv ~/ai_env
    source ~/ai_env/bin/activate
    pip install boto3 langchain chromadb
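
    To confirm the virtual environment is wired up correctly, a minimal check is to ask its interpreter where it thinks it lives. The /tmp path here is just for illustration; your real environment is ~/ai_env:

```shell
# Create a throwaway venv and confirm its interpreter reports the venv as its prefix
# (--without-pip keeps this quick check fast; the real env above installs pip normally)
python3 -m venv --without-pip /tmp/ai_env_test
/tmp/ai_env_test/bin/python -c "import sys; print(sys.prefix)"
# → /tmp/ai_env_test
```

    If the printed prefix is the venv directory rather than /usr, pip installs will land inside the environment and PEP 668 stops being a problem.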
    

    5. Storage Management: The “No Space Left” Error

    A default launch typically comes with only an 8GB root EBS volume. Installing the GPU build of PyTorch (~2.5GB of wheels) can fill the disk instantly.

    How to reclaim space:

    1. Use CPU-only PyTorch:

       pip install torch --index-url https://download.pytorch.org/whl/cpu --no-cache-dir

    2. Clean Caches:

       rm -rf ~/.cache/pip
       brew cleanup --prune=all

    3. Resize Volume: If 8GB isn’t enough, modify the volume in the AWS Console to 16GB (adds ~$0.64/month), then grow the partition and filesystem from inside the server: sudo growpart /dev/nvme0n1 1 (check the device name with lsblk), followed by sudo xfs_growfs /.
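
    Before resizing, it is worth checking what is actually eating the disk. These are standard coreutils commands and safe to run anywhere:

```shell
# How much space is left on the root filesystem?
df -h /

# Largest directories under the home folder (pip and brew caches usually top the list)
du -xh --max-depth=1 ~ 2>/dev/null | sort -rh | head -10
```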

    6. Architecture Decision: EC2 vs. Fargate

    For this project, I chose EC2 over AWS Fargate.

    • The Use Case: An AI Agent using ChromaDB (a local vector database).
    • The Fargate Problem: Fargate is ephemeral. When the container stops, the file system is wiped. Your AI forgets everything it learned.
    • The EC2 Solution: EC2 behaves like a rented house. You can “Stop” the instance (saving compute costs), but the Hard Drive (EBS) persists. Your Vector DB remains intact for the next run.

    Conclusion

    By combining a stop-when-idle On-Demand EC2 instance, swap space, and a local vector store, we moved from a potential $200/month cloud bill to a manageable ~$2.50/month sandbox. This setup allows for powerful experimentation with Agentic AI without the fear of cascading cloud costs.