Enabling Hardware Acceleration for the Chrome Browser on AWS Batch

In today’s rapidly evolving tech landscape, optimizing performance is critical, especially for large-scale processing on cloud platforms. Many of you may find yourselves needing to automate Chrome on a server, and some of you may also want to drive OpenGL-intensive websites with GPU acceleration. I was one of those people. However, if your tech stack isn’t perfectly aligned, this can be a daunting task.

In my quest to solve this issue, I scoured various QA sites, but I noticed that many users facing similar problems never received a conclusive answer. Therefore, I’ve written this article to address the gap and provide a comprehensive guide to enable hardware acceleration for Chrome on AWS Batch.

Steps to Enable Hardware Acceleration

Create a Docker Image with Chrome and Nvidia Support

First, you’ll need a Docker image that includes Google Chrome and the necessary Nvidia drivers. Here is a basic Dockerfile to get you started:

FROM --platform=linux/amd64 nvidia/cuda:12.5.0-devel-ubuntu22.04

WORKDIR /app

ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Asia/Tokyo

# For timezone
RUN apt-get update && \
    apt-get install -y software-properties-common tzdata && \
    ln -fs /usr/share/zoneinfo/Asia/Tokyo /etc/localtime && \
    dpkg-reconfigure --frontend noninteractive tzdata && \
    apt-get clean

# ldd chrome | grep found
#    libnss3.so => not found
#    libnssutil3.so => not found
#    libsmime3.so => not found
#    libnspr4.so => not found
#    libatk-1.0.so.0 => not found
#    libatk-bridge-2.0.so.0 => not found
#    libcups.so.2 => not found
#    libdrm.so.2 => not found
#    libxcb.so.1 => not found
#    libxkbcommon.so.0 => not found
#    libatspi.so.0 => not found
#    libX11.so.6 => not found
#    libXcomposite.so.1 => not found
#    libXdamage.so.1 => not found
#    libXext.so.6 => not found
#    libXfixes.so.3 => not found
#    libXrandr.so.2 => not found
#    libgbm.so.1 => not found
#    libpango-1.0.so.0 => not found
#    libcairo.so.2 => not found
#    libasound.so.2 => not found

RUN apt-get update && apt-get install -y \
    libnss3 \
    libnss3-tools \
    libnspr4 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libcups2 \
    libdrm2 \
    libxcb1 \
    libxkbcommon0 \
    libatspi2.0-0 \
    libx11-6 \
    libxcomposite1 \
    libxdamage1 \
    libxext6 \
    libxfixes3 \
    libxrandr2 \
    libgbm1 \
    libpango-1.0-0 \
    libcairo2 \
    libasound2 && \
    apt-get clean

# [ERROR:egl_util.cc(44)] : Failed to load GLES library: libGLESv2.so.2: libGLESv2.so.2: cannot open shared object file: No such file or directory
# [ERROR:egl_util.cc(52)] : Failed to load EGL library: libEGL.so.1: libEGL.so.1: cannot open shared object file: No such file or directory
RUN apt-get update && apt-get install -y libgles2-mesa libegl1-mesa && apt-get clean

# [ERROR:gl_display.cc(520)] : EGL Driver message (Critical) eglInitialize: xcb_connect failed
RUN apt-get update && apt-get install -y xvfb && apt-get clean

# xvfb-run --server-args="-screen 0 1920x1080x24 +extension GLX +render -noreset" \
# chrome \
# --enable-logging --v=1 \
# --headless \
# --ignore-gpu-blocklist \
# --enable-gpu-rasterization \
# --enable-zero-copy \
# --use-angle=default \
# --no-sandbox
RUN apt-get update && apt-get install -y \
    libva2 \
    libva-x11-2 \
    libva-drm2 && \
    apt-get clean

This Dockerfile gives you a base image with the CUDA libraries Chrome needs to run smoothly. In my setup, I used Python’s Pyppeteer, which downloads its own Chromium, so there’s no need to install Chrome via apt-get. Of course, Puppeteer also works, and if you prefer, you can install Chrome directly with apt-get, as sketched below.
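
If you go the apt-get route, here is a minimal sketch (an alternative I did not use myself) that installs google-chrome-stable from Google’s official .deb package:

# (Alternative) Install Chrome via apt-get instead of relying on Pyppeteer's bundled Chromium
RUN apt-get update && \
    apt-get install -y wget ca-certificates && \
    wget -q https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb && \
    apt-get install -y ./google-chrome-stable_current_amd64.deb && \
    rm ./google-chrome-stable_current_amd64.deb && \
    apt-get clean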

Pay close attention to the comments within the Dockerfile. The comments explain the commands executed, the errors encountered, and the libraries installed to address those errors. As of July 2024, this Dockerfile is functional. However, if you encounter any errors in the future, refer to these comments to guide your troubleshooting and adjustments.

Prepare Your Project

In this step, we will show an example using Pyppeteer in Python. If you prefer Puppeteer, substitute Node.js for Python as appropriate.

There are a few key points to keep in mind, but the most important is to run your Python or Node.js process through xvfb-run. This is crucial not only for allowing Chrome to operate in a headless fashion but also for enabling GPU usage.

Here is an overview of what you need to do:

1. Prepare a shell script that runs your Python or Node.js code under a virtual X server

#!/bin/bash
xvfb-run --server-args="-screen 0 1920x1080x24 +extension GLX +render -noreset" \
  python3.12 /app/run.py "$@"

2. Run Chrome in a Python Script

import asyncio
import logging
from typing import TypeAlias

from pyppeteer.launcher import launch

logger = logging.getLogger(__name__)

_BrowserLaunchOptions: TypeAlias = dict[str, list[str] | bool]

def _get_launch_options() -> _BrowserLaunchOptions:
    args = [
        "--lang=ja-JP",
        "--no-sandbox",
        "--ignore-gpu-blocklist",
        "--enable-gpu-rasterization",
        "--enable-zero-copy",
        "--disable-gpu-process-crash-limit",
        "--use-angle=default",
    ]

    if logger.level == logging.DEBUG:
        args.extend(["--enable-logging", "--v=1"])

    launch_options: _BrowserLaunchOptions = {"args": args, "ignoreHTTPSErrors": True}

    if logger.level == logging.DEBUG:
        launch_options["dumpio"] = True

    logger.debug(f"launch_options={launch_options}")
    return launch_options


async def main() -> None:
    launch_options = _get_launch_options()
    browser = await launch(**launch_options)
    # do something you need
    await browser.close()


if __name__ == "__main__":
    asyncio.run(main())

In this snippet we only create the browser object; you will need to build your own Chrome automation on top of it. The key point here is the launch options: if you get them wrong, Chrome will not start. Pay particular attention to the arguments passed to args. In DEBUG mode, Chrome is configured to output verbose logs, so if any issues arise, you can usually resolve them from the log output. If you cannot make sense of the logs yourself, you can always rely on ChatGPT for assistance.

If you want to quickly verify that everything is working, you can navigate to chrome://gpu, capture a screenshot of the page, and upload the file to S3. The screenshot will show you whether GPU acceleration is enabled.

page = await browser.newPage()
await page.goto("chrome://gpu")
await page.screenshot({"path": "./screenshot.png"})
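
To push that screenshot to S3, a minimal sketch with boto3 could look like the following. It assumes the BUCKET_NAME environment variable injected by the job definition shown later, and a key name of my choosing; you would also add boto3 to requirements.txt.

import os

import boto3

# BUCKET_NAME is injected by the Batch job definition (see the CDK snippet later)
s3 = boto3.client("s3")
s3.upload_file("./screenshot.png", os.environ["BUCKET_NAME"], "screenshots/screenshot.png")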

As a side note, every operation carries a chance of failure: opening a tab, navigating to a URL, retrieving content. I therefore write my code so that if any single operation fails, it restarts from the point of opening a tab, repeating until it succeeds and giving up after five failures. When automating Chrome, you need to write your code defensively to this extent; a sketch of the pattern follows.
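
Here is a minimal sketch of that pattern. The names run_with_retries and operation are mine, and it reuses the logger defined in the script above:

from collections.abc import Awaitable, Callable

from pyppeteer.browser import Browser
from pyppeteer.page import Page

_MAX_ATTEMPTS = 5

async def run_with_retries(browser: Browser, operation: Callable[[Page], Awaitable[None]]) -> None:
    for attempt in range(1, _MAX_ATTEMPTS + 1):
        # Restart from the point of opening a tab on every attempt
        page = await browser.newPage()
        try:
            await operation(page)
            return
        except Exception:
            logger.exception(f"Attempt {attempt}/{_MAX_ATTEMPTS} failed; retrying from a new tab")
        finally:
            await page.close()
    raise RuntimeError(f"Giving up after {_MAX_ATTEMPTS} failed attempts")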

3. Update the Dockerfile for the Python project

Please add the following content to the Dockerfile we prepared earlier. I will omit the details on how to create the requirements.txt.

# Install python3.12
RUN add-apt-repository ppa:deadsnakes/ppa -y && \
    apt-get update && \
    apt-get install -y python3.12 python3.12-distutils && \
    apt-get install -y wget && \
    apt-get clean

# Install pip
RUN wget https://bootstrap.pypa.io/get-pip.py && \
    python3.12 ./get-pip.py

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
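
Finally, the image needs an entrypoint that goes through the xvfb-run wrapper from step 1. Assuming you saved that script as run.sh (a placeholder name) at the project root, so that COPY . . places it in /app:

# Run everything through the xvfb-run wrapper from step 1
RUN chmod +x /app/run.sh
ENTRYPOINT ["/app/run.sh"]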

Set Up an Environment on AWS Batch for GPU-Accelerated Chrome

In this final step, we will set up an environment on AWS Batch that can utilize GPU-accelerated Chrome. For this example, we will use the g4dn instance type, which is optimized for GPU workloads.

Here is a snippet of the AWS CDK code to set up the necessary infrastructure:

export class BatchChromeStack extends cdk.Stack {
  constructor(scope: Construct, id: string, stage: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vpc: ec2.IVpc = this.getVpc();
    const subnets = this.getSubnets(vpc);
    const securityGroup = this.getSecurityGroup(vpc);

    const instanceRole = this.createInstanceRoleAndProfile();

    const { repository, image_name } = this.createAndDeployDockerImage();

    const jobRole = this.createJobRole();
    this.addSSMPolicyToJobRole(jobRole, stage);

    const bucket = this.createS3Bucket();
    this.addS3PutObjectPolicyToRole(jobRole, bucket);

    const container = this.createContainer(image_name, jobRole, bucket, stage);

    repository.grantPull(container.executionRole);

    this.createEcsJobDefinition(container);
    const computeEnvironment = this.createComputeEnvironment(vpc, [securityGroup], subnets, instanceRole);
    this.createJobQueue(computeEnvironment);
  }

  // Omit many codes

  private createContainer(image_name: string, jobRole: iam.Role, bucket: s3.Bucket, stage: string): batch.EcsEc2ContainerDefinition {
    const logLevel = stage === 'production' ? 'INFO' : 'DEBUG';

    return new batch.EcsEc2ContainerDefinition(this, `${this.NAME_PREFIX}ContainerDefinition`, {
      image: ecs.ContainerImage.fromRegistry(image_name),
      // g4dn.xlarge
      gpu: 1,
      cpu: 4,
      memory: cdk.Size.gibibytes(15), // 16GB - 1GB for OS
      jobRole: jobRole,
      logging: new ecs.AwsLogDriver({ streamPrefix: this.KEBAB_NAME_PREFIX }),
      environment: {
        BUCKET_NAME: bucket.bucketName,
        LOG_LEVEL: logLevel,
      },
    });
  }
}
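
If you need a starting point for the omitted createComputeEnvironment, a rough sketch of that method on the same stack class, using the stable aws-batch L2 constructs, could look like this; treat the property values (such as maxvCpus) as placeholders to adapt:

  private createComputeEnvironment(
    vpc: ec2.IVpc,
    securityGroups: ec2.ISecurityGroup[],
    subnets: ec2.SubnetSelection,
    instanceRole: iam.IRole,
  ): batch.ManagedEc2EcsComputeEnvironment {
    return new batch.ManagedEc2EcsComputeEnvironment(this, `${this.NAME_PREFIX}ComputeEnvironment`, {
      vpc,
      vpcSubnets: subnets,
      securityGroups,
      // Pin the fleet to g4dn.xlarge (4 vCPUs, 16GB memory, one NVIDIA T4 GPU)
      instanceTypes: [ec2.InstanceType.of(ec2.InstanceClass.G4DN, ec2.InstanceSize.XLARGE)],
      useOptimalInstanceClasses: false,
      // GPU instance families need the GPU-optimized ECS AMI
      images: [{ imageType: batch.EcsMachineImageType.ECS_AL2_NVIDIA }],
      instanceRole,
      minvCpus: 0,
      maxvCpus: 4,
    });
  }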

I have omitted quite a lot of detail in this step. I left out anything you can find by searching online or by asking ChatGPT, since those answers are readily accessible.

The biggest issue I faced was that when I specified 16GB of memory, the full amount available on a g4dn.xlarge, in the EcsEc2ContainerDefinition, the instance failed to launch; with 15GB it launched fine. This is presumably because the ECS-optimized AMI reserves part of the instance memory for the OS and the ECS agent, so the full 16GB is never available to the container.

If you need the full version of the CDK code or if something is not working correctly, feel free to reach out. If I feel inclined, I might provide more detailed information.