This API provides a service for removing the background from an image based on a specified bounding box. It uses the U2-Net pre-trained model for background removal and integrates with AWS S3 for storing the processed images.
- Overview
- Features
- API Documentation
- Local Deployment
- AWS Configuration
- Docker & ECR Setup
- Connecting Lambda to ECR
- Testing
- Application
- Tools, Frameworks, and Libraries Used
The Background Removal API is a RESTful service that processes images by removing their background within a specified bounding box. The output is a transparent PNG image hosted on AWS S3 with a pre-signed URL for download.
- Image Background Removal: Uses U2-Net for precise object segmentation.
- Bounding Box Support: Processes only the specified region of the image.
- AWS S3 Integration: Stores processed images securely and provides pre-signed URLs.
- Easy Deployment: Supports both local and Docker-based deployment.
- Scalable: Designed to be hosted on AWS Lambda with API Gateway.
- URL: /process-image
- Method: POST
- Content-Type: application/json
The API expects the following JSON payload:
{
"image_url": "<public_image_url>",
"bounding_box": {
"x_min": <integer>, // Top-left x-coordinate
"y_min": <integer>, // Top-left y-coordinate
"x_max": <integer>, // Bottom-right x-coordinate
"y_max": <integer> // Bottom-right y-coordinate
}
}
Success Response (HTTP 200):
{
"original_image_url": "<original_image_url>",
"processed_image_url": "<background_removed_image_url>"
}
Error Response (HTTP 400):
{
"error": "<error_message>"
}
Request:
curl -X POST http://localhost:5000/process-image \
-H "Content-Type: application/json" \
-d '{
"image_url": "https://example.com/sample-image.jpg",
"bounding_box": {
"x_min": 50,
"y_min": 50,
"x_max": 400,
"y_max": 400
}
}'
Success Response:
{
"original_image_url": "https://example.com/sample-image.jpg",
"processed_image_url": "https://<bucket-name>.s3.<region>.amazonaws.com/processed-images/1670892210.png"
}
Error Response:
{
"error": "Invalid image URL."
}
- Python: Version 3.8 or higher
- Pip: Python package manager
- AWS Account: For storing processed images in S3 and for using ECR and Lambda
- Docker: For containerized execution
git clone <repository_url>
cd <repository_name>
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
1. Download u2net.pth from the official U2-Net repository.
2. Place it in the models/ directory
or
cd path/to/folder/models/
wget https://huggingface.co/lilpotat/pytorch3d/resolve/346374a95673795896e94398d65700cb19199e31/u2net.pth
- Create a .env file in the project root directory:
AWS_ACCESS_KEY_ID=<your-aws-access-key-id>
AWS_SECRET_ACCESS_KEY=<your-aws-secret-access-key>
S3_BUCKET_NAME=<your-s3-bucket-name>
AWS_REGION=<your-aws-region>
- Log in to the AWS Management Console.
- Create an S3 bucket (e.g., background-removal).
- Configure bucket permissions to allow Lambda to upload objects:
- Attach the following bucket policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "<lambda-role-arn>"
},
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::background-removal/*"
}
]
}
(Come back to this policy once the Lambda function has been set up.)
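The bucket policy above hard-codes the example bucket name. As a convenience, the same document can be generated for any bucket and role. The following is a minimal sketch using only the standard library; the helper name `build_bucket_policy` and its parameters are hypothetical and not part of the project.

```python
import json


def build_bucket_policy(bucket_name: str, lambda_role_arn: str) -> str:
    """Return the bucket policy shown above as a JSON string.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": lambda_role_arn},
                "Action": "s3:PutObject",
                # s3:PutObject is an object-level action, so the resource
                # must match object keys inside the bucket ("/*").
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

Calling `build_bucket_policy("background-removal", "<lambda-role-arn>")` should reproduce the policy above, ready to paste into the bucket's permissions tab once the role ARN is known.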
- Install the AWS CLI:
pip install awscli
- Configure AWS credentials:
aws configure
- Provide your AWS Access Key ID and Secret Access Key.
- Set the default region (e.g., us-east-1).
- Create a new repository in the ECR console.
- Provide a name and click Create.
- Once created, copy its URI (e.g., <account-id>.dkr.ecr.<region>.amazonaws.com/<repository-name>).
1. Authenticate Docker with ECR
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com
2. Build the Docker Image
docker build -t <project> .
3. Tag the Image
docker tag <project>:latest <account-id>.dkr.ecr.<region>.amazonaws.com/<project>:latest
4. Push the Image to ECR
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/<project>:latest
5. Verify the Image in ECR
- AWS Console > ECR > Repositories > <repository-name>
- Ensure the image appears in the list.
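The tag and push steps only work when both refer to exactly the same image URI. A small sketch of how that URI and the commands above fit together is shown here; `ecr_image_uri` and `ecr_commands` are hypothetical helpers that only build strings and do not run Docker or the AWS CLI.

```python
from typing import List


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Build the full ECR image URI used by both `docker tag` and `docker push`."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def ecr_commands(account_id: str, region: str, repository: str,
                 local_image: str, tag: str = "latest") -> List[str]:
    """Return the shell commands from the steps above, in order, as strings."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    uri = ecr_image_uri(account_id, region, repository, tag)
    return [
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        f"docker build -t {local_image} .",
        # The tag target and the push target must be the same URI.
        f"docker tag {local_image}:{tag} {uri}",
        f"docker push {uri}",
    ]
```

Printing the result of `ecr_commands(...)` with your own account ID, region, and repository name gives a checklist that can be compared against what was typed in the terminal.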
- Create a New Lambda Function:
- Go to AWS Lambda > Click Create Function.
- Choose Container Image as the deployment method.
- Provide a name (e.g., background-removal-function).
- Select Permissions:
- AWSLambdaBasicExecutionRole
- AmazonS3FullAccess (for S3 interaction)
- Click Create Function
- Choose the Container Image:
- In the Lambda function details page, click Upload from ECR.
- Select your ECR repository and the appropriate image tag (latest).
- Click Save.
- Configure Environment Variables:
- Add the required environment variables under the Configuration > Environment Variables section. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION are reserved keys that Lambda sets itself (the credentials come from the function's execution role), so they cannot be added here; only the bucket name needs to be set:
S3_BUCKET_NAME=<your-s3-bucket-name>
- Increase Timeout:
- Under General Configuration, increase the timeout to 30 seconds.
- Set up an API Gateway to trigger the Lambda function.
- Create a REST API with the /process-image endpoint.
- Link it to the Lambda function.
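The README does not show how the container receives requests from API Gateway. Assuming the REST API uses Lambda proxy integration, the event carries the JSON payload as a string in `body` (base64-encoded when `isBase64Encoded` is true), and the function must return a dictionary with `statusCode`, `headers`, and a string `body`. The sketch below illustrates only that contract; `parse_proxy_event`, `proxy_response`, and `handler` are hypothetical names, and the image processing itself is left as a placeholder.

```python
import base64
import json
from typing import Any, Dict


def parse_proxy_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the JSON payload from an API Gateway proxy event."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("Request body must be a JSON object.")
    return payload


def proxy_response(status_code: int, body: Dict[str, Any]) -> Dict[str, Any]:
    """Build a response in the shape proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),  # body must be a string, not a dict
    }


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    try:
        payload = parse_proxy_event(event)
    except ValueError as exc:
        # JSON, base64, and UTF-8 decoding errors all subclass ValueError.
        return proxy_response(400, {"error": str(exc)})
    # Placeholder: the real function would run background removal here
    # and add "processed_image_url" to the response.
    return proxy_response(200, {"original_image_url": payload.get("image_url")})
```

If the API is configured without proxy integration, the event shape differs and the parsing step would need to change accordingly.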
- Use curl or Postman to send a request to the /process-image endpoint:
curl -X POST https://<api-gateway-url>/process-image \
-H "Content-Type: application/json" \
-d '{
"image_url": "https://example.com/sample-image.jpg",
"bounding_box": {
"x_min": 50,
"y_min": 50,
"x_max": 400,
"y_max": 400
}
}'
If there are any errors, view the logs in AWS CloudWatch for debugging.
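The same request can be sent from Python instead of curl or Postman. This is a minimal sketch using only the standard library; `build_request` and `process_image` are hypothetical helper names, and the endpoint URL is whatever API Gateway (or the local server) exposes.

```python
import json
import urllib.request
from typing import Dict


def build_request(endpoint: str, image_url: str,
                  bounding_box: Dict[str, int]) -> urllib.request.Request:
    """Build the POST request for /process-image without sending it."""
    payload = {"image_url": image_url, "bounding_box": bounding_box}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_image(endpoint: str, image_url: str,
                  bounding_box: Dict[str, int], timeout: float = 30.0) -> Dict[str, str]:
    """Send the request and return the decoded JSON response.

    Non-2xx responses (such as the HTTP 400 error case) raise
    urllib.error.HTTPError; the JSON error body can be read from that exception.
    """
    request = build_request(endpoint, image_url, bounding_box)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `process_image("https://<api-gateway-url>/process-image", "https://example.com/sample-image.jpg", {"x_min": 50, "y_min": 50, "x_max": 400, "y_max": 400})` mirrors the curl command above, with the 30-second timeout chosen to match the Lambda timeout.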
- Assuming you have set up the code and all AWS configurations, build the image:
docker build -t <project> .
- Connect and run on localhost:
docker run -p 5000:5000 -e AWS_ACCESS_KEY_ID="<AWS_ACCESS_KEY_ID>" -e AWS_SECRET_ACCESS_KEY="<AWS_SECRET_ACCESS_KEY>" -e AWS_DEFAULT_REGION="us-east-1" <project>
- Test with curl:
curl -X POST http://localhost:5000/process-image \
-H "Content-Type: application/json" \
-d '{
"image_url": "https://image-processor-meta.s3.us-east-1.amazonaws.com/3.png",
"bounding_box": {
"x_min": 108,
"y_min": 108,
"x_max": 972,
"y_max": 972
}
}'
- Response:
{
"original_image_url": "<original_image_url>",
"processed_image_url": "<background_removed_image_url>"
}
As of the last commit, requests to the AWS endpoint time out. However, once the code is set up, AWS is configured, and the Docker container is running, responses from http://localhost:5000/process-image arrive in less than 50 ms.
- https://image-processor-meta.s3.us-east-1.amazonaws.com/1.png
- https://image-processor-meta.s3.us-east-1.amazonaws.com/2.png
- https://image-processor-meta.s3.us-east-1.amazonaws.com/3.png
- https://image-processor-meta.s3.us-east-1.amazonaws.com/4.png
- Python: The programming language used for developing the API due to its simplicity and extensive library support.
- Flask:
- A lightweight Python web framework used to create the RESTful API.
- Handles HTTP requests and routing.
- U2-Net:
- A state-of-the-art pre-trained deep learning model for background removal and object segmentation.
- Provides high accuracy for separating the foreground from the background.
- Official repository: U2-Net on GitHub.
- PyTorch:
- A popular deep learning framework used to load and execute the U2-Net model.
- Provides flexibility for neural network operations and efficient GPU acceleration.
- Used for:
- Loading pre-trained model weights (u2net.pth).
- Running inference on the input image to generate segmentation masks.
- Pillow:
- A Python Imaging Library (PIL) fork used for image manipulation tasks.
- Capabilities include:
- Cropping the image based on the bounding box.
- Resizing the image for model input.
- Applying alpha transparency to the processed image.
- Requests:
- A Python library used for making HTTP requests.
- Key use cases in the project:
- Downloading images from the provided URLs.
- Handling retries for robust communication with external servers.
- NumPy:
- A library for numerical operations and array manipulation.
- Used for:
- Converting the U2-Net model's output into binary segmentation masks.
- Applying dynamic thresholding to improve mask quality.
- Boto3:
- The official AWS SDK for Python.
- Enables interaction with AWS services like S3.
- Key functionalities in the project:
- Uploading processed images to an S3 bucket.
- Generating pre-signed URLs for accessing images.
- Docker:
- A platform used to containerize the application for consistent deployment across environments.
- Benefits:
- Isolates dependencies for the API.
- Simplifies deployment to production or testing environments.
- Contains the application, Python runtime, and all necessary libraries in a single image.
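The bounding-box cropping listed under Pillow depends on turning the request's `bounding_box` into a valid crop region. The sketch below shows one way this could be done in plain Python; it is not taken from the project's code. The helper name `to_crop_box` is hypothetical, and clamping the box to the image size is an assumption about the desired behaviour. The returned tuple follows the (left, upper, right, lower) order that Pillow's `Image.crop` expects.

```python
from typing import Any, Dict, Tuple

BOX_KEYS = ("x_min", "y_min", "x_max", "y_max")


def to_crop_box(bounding_box: Dict[str, Any], width: int, height: int) -> Tuple[int, int, int, int]:
    """Validate a bounding box and clamp it to the image dimensions.

    Returns (left, upper, right, lower). Raises ValueError for missing keys,
    non-integer values, or a box with no area inside the image.
    """
    missing = [key for key in BOX_KEYS if key not in bounding_box]
    if missing:
        raise ValueError(f"Missing bounding box keys: {', '.join(missing)}")

    values = [bounding_box[key] for key in BOX_KEYS]
    # bool is a subclass of int in Python, so reject it explicitly.
    if not all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        raise ValueError("Bounding box values must be integers.")

    x_min, y_min, x_max, y_max = values
    # Assumption: coordinates outside the image are clamped, not rejected.
    left = max(0, min(x_min, width))
    upper = max(0, min(y_min, height))
    right = max(0, min(x_max, width))
    lower = max(0, min(y_max, height))

    if right <= left or lower <= upper:
        raise ValueError("Bounding box is empty or lies outside the image.")
    return left, upper, right, lower
```

With Pillow, the result could be passed directly as `image.crop(to_crop_box(box, *image.size))`; a `ValueError` raised here maps naturally onto the HTTP 400 error response.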