Technical Exercise Product Engineer - Adhitya Bagus Putra Erlangga #2

Open · adhityabpe wants to merge 1 commit into main

Conversation

adhityabpe

Create Pull Req

@adhityabpe changed the title from Main to Technical Exercise Product Engineer - Adhitya Bagus Putra Erlangga on Mar 4, 2025
adhityabpe (Author) commented on Mar 4, 2025

Benchmark Comparison

| Variant | Memory Usage | Execution Time |
| --- | --- | --- |
| Before optimization | 626.00 MiB | 674,666 ms (≈ 11.2 min) |
| After optimization | 24.00 MiB | 881,116 ms (≈ 14.7 min) |
| After additional optimization (caching) | 24.00 MiB | 2,130,294 ms (≈ 35.5 min) |
| After adjusting chunk size | 26.00 MiB | 3,442,164 ms (≈ 57.4 min) |

Thinking Process and Optimization Steps

Step 1: Analyzing the Current Implementation
• Initially, the DropOutEnrollments command loaded every column of every enrollment that met the criteria, leading to high memory usage.
• Enrollments were then processed one row at a time, which kept memory consumption high and made the run slow.
• A database transaction was started but never committed (it was only rolled back when an error occurred), undermining atomicity and data consistency. A sketch of this pattern follows.
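
For context, here is a minimal sketch of what that original pattern looks like in a Laravel command. It is a hypothetical reconstruction, not the repository's actual code: the Enrollment model, the status values, and the six-month inactivity criterion are placeholders; only the DropOutEnrollments name and the id/course_id/student_id fields come from this PR.

```php
use App\Models\Enrollment;           // assumed model class
use Illuminate\Support\Facades\DB;

DB::beginTransaction();

try {
    $enrollments = Enrollment::where('status', 'active')          // assumed criterion
        ->where('last_activity_at', '<', now()->subMonths(6))     // assumed criterion
        ->get();                                                   // all columns, all rows in memory

    foreach ($enrollments as $enrollment) {
        $enrollment->update(['status' => 'dropout']);              // one UPDATE per row
    }
    // Bug noted above: DB::commit() is never called here.
} catch (\Throwable $e) {
    DB::rollBack();
}
```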

Step 2: Implementing Selective Data Loading
• Modified the code to load only the necessary fields (id, course_id, student_id) to reduce memory usage.
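
A sketch of the selective load with Eloquent's select(). The id, course_id, and student_id columns come from this PR; the where criteria are the same placeholders as in the sketch above.

```php
// Load only the three columns the command actually needs.
$query = Enrollment::query()
    ->select(['id', 'course_id', 'student_id'])
    ->where('status', 'active')                             // assumed criterion
    ->where('last_activity_at', '<', now()->subMonths(6));  // assumed criterion
```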

Step 3: Batch Processing with chunkById
• Changed the processing to use chunkById with a chunk size of 1000 to reduce memory usage and improve performance by processing smaller batches.
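
chunkById() pages through the matching rows by primary key, so only one batch of models is hydrated at a time. A sketch continuing the query built above:

```php
// Fetch and process 1,000 enrollments per query, paging by primary key.
// chunkById paginates on "id > last seen id", so rows updated mid-run
// cannot be skipped the way offset-based chunking can skip them.
$query->chunkById(1000, function ($enrollments) {
    foreach ($enrollments as $enrollment) {
        // handle one enrollment from the current batch
    }
});
```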

Step 4: Efficient Queries
• Combined multiple conditions into a single query to minimize database operations and improve query performance.
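
One way to realize this (an illustration, not necessarily the PR's exact query) is to stop issuing one UPDATE per model and instead cover a whole chunk with a single statement:

```php
// One UPDATE statement covering the whole chunk instead of a query per row.
// The 'dropout' status value is assumed for illustration.
Enrollment::whereIn('id', $enrollments->pluck('id'))
    ->update(['status' => 'dropout']);
```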

Step 5: Transactional Operations
• Wrapped updates in a database transaction to ensure atomicity and consistency, committing or rolling back as needed.
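
DB::transaction() commits automatically when the closure returns and rolls back when it throws, which also fixes the never-committed transaction noted in Step 1:

```php
use Illuminate\Support\Facades\DB;

// Commits when the closure returns normally; rolls back if it throws.
DB::transaction(function () use ($enrollments) {
    Enrollment::whereIn('id', $enrollments->pluck('id'))
        ->update(['status' => 'dropout']);   // assumed status value
});
```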

Step 6: Adjusting Chunk Size
• Increased the chunk size from 1000 to 2500 so that larger batches of enrollments are processed per iteration, in the hope of reducing execution time (the benchmarks above show this variant actually ran slower).

Why These Optimizations Are Important:

• Selective Data Loading: Reduces memory usage by loading only necessary fields, minimizing memory overhead.
• Batch Processing with chunkById: Keeps only one batch of rows in memory at a time, preventing memory exhaustion and keeping per-iteration work small.
• Efficient Queries: Reduces the number of database calls, speeding up query execution.
• Transactional Operations: Ensures that all updates are applied together or not at all, maintaining data integrity and consistency.
• Adjusting Memory Limit and Chunk Size: Allows larger batches of enrollments to be processed at once, reducing the number of chunks and iterations, which can shorten execution time.

How These Optimizations Were Achieved:

• Analyzed the current implementation to identify bottlenecks and inefficiencies.
• Implemented selective data loading, batch processing, and efficient queries to reduce memory usage and improve performance.
• Used database transactions to ensure atomicity and data consistency.
• Adjusted the memory limit and chunk size to balance memory usage against execution time. The sketch below pulls these pieces together into one command.
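
Putting the steps together, the optimized command could look roughly like this. It is a sketch under the same assumptions as above (model name, column names, status values, and the six-month criterion are placeholders); only the DropOutEnrollments name, the selected fields, and the chunk size of 1000 come from this PR.

```php
<?php

namespace App\Console\Commands;

use App\Models\Enrollment;           // assumed model class
use Illuminate\Console\Command;
use Illuminate\Support\Facades\DB;

class DropOutEnrollments extends Command
{
    protected $signature = 'enrollments:drop-out';   // assumed signature
    protected $description = 'Mark inactive enrollments as dropped out';

    public function handle(): int
    {
        Enrollment::query()
            ->select(['id', 'course_id', 'student_id'])            // Step 2: selective loading
            ->where('status', 'active')                            // Step 4: assumed criteria,
            ->where('last_activity_at', '<', now()->subMonths(6))  // combined in one query
            ->chunkById(1000, function ($enrollments) {            // Step 3: batch processing
                DB::transaction(function () use ($enrollments) {   // Step 5: atomic per chunk
                    Enrollment::whereIn('id', $enrollments->pluck('id'))
                        ->update(['status' => 'dropout']);         // one UPDATE per chunk
                });
            });

        return self::SUCCESS;
    }
}
```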

I have chosen the first optimization, without the caching layer and without the larger chunk size, because it was the fastest of the three optimized variants while keeping memory at 24.00 MiB.

adhityabpe marked this pull request as ready for review on March 4, 2025 at 19:33