Create secondary indices based on table bean annotations (#3923) #4004

Conversation

breader124
Contributor

Allow creation of secondary indices based on annotations put on beans

Motivation and Context

This change solves the problem described in #3923.

Modifications

I've implemented a mechanism inside the createTable() method of the DefaultDynamoDbTable class that takes the tableSchema field and derives the lists of LSIs and GSIs from the table's schema. The two kinds are distinguished by partition key: an LSI must use the same attribute as the table's primary partition key. If that is the case, the index is mapped to an LSI. Otherwise, it is treated as a GSI and mapped accordingly.
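
The grouping rule described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from this PR: `IndexGrouping`, `IndexDef`, and the method names are hypothetical stand-ins, and the index and attribute names in `main` are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the SDK's classes.
final class IndexGrouping {

    /** One secondary index as it might be read from a table schema. */
    static final class IndexDef {
        final String name;
        final String partitionKey;

        IndexDef(String name, String partitionKey) {
            this.name = name;
            this.partitionKey = partitionKey;
        }
    }

    /** An index counts as local when it shares the table's partition key. */
    static boolean isLocal(String tablePartitionKey, IndexDef index) {
        return tablePartitionKey.equals(index.partitionKey);
    }

    /** Names of the indices that would be mapped to LSIs. */
    static List<String> localIndexNames(String tablePartitionKey, List<IndexDef> indices) {
        return namesWhere(tablePartitionKey, indices, true);
    }

    /** Names of the indices that would be mapped to GSIs. */
    static List<String> globalIndexNames(String tablePartitionKey, List<IndexDef> indices) {
        return namesWhere(tablePartitionKey, indices, false);
    }

    private static List<String> namesWhere(String tablePartitionKey,
                                           List<IndexDef> indices,
                                           boolean local) {
        List<String> names = new ArrayList<>();
        for (IndexDef index : indices) {
            if (isLocal(tablePartitionKey, index) == local) {
                names.add(index.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up example: the table's partition key is "id".
        List<IndexDef> indices = Arrays.asList(
            new IndexDef("byStatus", "id"),   // same partition key as the table
            new IndexDef("byType", "type"));  // different partition key
        System.out.println("LSIs: " + localIndexNames("id", indices));
        System.out.println("GSIs: " + globalIndexNames("id", indices));
    }
}
```

With a table keyed on `id`, the first index lands in the LSI group and the second in the GSI group, which mirrors the partition-key comparison described in the paragraph above.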

During the mapping, no information about projections or provisioned throughput (the latter only applicable to GSIs) is available. I believe that making it possible to specify projections and the desired provisioned throughput at the bean level should be the scope of separate work, so for now I tried to come up with some sensible defaults:

  • for projections I used ProjectionType.ALL; it may incur some additional cost for people who don't need it, but it allows the index to be used for any query
  • for provisioned throughput I used defaults of 20 RCUs and 20 WCUs

Testing

I've added tests to the DefaultDynamoDbTableTest class to make sure that the bugfix actually works. All tests executed while running ./mvnw package passed.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

* detect and group indices present in table schema into LSIs and GSIs
* pass request with indices information appended further
@breader124 breader124 requested a review from a team as a code owner May 13, 2023 15:55
@debora-ito debora-ito added needs-review This issue or PR needs review from the team. and removed needs-review This issue or PR needs review from the team. labels May 15, 2023
@breader124
Contributor Author

I see you've been assigned as a reviewer @dagnir, would you find some time to give this PR a look, please?

@hamburml

I need this :)

@L-Applin L-Applin self-assigned this May 31, 2023
Adrian Chlebosz and others added 2 commits June 3, 2023 00:53
* If there's no information about the billing mode of the new table,
  then it'll be using the PAY_PER_REQUEST one. It means that all
  GSIs related to this table will be doing the same and there's
  no need to hard code any provisioned throughput like it was done
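
The reasoning in this commit message can be illustrated with a small sketch. It assumes, as the message states, that a table with no billing mode information falls back to PAY_PER_REQUEST; `BillingModeDefaults`, its nested enum, and both methods are hypothetical names for illustration and are not SDK types.

```java
// Hypothetical illustration only; not the SDK's BillingMode or CreateTableOperation.
final class BillingModeDefaults {

    enum BillingMode { PROVISIONED, PAY_PER_REQUEST }

    /** Falls back to PAY_PER_REQUEST when no billing mode was given. */
    static BillingMode resolve(BillingMode requested) {
        return requested == null ? BillingMode.PAY_PER_REQUEST : requested;
    }

    /** A GSI only needs explicit throughput when the table is PROVISIONED. */
    static boolean gsiNeedsProvisionedThroughput(BillingMode requested) {
        return resolve(requested) == BillingMode.PROVISIONED;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));
        System.out.println(gsiNeedsProvisionedThroughput(null));
    }
}
```

Under that assumption, hard-coded read and write capacity values on the GSIs are unnecessary whenever the table itself carries no billing mode.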
@L-Applin
Contributor

L-Applin commented Jun 5, 2023

Some unit tests are failing due to the createTable() method, particularly in the BasicCrudTest class.

com.amazonaws.services.dynamodbv2.exceptions.DynamoDBLocalServiceException: GSI list is empty/invalid

Looking...

@L-Applin
Contributor

L-Applin commented Jun 5, 2023

It seems that when no secondary indexes are specified, the create request is incorrect and includes an empty list of secondary indexes:

@DynamoDbBean
public class Simple {

    private UUID id;
    private String status;
    private String type;

    @DynamoDbPartitionKey
    public UUID getId() {
        return id;
    }
    // other getters-setters ...
}

This also happens when the TableSchema is created statically, as in the failing test case:

docMappedtable = enhancedClient.table(tableName,
                                      TableSchema.documentSchemaBuilder()
                                                 .addIndexPartitionKey(TableMetadata.primaryIndexName(),
                                                                       "id",
                                                                       AttributeValueType.S)
                                                 .addIndexSortKey(TableMetadata.primaryIndexName(), "sort", AttributeValueType.S)
                                                 .attributeConverterProviders(defaultProvider())
                                                 .build());
docMappedtable.createTable();

This should be fixed before merging, thanks.

* CreateTableRequest cannot handle empty list of indices of any type. It
  throws exception when given such a list. At the same time, it nicely
  handles the cases when indices lists are null. Make sure then that
  when empty indices list is passed CreateTableOperation, then in the
  CreateTableRequest it's just reflected as null.
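
The normalization this commit message describes (an empty index list is passed on as null) might look roughly like the following. It is a minimal sketch, not the PR's actual code: `IndexListNormalizer` and `nullIfEmpty` are hypothetical names, and strings stand in for index definitions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of the SDK.
final class IndexListNormalizer {

    /** Returns null for a null or empty list, otherwise the list unchanged. */
    static <T> List<T> nullIfEmpty(List<T> indices) {
        return (indices == null || indices.isEmpty()) ? null : indices;
    }

    public static void main(String[] args) {
        // An empty list is collapsed to null before it reaches the request.
        System.out.println(nullIfEmpty(Collections.<String>emptyList()));
        // A non-empty list passes through untouched.
        System.out.println(nullIfEmpty(Arrays.asList("gsi1")));
    }
}
```

Applying such a helper to both the LSI and the GSI lists before building the request would avoid the "GSI list is empty/invalid" failure reported above for tables that define no secondary indexes.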
@breader124
Contributor Author

breader124 commented Jun 8, 2023

I just fixed it. All unit tests for the services-custom/dynamodb-enhanced submodule are now passing

@L-Applin
Contributor

L-Applin commented Jun 9, 2023

I just fixed it. All unit tests for the services-custom/dynamodb-enhanced submodule are now passing

Awesome, I'm re-running test now!

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • 0 Bugs (rating A)
  • 0 Vulnerabilities (rating A)
  • 0 Security Hotspots (rating A)
  • 0 Code Smells (rating A)
  • 97.9% Coverage
  • 0.0% Duplication

@L-Applin
Contributor

All checks pass, approved!

@L-Applin L-Applin merged commit ef2aa90 into aws:master Jun 14, 2023
@debora-ito
Member

@all-contributors please add @breader124 for code.

@allcontributors
Contributor

@debora-ito

I've put up a pull request to add @breader124! 🎉

davidh44 added a commit that referenced this pull request Jun 21, 2023
* Fixed issue with leased connection leaks when threads executing HTTP … (#4066)

* Fixed issue with leased connection leaks when threads executing HTTP connections with Apache HttpClient were interrupted while the connection was in progress.

* Added logic in MakeHttpRequestStage to check and abort request if interrupted

* Add test cases for UrlConnectionHttpClient

* Moved the fix to AfterTransmissionExecutionInterceptorsStage to just close the stream instead of aborting the request in MakeHttpRequestStage

* Removing test cases related to UrlConnectionHttp since adding a dependency in protocol-test for urlConnectionClient causes failures, because it uses the default client everywhere

* Updated after Zoe's comments

* Now it's possible to configure NettyNioAsyncHttpClient for non blocking DNS (#3990)

* Now it's possible to configure NettyNioAsyncHttpClient in order to use a
non blocking DNS resolver.

* Add package mapping for netty-resolver-dns.

---------

Co-authored-by: Matthew Miller <[email protected]>

* Amazon Connect Service Update: This release adds search APIs for Prompts, Quick Connects and Hours of Operations, which can be used to search for those resources within a Connect Instance.

* AWS Certificate Manager Private Certificate Authority Update: Document-only update to refresh CLI documentation for AWS Private CA. No change to the service.

* Release 2.20.83. Updated CHANGELOG.md, README.md and all pom.xml.

* Add "unsafe" AsyncRequestBody constructors for byte[] and ByteBuffers (#3925)

* Update to next snapshot version: 2.20.84-SNAPSHOT

* Use WeakHashMap in IdleConnectionReaper  (#4087)

* Use WeakHashMap in IdleConnectionReaper to not prevent connection manager from getting GC'd

* Checkstyle fix

* Update S3IntegrationTestBase.java (#4079)

* Amazon Rekognition Update: This release adds support for improved accuracy with user vector in Amazon Rekognition Face Search. Adds new APIs: AssociateFaces, CreateUser, DeleteUser, DisassociateFaces, ListUsers, SearchUsers, SearchUsersByImage. Also adds new face metadata that can be stored: user vector.

* Amazon DynamoDB Update: Documentation updates for DynamoDB

* Amazon FSx Update: Amazon FSx for NetApp ONTAP now supports joining a storage virtual machine (SVM) to Active Directory after the SVM has been created.

* Amazon SageMaker Service Update: Sagemaker Neo now supports compilation for inferentia2 (ML_INF2) and Trainium1 (ML_TRN1) as available targets. With these devices, you can run your workloads at highest performance with lowest cost. inferentia2 (ML_INF2) is available in CMH and Trainium1 (ML_TRN1) is available in IAD currently

* AWS Amplify UI Builder Update: AWS Amplify UIBuilder is launching Codegen UI, a new feature that enables you to generate your amplify uibuilder components and forms.

* Amazon OpenSearch Service Update: This release adds support for SkipUnavailable connection property for cross cluster search

* Amazon DynamoDB Streams Update: Documentation updates for DynamoDB Streams

* Updated endpoints.json and partitions.json.

* Release 2.20.84. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.85-SNAPSHOT

* docs: add scrocquesel as a contributor for code (#4091)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

---------

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Debora N. Ito <[email protected]>

* AWS CloudTrail Update: This feature allows users to view dashboards for CloudTrail Lake event data stores.

* AWS WAFV2 Update: You can now detect and block fraudulent account creation attempts with the new AWS WAF Fraud Control account creation fraud prevention (ACFP) managed rule group AWSManagedRulesACFPRuleSet.

* AWS Well-Architected Tool Update: AWS Well-Architected now supports Profiles that help customers prioritize which questions to focus on first by providing a list of prioritized questions that are better aligned with their business goals and outcomes.

* Amazon Lightsail Update: This release adds pagination for the Get Certificates API operation.

* Amazon Verified Permissions Update: GA release of Amazon Verified Permissions.

* EC2 Image Builder Update: Change the Image Builder ImagePipeline dateNextRun field to more accurately describe the data.

* Amazon CodeGuru Security Update: Initial release of Amazon CodeGuru Security APIs

* Amazon Simple Storage Service Update: Integrate double encryption feature to SDKs.

* Elastic Disaster Recovery Service Update: Added APIs to support network replication and recovery using AWS Elastic Disaster Recovery.

* AWS SimSpace Weaver Update: This release fixes using aws-us-gov ARNs in API calls and adds documentation for snapshot APIs.

* AWS SecurityHub Update: Add support for Security Hub Automation Rules

* Amazon Elastic Compute Cloud Update: This release introduces a new feature, EC2 Instance Connect Endpoint, that enables you to connect to a resource over TCP, without requiring the resource to have a public IPv4 address.

* Updated endpoints.json and partitions.json.

* Release 2.20.85. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.86-SNAPSHOT

* Create secondary indices based on table bean annotations (#3923) (#4004)

* Create secondary indices based on table bean annotations (#3923)

* detect and group indices present in table schema into LSIs and GSIs
* pass request with indices information appended further

* Remove specifying provisioned throughput for GSIs (#3923)

* If there's no information about the billing mode of the new table,
  then it'll be using the PAY_PER_REQUEST one. It means that all
  GSIs related to this table will be doing the same and there's
  no need to hard code any provisioned throughput like it was done

* Allow passing empty indices list to CreateTableOperation (#3923)

* CreateTableRequest cannot handle empty list of indices of any type. It
  throws exception when given such a list. At the same time, it nicely
  handles the cases when indices lists are null. Make sure then that
  when empty indices list is passed CreateTableOperation, then in the
  CreateTableRequest it's just reflected as null.

---------

Co-authored-by: Adrian Chlebosz <[email protected]>
Co-authored-by: Olivier L Applin <[email protected]>

* Add EnhancedType parameters to static builder methods of StaticTableSchema and StaticImmutableTableSchema (#4077)

* Amazon Elastic File System Update: Documentation updates for EFS.

* Amazon GuardDuty Update: Updated descriptions for some APIs.

* Amazon Location Service Update: Amazon Location Service adds categories to places, including filtering on those categories in searches. Also, you can now add metadata properties to your geofences.

* AWS Audit Manager Update: This release introduces 2 Audit Manager features: CSV exports and new manual evidence options. You can now export your evidence finder results in CSV format. In addition, you can now add manual evidence to a control by entering free-form text or uploading a file from your browser.

* Updated endpoints.json and partitions.json.

* Release 2.20.86. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.87-SNAPSHOT

* EnumAttributeConverter: enums can be identified by toString() or name(). toString() is the default for backward compatibility (#3971)

Co-authored-by: Zoe Wang <[email protected]>

* AWS Application Discovery Service Update: Add Amazon EC2 instance recommendations export

* AWS Account Update: Improve pagination support for ListRegions

* Amazon Simple Storage Service Update: This release adds SDK support for request-payer request header and request-charged response header in the "GetBucketAccelerateConfiguration", "ListMultipartUploads", "ListObjects", "ListObjectsV2" and "ListObjectVersions" S3 APIs.

* Amazon Connect Service Update: Updates the *InstanceStorageConfig APIs to support a new ResourceType: SCREEN_RECORDINGS to enable screen recording and specify the storage configurations for publishing the recordings. Also updates DescribeInstance and ListInstances APIs to include InstanceAccessUrl attribute in the API response.

* AWS Identity and Access Management Update: Documentation updates for AWS Identity and Access Management (IAM).

* Release 2.20.87. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.88-SNAPSHOT

* Fix the StackOverflowException in WaiterExecutor in case of large retries count. (#3956)

* Move checksum calculation from afterMarshalling to modifyHttpRequest (#4108)

* Update HttpChecksumRequiredInterceptor

* Update HttpChecksumInHeaderInterceptor

* Update tests and remove constant

* Add back constant to resolve japicmp

* Add back javadocs

* docs: add dave-fn as a contributor for code (#4092)

* docs: update README.md [skip ci]

* docs: update .all-contributorsrc [skip ci]

* Removing unnecessary vscode file

---------

Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Debora N. Ito <[email protected]>

* Amazon Route 53 Domains Update: Update MaxItems upper bound to 1000 for ListPricesRequest

* Amazon EC2 Container Service Update: Documentation only update to address various tickets.

* AWS CloudFormation Update: Specify desired CloudFormation behavior in the event of ChangeSet execution failure using the CreateChangeSet OnStackFailure parameter

* AWS Price List Service Update: This release updates the PriceListArn regex pattern.

* AWS Glue Update: This release adds support for creating cross region table/database resource links

* Amazon Elastic Compute Cloud Update: API changes to AWS Verified Access to include data from trust providers in logs

* Amazon SageMaker Service Update: Amazon Sagemaker Autopilot releases CreateAutoMLJobV2 and DescribeAutoMLJobV2 for Autopilot customers with ImageClassification, TextClassification and Tabular problem type config support.

* Release 2.20.88. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.89-SNAPSHOT

* AWS Lambda Update: This release adds RecursiveInvocationException to the Invoke API and InvokeWithResponseStream API.

* AWS Config Update: Updated ResourceType enum with new resource types onboarded by AWS Config in May 2023.

* Amazon Appflow Update: This release adds new API to reset connector metadata cache

* Amazon Elastic Compute Cloud Update: Adds support for targeting Dedicated Host allocations by assetIds in AWS Outposts

* Amazon Redshift Update: Added support for custom domain names for Redshift Provisioned clusters. This feature enables customers to create a custom domain name and use ACM to generate fully secure connections to it.

* Updated endpoints.json and partitions.json.

* Release 2.20.89. Updated CHANGELOG.md, README.md and all pom.xml.

* Update to next snapshot version: 2.20.90-SNAPSHOT

* Move QueryParametersToBodyInterceptor to front of interceptor chain (#4109)

* Move QueryParametersToBodyInterceptor to front of interceptor chain

* Move customization.config interceptors to front of interceptor chain - for query protocols

* Refactoring

* Add codegen tests

* Refactoring

* Refactoring

---------

Co-authored-by: John Viegas <[email protected]>
Co-authored-by: Martin <[email protected]>
Co-authored-by: Matthew Miller <[email protected]>
Co-authored-by: AWS <>
Co-authored-by: aws-sdk-java-automation <[email protected]>
Co-authored-by: Stephen Flavin <[email protected]>
Co-authored-by: Zoe Wang <[email protected]>
Co-authored-by: allcontributors[bot] <46447321+allcontributors[bot]@users.noreply.github.com>
Co-authored-by: Debora N. Ito <[email protected]>
Co-authored-by: Adrian Chlebosz <[email protected]>
Co-authored-by: Adrian Chlebosz <[email protected]>
Co-authored-by: Olivier L Applin <[email protected]>
Co-authored-by: Benjamin Maizels <[email protected]>
Co-authored-by: flitt <[email protected]>
L-Applin added a commit that referenced this pull request Jul 24, 2023