Skip to content

Additional group discovery and creation #342

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

adorton-adobe
Copy link
Collaborator

@adorton-adobe adorton-adobe commented Feb 28, 2018

This feature adds mechanism for the sync tool to discover additional, non-mapped groups and add them to sync. It also automatic group creation.

Configuration details can be found in 1 user-sync-config.yml. Additional information can be found in the "advanced configuration" manual page.

It introduces a new config option that specifies a set of one or more rules to identify and rename groups from this query to target for sync. These groups are checked in the memberOf user attribute. Rules to identify and rename these extra groups are defined in user-sync-config.yml.

  additional_groups:
    - source: "ACL-(.+)"
      target: "ACL-Grp-(\1)"
    - source: "(.+)-ACL"
      target: "ACL-Grp-(\1)"

Groups can also optionally be auto-created. If auto-create is enabled, then any Adobe group targeted through the additional group mechanism, group mapping or extension config will be created if the following conditions are met.

  • Group is targeted for at least one user
  • Group does not already exist

The auto-create boolean is defined in user-sync-config.yml.

  group_sync_options:
    auto_create: False

Notes:

  • Auto-deletion of groups is not supported at this time
  • If all users are removed from an "additional group", the sync tool will not know that it needs to remove users from its corresponding Adobe group (the group must be targeted for at least one user)
  • The proposed "additional groups" LDAP query was removed for performance reason and because it made the overall group query buggy

Fixes #339

@adorton-adobe adorton-adobe force-pushed the feature/group-creation branch from f8528d9 to 31105c3 Compare August 14, 2018 21:48
Copy link
Contributor

@phil-levy phil-levy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great stuff, but I strongly suggest you write some user documentation about how it would be used (which forces you through the process of explaining it to customers). You have some good basic mechanisms but I am not sure how understandable they would be and what interactions these new features would have with existing features. Going through what problems you are solving and how to use these features to solve them would be helpful.

@adorton-adobe
Copy link
Collaborator Author

There's already documentation covering the new features (in en/user-manual/advanced_usage.md). I agree that it needs some additional information and perhaps some use case examples. I reviewed it today because I needed a refresher on how the new config works, and I found some of it ambiguous and confusing.

Before I update the docs, I need to remove the "direct membership" query. The extra per-user LDAP query adds a huge amount of time to the LDAP user query process, and it's causing issues with getting a full list of users. On my two test AD systems, the server returns a null pagination cursor after the LDAP connector retrieves two pages from the directory. It only does this when the additional query is made to get each user's groups. Instead, I'm going to append "memberOf" to the attribute list and get the groups from there. It's less configurable, but it will work for our primary use case for now.

@adorton-adobe
Copy link
Collaborator Author

@phil-levy I've made some improvements to the documentation - see them here: https://github.com/adorton-adobe/user-sync.py/blob/feature/group-creation/docs/en/user-manual/advanced_configuration.md#additional-group-options

Will you please take a look and let me know if it needs more work?

Also - have you reviewed the code changes? If so, have you found any issues or areas of concern?

adobeDan and others added 26 commits August 29, 2018 18:55
We need an issue to hold the spec for this.
@adorton-adobe adorton-adobe force-pushed the feature/group-creation branch from 479a168 to cb40834 Compare August 30, 2018 00:56
@adorton-adobe adorton-adobe merged commit a85b517 into adobe-apiplatform:v2 Aug 31, 2018
Copy link
Contributor

@adobeDan adobeDan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @adorton-adobe, sorry I didn't get this reviewed before you merged it, but I do have some suggestions so hopefully these will be helpful in getting to RC2.

In general this looks really clean and very consistent with how the tool works already. There are just a few areas where I suggest you clean things up.

Nice work.

@@ -457,6 +458,15 @@ def get_rule_options(self):
default_country_code = directory_config.get_string('default_country_code', True)
if default_country_code:
options['default_country_code'] = default_country_code
additional_groups = directory_config.get_list('additional_groups', True) or []
additional_groups = [{'source': re.compile(r['source']), 'target': r['target']} for r in additional_groups]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re.compile can throw (e.g., syntax_error). you should catch the error and exit gracefully.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

if sync_options:
options['auto_create'] = sync_options.get_bool('auto_create', True)
if not new_account_type:
new_account_type = user_sync.identity_type.ENTERPRISE_IDENTITY_TYPE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really unclear. If you are trying to make sure there's a default new_account_type in the absence of a directory_config section then just initialize new_account_type to user_sync.identity_type.ENTERPRISE_IDENTITY_TYPE instead of None at line 447.

But I also think you may be guarding against a case that doesn't exist? Isn't the directory configuration section actually required as part of the config? So the if at the front of this entire section (if directory_config) is probably not needed either.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is going on here either. I'm responsible for this change, but I don't remember the rationale behind it. I agree that the new_account_type check isn't needed, so I'll remove it.

I'll also see if the directory_config check is necessary. The only reason that variable would be None is if directory_users is omitted from the config file. I would think that the config handler should check for that already so we shouldn't need to double check it here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm keeping the directory_config check because it's a convenient place to raise an exception if directory_users isn't specified. Otherwise, the error bubbles up from DictConfig, which will throw an error when the first expected directory config key isn't found. Raising the exception here will make it more clear that directory_users is missing. (while I'm at it, I'll do the same thing for adobe_users)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a fine approach.

@@ -289,6 +292,18 @@ def iter_users(self, users_filter, extended_attributes):
elif last_attribute_name:
self.logger.warning('No country code attribute (%s) for user with dn: %s', last_attribute_name, dn)

uid_value = LDAPValueFormatter.get_attribute_value(record, six.text_type('uid'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this for? Is it for use in the extension stuff or something like that? A comment is needed here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appears to be a remnant of the original design, which had the LDAP connector make an additional LDAP query to get direct membership groups.

See 73592db

I don't think we need this anymore since we're now using memberOf to get direct groups.

except umapi_client.UnavailableError as e:
raise AssertionException("Error contacting UMAPI server: %s" % e)

def create_group(self, name):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should allow passing in the source group name so that you can use it in the comment: "Created to match directory group ... by User Sync Tool"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would make sense to do that here. Auto-creation is designed to be a separate feature from the additional group discovery/mapping feature. It can be enabled independently of it and be used solely to auto-create Adobe groups targeted in the group mapping and/or the extension config. (conversely, the "additional groups" option can be enabled without group auto-creation)

It would probably make more sense to log the "additional group" mapping somewhere in the additional group resolution workflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So then make the source group name be an optional argument and don't use it if it's not passed. The comment you're currently using is fairly useless in the auto-mapping case and would be a lot better if it had the source group.

self.logger.info("Auto create user-group enabled: Creating %s" % mapped_group)
try:
# create group
res = umapi_connector.create_group(mapped_group)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above: you should pass the source group name in to create_group so the group description can say what source group it comes from.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and mine :)

Copy link
Contributor

@adobeDan adobeDan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @adorton-adobe, one more -- most important of all -- item: I don't know why this didn't make the first review.

if not self.push_umapi:
umapi_info, umapi_connector = self.get_umapi_info(
PRIMARY_UMAPI_NAME), umapi_connectors.get_primary_connector()
mapped_groups = umapi_info.get_non_normalize_mapped_groups()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, somehow this comment got lost from my review.

I think what you are trying to do with respecting LDAP group case and "non-normalized" groups is going to get clients into trouble, and it exacerbates a possible case that you should be checking for currently but aren't.

Because of the fact that you're doing case-sensitive matching, but more generally because of the way search/replace works, it's possible for more than one source group to map, via the same or different rules, to the same target group, and it's also possible for there to be two target groups that differ only in case. That's going to be really bad for clients, who almost certainly don't want that to happen.

So I think you need to build logic in this function that catches this issue, so that if two different source groups map into the same target group you throw an error. (If it turns out that a customer really wants to do this, then they can accomplish it by using | patterns in their source expressions.) And I think that target-group mapping needs to be case-insensitive (actually normalized), so that you are always mapping source groups to their normalized group name, and the collision detection happens after normalization.

Finally, as noted elsewhere, I think you should annotate each target group with the name of the source group it was created from. And I think you should really create each group in normalized form, but of course if you feel the target case is important you can create them as they were mapped. But don't keep track of them that way, just keep track with each group of the case to be used to create it.

All of this normalization and error checking should happen after all the source groups are read but before any of the creations happen.

Copy link
Collaborator Author

@adorton-adobe adorton-adobe Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - I will implement normalization and collision checking. I'm planning on refactoring additional group resolution so that it can work with other connector types (by moving it to rules.py or maybe the parent directory connector class). After that is done, I'll address normalization and collision detection.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adobeDan To implement collision checking, I created a data structure to keep track of the mapped additional target groups and the source groups from which they were derived. If any target groups were derived from more than one source group then it throws an exception.

With this approach, I don't think it is possible to get around it with an or | pattern in the regular expression. I can't think of any alternative approach that would allow it and still do accurate collision checking.

I don't think that'd a problem, but I wanted to make you aware.

@adorton-adobe adorton-adobe deleted the feature/group-creation branch September 4, 2018 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants