-
Notifications
You must be signed in to change notification settings - Fork 66
Additional group discovery and creation #342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional group discovery and creation #342
Conversation
f8528d9
to
31105c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great stuff, but I strongly suggest you write some user documentation about how it would be used (which forces you through the process of explaining it to customers). You have some good basic mechanisms but I am not sure how understandable they would be and what interactions these new features would have with existing features. Going through what problems you are solving and how to use these features to solve them would be helpful.
There's already documentation covering the new features (in en/user-manual/advanced_usage.md). I agree that it needs some additional information and perhaps some use case examples. I reviewed it today because I needed a refresher on how the new config works, and I found some of it ambiguous and confusing. Before I update the docs, I need to remove the "direct membership" query. The extra per-user LDAP query adds a huge amount of time to the LDAP user query process, and it's causing issues with getting a full list of users. On my two test AD systems, the server returns a null pagination cursor after the LDAP connector retrieves two pages from the directory. It only does this when the additional query is made to get each user's groups. Instead, I'm going to append "memberOf" to the attribute list and get the groups from there. It's less configurable, but it will work for our primary use case for now. |
@phil-levy I've made some improvements to the documentation - see them here: https://github.com/adorton-adobe/user-sync.py/blob/feature/group-creation/docs/en/user-manual/advanced_configuration.md#additional-group-options Will you please take a look and let me know if it needs more work? Also - have you reviewed the code changes? If so, have you found any issues or areas of concern? |
We need an issue to hold the spec for this.
479a168
to
cb40834
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @adorton-adobe, sorry I didn't get this reviewed before you merged it, but I do have some suggestions so hopefully these will be helpful in getting to RC2.
In general this looks really clean and very consistent with how the tool works already. There are just a few areas where I suggest you clean things up.
Nice work.
@@ -457,6 +458,15 @@ def get_rule_options(self): | |||
default_country_code = directory_config.get_string('default_country_code', True) | |||
if default_country_code: | |||
options['default_country_code'] = default_country_code | |||
additional_groups = directory_config.get_list('additional_groups', True) or [] | |||
additional_groups = [{'source': re.compile(r['source']), 'target': r['target']} for r in additional_groups] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re.compile can throw (e.g., syntax_error
). you should catch the error and exit gracefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
if sync_options: | ||
options['auto_create'] = sync_options.get_bool('auto_create', True) | ||
if not new_account_type: | ||
new_account_type = user_sync.identity_type.ENTERPRISE_IDENTITY_TYPE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really unclear. If you are trying to make sure there's a default new_account_type in the absence of a directory_config
section then just initialize new_account_type
to user_sync.identity_type.ENTERPRISE_IDENTITY_TYPE
instead of None at line 447.
But I also think you may be guarding against a case that doesn't exist? Isn't the directory configuration section actually required as part of the config? So the if at the front of this entire section (if directory_config
) is probably not needed either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what is going on here either. I'm responsible for this change, but I don't remember the rationale behind it. I agree that the new_account_type check isn't needed, so I'll remove it.
I'll also see if the directory_config check is necessary. The only reason that variable would be None is if directory_users
is omitted from the config file. I would think that the config handler should check for that already so we shouldn't need to double check it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm keeping the directory_config
check because it's a convenient place to raise an exception if directory_users
isn't specified. Otherwise, the error bubbles up from DictConfig
, which will throw an error when the first expected directory config key isn't found. Raising the exception here will make it more clear that directory_users
is missing. (while I'm at it, I'll do the same thing for adobe_users
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like a fine approach.
@@ -289,6 +292,18 @@ def iter_users(self, users_filter, extended_attributes): | |||
elif last_attribute_name: | |||
self.logger.warning('No country code attribute (%s) for user with dn: %s', last_attribute_name, dn) | |||
|
|||
uid_value = LDAPValueFormatter.get_attribute_value(record, six.text_type('uid')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this for? Is it for use in the extension stuff or something like that? A comment is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appears to be a remnant of the original design, which had the LDAP connector make an additional LDAP query to get direct membership groups.
See 73592db
I don't think we need this anymore since we're now using memberOf
to get direct groups.
except umapi_client.UnavailableError as e: | ||
raise AssertionException("Error contacting UMAPI server: %s" % e) | ||
|
||
def create_group(self, name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should allow passing in the source group name so that you can use it in the comment: "Created to match directory group ... by User Sync Tool"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it would make sense to do that here. Auto-creation is designed to be a separate feature from the additional group discovery/mapping feature. It can be enabled independently of it and be used solely to auto-create Adobe groups targeted in the group mapping and/or the extension config. (conversely, the "additional groups" option can be enabled without group auto-creation)
It would probably make more sense to log the "additional group" mapping somewhere in the additional group resolution workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So then make the source group name be an optional argument and don't use it if it's not passed. The comment you're currently using is fairly useless in the auto-mapping case and would be a lot better if it had the source group.
self.logger.info("Auto create user-group enabled: Creating %s" % mapped_group) | ||
try: | ||
# create group | ||
res = umapi_connector.create_group(mapped_group) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment above: you should pass the source group name in to create_group so the group description can say what source group it comes from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and mine :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @adorton-adobe, one more -- most important of all -- item: I don't know why this didn't make the first review.
if not self.push_umapi: | ||
umapi_info, umapi_connector = self.get_umapi_info( | ||
PRIMARY_UMAPI_NAME), umapi_connectors.get_primary_connector() | ||
mapped_groups = umapi_info.get_non_normalize_mapped_groups() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, somehow this comment got lost from my review.
I think what you are trying to do with respecting LDAP group case and "non-normalized" groups is going to get clients into trouble, and it exacerbates a possible case that you should be checking for currently but aren't.
Because of the fact that you're doing case-sensitive matching, but more generally because of the way search/replace works, it's possible for more than one source group to map, via the same or different rules, to the same target group, and it's also possible for there to be two target groups that differ only in case. That's going to be really bad for clients, who almost certainly don't want that to happen.
So I think you need to build logic in this function that catches this issue, so that if two different source groups map into the same target group you throw an error. (If it turns out that a customer really wants to do this, then they can accomplish it by using |
patterns in their source expressions.) And I think that target-group mapping needs to be case-insensitive (actually normalized), so that you are always mapping source groups to their normalized group name, and the collision detection happens after normalization.
Finally, as noted elsewhere, I think you should annotate each target group with the name of the source group it was created from. And I think you should really create each group in normalized form, but of course if you feel the target case is important you can create them as they were mapped. But don't keep track of them that way, just keep track with each group of the case to be used to create it.
All of this normalization and error checking should happen after all the source groups are read but before any of the creations happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - I will implement normalization and collision checking. I'm planning on refactoring additional group resolution so that it can work with other connector types (by moving it to rules.py or maybe the parent directory connector class). After that is done, I'll address normalization and collision detection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adobeDan To implement collision checking, I created a data structure to keep track of the mapped additional target groups and the source groups from which they were derived. If any target groups were derived from more than one source group then it throws an exception.
With this approach, I don't think it is possible to get around it with an or |
pattern in the regular expression. I can't think of any alternative approach that would allow it and still do accurate collision checking.
I don't think that'd a problem, but I wanted to make you aware.
This feature adds mechanism for the sync tool to discover additional, non-mapped groups and add them to sync. It also automatic group creation.
Configuration details can be found in
1 user-sync-config.yml
. Additional information can be found in the "advanced configuration" manual page.It introduces a new config option that specifies a set of one or more rules to identify and rename groups from this query to target for sync. These groups are checked in the
memberOf
user attribute. Rules to identify and rename these extra groups are defined inuser-sync-config.yml
.Groups can also optionally be auto-created. If auto-create is enabled, then any Adobe group targeted through the additional group mechanism, group mapping or extension config will be created if the following conditions are met.
The auto-create boolean is defined in
user-sync-config.yml
.Notes:
Fixes #339