MSC2197: Search Filtering in Federation /publicRooms #2197
Merged: turt2live merged 8 commits into matrix-org:master from reivilibre:rei/msc_filter_over_fed on Aug 20, 2019
Commits
4c22eb8 MSC for Search Filtering in Federation /publicRooms (reivilibre)
36e43ee Rewrap lines in MSC2917 to 80 chars wide (reivilibre)
493bb06 MSC2197: update with privacy perspective (reivilibre)
60cbc45 Addresses some of Andrew's comments (reivilibre)
97f856d Domain name is potentially personally-identifying (reivilibre)
7e85b9d Acknowledge other potential error responses for fallback (reivilibre)
4219e27 Drop the hard SHOULD (reivilibre)
76f9196 Address @richvdh's comments (reivilibre)
156 changes: 156 additions & 0 deletions
proposals/2197-search_filter_in_federation_publicrooms.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,156 @@ | ||
| # MSC2197 – Search Filtering in Public Room Directory over Federation | ||
|
|
||
| This MSC proposes introducing the `POST` method to the `/publicRooms` Federation | ||
| API endpoint, including a `filter` argument which allows server-side filtering | ||
| of rooms. | ||
|
|
||
| We are motivated by the opportunity to make searching the public Room Directory | ||
| more efficient over Federation. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Although the Client-Server API includes the filtering capability in | ||
| `/publicRooms`, the Federation API currently does not. | ||
|
|
||
| This leads to a situation that is wasteful of effort and network traffic for | ||
| both homeservers; searching a remote server involves first downloading its | ||
| entire room list and then filtering afterwards. | ||
|
|
||
| ## Proposal | ||
|
|
||
| Having a filtered `/publicRooms` API endpoint means that irrelevant or | ||
| uninteresting rooms can be excluded from a room directory query response. | ||
| In turn, this means that these room directory query responses can be generated | ||
| more quickly and then, due to their smaller size, transmitted over the network | ||
| more quickly. | ||
|
|
||
| These benefits have been exploited in the Client-Server API, which implements | ||
| search filtering using the `filter` JSON body parameter in the `POST` method on | ||
| the `/publicRooms` endpoint. | ||
|
|
||
| Ignoring the `server` parameter in the Client-Server API, the following specific | ||
| differences exist between the Client-Server and Federation APIs' | ||
| `/publicRooms` endpoints: | ||
|
|
||
| * the Federation API endpoint only accepts the `GET` method whereas the | ||
| Client-Server API accepts the `POST` method as well. | ||
| * the Federation API accepts `third_party_instance_id` and | ||
| `include_all_networks` parameters through the `GET` method, whereas the | ||
| Client-Server API only features these in the `POST` method. | ||
|
|
||
| This MSC proposes to introduce support for the `POST` method in the Federation | ||
| API's `/publicRooms` endpoint, with all but one of the parameters from that of | ||
|
||
| the Client-Server API. | ||
|
|
||
| The parameter that is intentionally omitted is the `server` query parameter, as | ||
| it does not make sense to include it – the requesting homeserver could make a | ||
| direct request instead of requesting that a request be relayed. | ||
|
|
||
| The parameters which are copied, however, shall have the same semantics as | ||
| they do in the Client-Server API. | ||
|
|
||
| In the interest of clarity, the proposed parameter set is listed below, along | ||
| with a repetition of the definitions of used substructures. The response format | ||
| has been omitted as it is the same as that of the current Client-Server and | ||
| Federation APIs, which do not differ in this respect. | ||
|
|
||
| ### `POST /_matrix/federation/v1/publicRooms` | ||
|
|
||
| #### Query Parameters | ||
|
|
||
| There are no query parameters. Notably, we intentionally do not inherit the | ||
| `server` query parameter from the Client-Server API. | ||
|
||
|
|
||
| #### JSON Body Parameters | ||
|
|
||
| * `limit` (`integer`): Limit the number of search results returned. | ||
| * `since` (`string`): A pagination token from a previous request, allowing | ||
| clients to get the next (or previous) batch of rooms. The direction of | ||
|
Member: 🤔 why does the API let us paginate in both directions?
Contributor (Author): Don't ask me! It's how the C-S API is, however… |
||
| pagination is specified solely by which token is supplied, rather than via an | ||
| explicit flag. | ||
| * `filter` (`Filter`): Filter to apply to the results. | ||
| * `include_all_networks` (`boolean`): Whether or not to include all known | ||
| networks/protocols from application services on the homeserver. | ||
| Defaults to false. | ||
| * `third_party_instance_id` (`string`): The specific third party | ||
| network/protocol to request from the homeserver. | ||
| Can only be used if `include_all_networks` is false. | ||
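As a rough illustration of how the `limit`, `since` and `filter` parameters above might be used together, the following Python sketch pages through a remote directory. It is only a sketch under stated assumptions: `post` is a hypothetical callable standing in for a signed federation request, and the `chunk` and `next_batch` response fields are taken from the existing `/publicRooms` response format rather than defined by this proposal.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical transport: takes the JSON body of a POST to
# /_matrix/federation/v1/publicRooms and returns the decoded JSON response.
PostFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_public_rooms(
    post: PostFn, limit: int = 50, search_term: Optional[str] = None
) -> Iterator[Dict[str, Any]]:
    """Yield rooms one page at a time, following `next_batch` tokens."""
    since: Optional[str] = None
    while True:
        body: Dict[str, Any] = {"limit": limit}
        if since is not None:
            # The token alone determines where (and in which direction) we page.
            body["since"] = since
        if search_term is not None:
            body["filter"] = {"generic_search_term": search_term}
        response = post(body)
        yield from response.get("chunk", [])
        since = response.get("next_batch")
        if not since:
            break  # No further pages were advertised.
```

Because the transport is injected, the same loop can be exercised against a stub in tests before being wired to a real federation client.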
|
|
||
| ### `Filter` Parameters | ||
|
|
||
| * `generic_search_term` (`string`): A string to search for in the room metadata, | ||
| e.g. name, topic, canonical alias etc. (Optional). | ||
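To make the parameter set concrete, here is a minimal Python sketch that assembles a request body from the parameters listed above. The function name `build_public_rooms_body` is hypothetical, `third_party_instance_id` is treated as a string as it is in the Client-Server API, and the only rule enforced is the stated one that it cannot be combined with `include_all_networks`.

```python
import json
from typing import Any, Dict, Optional


def build_public_rooms_body(
    limit: Optional[int] = None,
    since: Optional[str] = None,
    search_term: Optional[str] = None,
    include_all_networks: bool = False,
    third_party_instance_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body, omitting parameters that were not supplied."""
    if include_all_networks and third_party_instance_id is not None:
        raise ValueError(
            "third_party_instance_id can only be used if "
            "include_all_networks is false"
        )
    body: Dict[str, Any] = {}
    if limit is not None:
        body["limit"] = limit
    if since is not None:
        body["since"] = since
    if search_term is not None:
        # `generic_search_term` is the only Filter field listed above.
        body["filter"] = {"generic_search_term": search_term}
    if include_all_networks:
        body["include_all_networks"] = True
    if third_party_instance_id is not None:
        body["third_party_instance_id"] = third_party_instance_id
    return body


if __name__ == "__main__":
    print(json.dumps(build_public_rooms_body(limit=20, search_term="matrix"), indent=2))
```

Omitting unset parameters keeps the body small and leaves defaults, such as `include_all_networks` being false, to the remote homeserver.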
|
|
||
| ## Tradeoffs | ||
|
|
||
| An alternative approach might be for implementations to carry on as they are but | ||
| also cache (and potentially index) remote homeservers' room directories. | ||
| This would not require a spec change. | ||
|
|
||
| However, this would be unsatisfactory because it would lead to outdated room | ||
| directory results and/or caches that provide no benefit (as room directory | ||
| searches are generally infrequent enough that a cache would be outdated before | ||
| being reused, on small – if not most – homeservers). | ||
|
|
||
| ## Potential issues | ||
|
|
||
| ### Backwards Compatibility | ||
|
|
||
| After this proposal is implemented, outdated homeservers will still exist which | ||
| do not support the room filtering functionality specified in this MSC. In this | ||
| case, homeservers will have to fall back to downloading the entire room | ||
| directory and performing the filtering themselves, as currently happens. | ||
| This is not considered a problem since it will not lead to a situation that is | ||
| any worse than the current one, and it is expected that large homeservers | ||
| – which cause the most work with the current search implementations – | ||
| would be quick to upgrade to support this feature once it is available. | ||
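The local filtering that a requesting homeserver performs in this fallback case could look roughly like the Python sketch below. The matching rule used here (case-insensitive substring match on `name`, `topic` and `canonical_alias`) is an assumption made for illustration; the proposal does not define how `generic_search_term` is matched, and the function names are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def matches_search_term(room: Dict[str, Any], term: str) -> bool:
    """Return True if the term occurs in any of the room's metadata fields."""
    needle = term.casefold()
    fields = (room.get("name"), room.get("topic"), room.get("canonical_alias"))
    # Skip fields that are missing or not strings.
    return any(needle in field.casefold() for field in fields if isinstance(field, str))


def filter_rooms_locally(rooms: Iterable[Dict[str, Any]], term: str) -> List[Dict[str, Any]]:
    """Filter an already-downloaded room list; an empty term keeps every room."""
    if not term:
        return list(rooms)
    return [room for room in rooms if matches_search_term(room, term)]
```

Note that this runs only after the whole directory has been transferred, which is exactly the cost the `filter` parameter is meant to avoid.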
|
|
||
| In addition, because the `POST` method was not previously accepted on the | ||
| `/publicRooms` endpoint over federation, it is possible to fall back to the | ||
| old behaviour if one of the following errors is encountered: | ||
|
|
||
| - an `M_UNRECOGNIZED` standard error response `errcode` (this is what would be | ||
|
||
| typically expected in this situation) | ||
| - an `M_NOT_FOUND` standard error response | ||
|
||
| - a `404 Not Found` HTTP error response | ||
|
||
| - a `405 Method Not Allowed` HTTP error response | ||
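The fallback conditions listed above can be expressed as a small predicate. This is a sketch only: the function and constant names are hypothetical, and it assumes the caller has already extracted the HTTP status and, where the response body was a standard error, its `errcode`.

```python
from typing import Optional

# Error indications that suggest the remote homeserver does not support POST.
FALLBACK_ERRCODES = frozenset({"M_UNRECOGNIZED", "M_NOT_FOUND"})
FALLBACK_STATUSES = frozenset({404, 405})


def should_fall_back_to_get(status: int, errcode: Optional[str]) -> bool:
    """Decide whether to retry with the unfiltered GET /publicRooms request."""
    return status in FALLBACK_STATUSES or errcode in FALLBACK_ERRCODES
```

Any other failure, for example a `403` with `M_FORBIDDEN`, is left to the caller's normal error handling rather than triggering a retry.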
|
|
||
| ## Security considerations | ||
|
|
||
| There are no known security considerations. | ||
|
|
||
| ## Privacy considerations | ||
|
|
||
| At present, remote homeservers do not learn what a user has searched for. | ||
|
||
|
|
||
| However, under this proposal, in the context of using the Federation API to | ||
| forward on queries from the Client-Server API, a client's homeserver would end | ||
| up sharing the client's search terms with a remote homeserver, which may not be | ||
| operated by the same party or even trusted. For example, users' search terms | ||
| could be logged. | ||
|
|
||
| The privacy implications of this proposal are not overly major, as the data | ||
| that's being shared is [\[1\]][1]: | ||
|
|
||
| - only covered by GDPR if: | ||
| - the search terms contain personal data, or | ||
| - the user's homeserver IP address or domain name is uniquely identifying | ||
| (because it's a single-person homeserver, perhaps) | ||
| - likely to be *expected* to be shared with the remote homeserver | ||
|
|
||
| [1]: https://github.com/matrix-org/matrix-doc/pull/2197#issuecomment-517641751 | ||
|
|
||
| For the sake of clarity, clients are strongly encouraged to display a warning | ||
| that a remote search will take the user's data outside the jurisdiction of their | ||
| own homeserver, before using the `server` parameter of the Client-Server API | ||
| `/publicRooms`, as it can be assumed that this will lead to the server invoking | ||
| the Federation API's `/publicRooms` – on the specified remote server – with the | ||
| user's search terms. | ||
|
|
||
| ## Conclusion | ||
|
|
||
| By allowing homeservers to pass on search filters, we enable remote homeservers' | ||
| room directories to be efficiently searched, because, realistically speaking, | ||
| only the remote homeserver is in a position to be able to perform search | ||
| efficiently, by taking advantage of indexing and other such optimisations. | ||
|
|
||