Bug Description
The full sync on the Confluence connector only pulls 50 documents if a CQL query is set.
To Reproduce
Set a CQL query as an "advanced rule" in the connector's "sync rules", for example:
[
  {
    "query": "created >= now('-5y')"
  }
]
Expected behavior
Pull the Confluence content of the last 5 years (obvious overkill, but that is a different story).
Environment
8.17.3
Solution
I have been playing around with the "paginated_api_call" function in "confluence.py" and noticed that it relies on a "next" link to find the following page.
However, according to the API documentation, the response of the /api/search call does not seem to contain such a link: https://docs.atlassian.com/atlassian-confluence/REST/6.6.0/#content-search
It seems that pagination for a search has to be done by moving the start window instead, which would explain why the sync stops after the first 50 results.
Quick proof of concept that still follows the next link, in case another function needs it:
async def paginated_api_call(self, url_name, **url_kwargs):
    """Make a paginated API call for Confluence objects using the passed url_name.

    Args:
        url_name (str): URL Name to identify the API endpoint to hit
    Yields:
        response: JSON response.
    """
    base_url = os.path.join(self.host_url, URLS[url_name].format(**url_kwargs))
    # Assumes base_url already contains a query string, so "&start=" is valid.
    url = f"{base_url}&start=0"
    while True:
        try:
            self._logger.debug(f"Starting pagination for API endpoint {url}")
            response = await self.api_call(url=url)
            json_response = await response.json()
            yield json_response

            links = json_response.get("_links") or {}
            start = json_response.get("start", 0)
            size = json_response.get("size", 0)
            if links.get("next"):
                # Follow the next link when the endpoint provides one.
                # The URL is built once before the loop so it is not overwritten here.
                url = os.path.join(self.host_url, links.get("next")[1:])
            elif size and start + size < json_response.get("totalSize", 0):
                # No next link (search endpoint): move the start window.
                url = f"{base_url}&start={start + size}"
            else:
                # No more data to fetch.
                return
        except Exception as exception:
            self._logger.warning(
                f"Skipping data for type {url_name} from {base_url}. Exception: {exception}."
            )
            break
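To check the start-window arithmetic in isolation, here is a minimal sketch that runs without a Confluence instance. `fake_search` and `paginate_by_offset` are hypothetical helpers, not part of the connector; the response fields (`start`, `size`, `totalSize`, `_links`) simply mirror the ones used in the proof of concept above, and the page size of 50 is an assumption taken from the observed behavior.

```python
import asyncio


def fake_search(start, limit=50, total=120):
    """Hypothetical stand-in for one page of a Confluence search response."""
    results = list(range(start, min(start + limit, total)))
    return {
        "results": results,
        "start": start,
        "limit": limit,
        "size": len(results),
        "totalSize": total,
        "_links": {},  # no "next" link, as described in this report
    }


async def paginate_by_offset(fetch_page):
    """Yield pages by advancing `start` until `totalSize` is reached."""
    start = 0
    while True:
        page = fetch_page(start)
        yield page
        size = page.get("size", 0)
        next_start = page.get("start", start) + size
        # Also stop on an empty page, so a wrong totalSize cannot loop forever.
        if size == 0 or next_start >= page.get("totalSize", 0):
            return
        start = next_start


async def collect():
    """Gather all results across pages into one list."""
    items = []
    async for page in paginate_by_offset(fake_search):
        items.extend(page["results"])
    return items


if __name__ == "__main__":
    print(len(asyncio.run(collect())))
```

With 120 fake results and a page size of 50, the loop requests start=0, start=50, and start=100, then stops because start + size reaches totalSize.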
While debugging this I also found another issue in the function "search_by_query": it never checks whether "entity_details" exists, so if it is None, the following .get() call fails.
I fixed this with an additional condition:
async def search_by_query(self, query):
    async for entity in self.confluence_client.search_by_query(query=query):
        # entity can be space or content
        entity_details = entity.get(SPACE) or entity.get(CONTENT)
        if not entity_details:
            continue
        if (
            entity_details.get("type", "") == "attachment"
            and entity_details.get("container", {}).get("title") is None
        ):
            continue
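The effect of the extra condition can be illustrated on plain dictionaries. This is only a sketch: `usable_entities` and the sample data are made up for illustration, and the values of `SPACE` and `CONTENT` are assumed to be the strings "space" and "content".

```python
SPACE = "space"      # assumed value of the connector constant
CONTENT = "content"  # assumed value of the connector constant


def usable_entities(entities):
    """Return the details of entities that survive both filters."""
    kept = []
    for entity in entities:
        entity_details = entity.get(SPACE) or entity.get(CONTENT)
        if not entity_details:
            # Without this guard, the .get() call below raises AttributeError
            # when the result is neither a space nor content.
            continue
        if (
            entity_details.get("type", "") == "attachment"
            and entity_details.get("container", {}).get("title") is None
        ):
            continue
        kept.append(entity_details)
    return kept


sample = [
    {"content": {"type": "page", "title": "Home"}},
    {"user": {"name": "jdoe"}},  # neither space nor content
    {"content": {"type": "attachment", "container": {}}},
    {"space": {"key": "ENG"}},
]

if __name__ == "__main__":
    print(usable_entities(sample))
```

Only the page and the space are kept; the result without space or content details is skipped instead of raising.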