chore: Python v2 migration #135

Merged 12 commits on Aug 18, 2024
Changes from 11 commits
2 changes: 1 addition & 1 deletion .github/workflows/ci.yaml
@@ -36,7 +36,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- name: Install dependencies
run: pip install .[dev]
run: pip install pylint
- name: Lint
run: make lint

8 changes: 8 additions & 0 deletions .github/workflows/speakeasy_sdk_generation.yml
@@ -24,3 +24,11 @@ jobs:
github_access_token: ${{ secrets.GITHUB_TOKEN }}
pypi_token: ${{ secrets.PYPI_TOKEN }}
speakeasy_api_key: ${{ secrets.SPEAKEASY_API_KEY }}
# We don't need this unless we're modifying the generated code
patch-custom-code:
runs-on: ubuntu-latest
needs: [generate]
steps:
- name: Patch in custom code after regenerating
run: make patch-custom-code

2 changes: 1 addition & 1 deletion .gitignore
@@ -1,10 +1,10 @@
pyrightconfig.json
venv/
src/*.egg-info/
__pycache__/
.pytest_cache/
.python-version
.DS_Store

# human-added ignore files
.ipynb_checkpoints/
.idea/
71 changes: 45 additions & 26 deletions .speakeasy/gen.lock
@@ -1,57 +1,76 @@
lockVersion: 2.0.0
id: 8b5fa338-9106-4734-abf0-e30d67044a90
management:
docChecksum: 17bd23e4247d7b65a92813afd1252693
docChecksum: a6ff17ff485bb4b5884d75af244e18a1
docVersion: 1.0.44
speakeasyVersion: 1.361.1
generationVersion: 2.393.4
releaseVersion: 0.25.5
configChecksum: 6b4c1555edde75f4f1e422e49a07c208
speakeasyVersion: 1.346.0
generationVersion: 2.379.3
releaseVersion: 0.26.0
configChecksum: b455a107798458739736480ac5f51e86
repoURL: https://github.com/Unstructured-IO/unstructured-python-client.git
repoSubDirectory: .
installationURL: https://github.com/Unstructured-IO/unstructured-python-client.git
published: true
features:
python:
additionalDependencies: 0.1.0
constsAndDefaults: 0.1.4
core: 4.8.4
examples: 2.81.3
globalSecurity: 2.83.7
globalSecurityCallbacks: 0.1.0
globalSecurityFlattening: 0.1.0
globalServerURLs: 2.82.2
nameOverrides: 2.81.2
nullables: 0.1.0
openEnums: 0.1.0
responseFormat: 0.1.0
retries: 2.82.2
sdkHooks: 0.1.0
serverIDs: 2.81.1
unions: 2.82.9
additionalDependencies: 1.0.0
constsAndDefaults: 1.0.0
core: 5.2.4
defaultEnabledRetries: 0.2.0
envVarSecurityUsage: 0.2.0
examples: 3.0.0
globalSecurity: 3.0.0
globalSecurityCallbacks: 1.0.0
globalSecurityFlattening: 1.0.0
globalServerURLs: 3.0.0
multipartFileContentType: 1.0.0
nameOverrides: 3.0.0
nullables: 1.0.0
openEnums: 1.0.0
responseFormat: 1.0.0
retries: 3.0.0
sdkHooks: 1.0.0
serverIDs: 3.0.0
unions: 3.0.1
uploadStreams: 1.0.0
generatedFiles:
- src/unstructured_client/sdkconfiguration.py
- src/unstructured_client/general.py
- src/unstructured_client/sdk.py
- .vscode/settings.json
- py.typed
- pylintrc
- pyproject.toml
- scripts/publish.sh
- setup.py
- src/unstructured_client/__init__.py
- src/unstructured_client/basesdk.py
- src/unstructured_client/httpclient.py
- src/unstructured_client/py.typed
- src/unstructured_client/types/__init__.py
- src/unstructured_client/types/basemodel.py
- src/unstructured_client/utils/__init__.py
- src/unstructured_client/utils/annotations.py
- src/unstructured_client/utils/enums.py
- src/unstructured_client/utils/eventstreaming.py
- src/unstructured_client/utils/forms.py
- src/unstructured_client/utils/headers.py
- src/unstructured_client/utils/metadata.py
- src/unstructured_client/utils/queryparams.py
- src/unstructured_client/utils/requestbodies.py
- src/unstructured_client/utils/retries.py
- src/unstructured_client/utils/utils.py
- src/unstructured_client/utils/security.py
- src/unstructured_client/utils/serializers.py
- src/unstructured_client/utils/url.py
- src/unstructured_client/utils/values.py
- src/unstructured_client/models/errors/sdkerror.py
- src/unstructured_client/models/operations/partition.py
- src/unstructured_client/models/operations/__init__.py
- src/unstructured_client/models/errors/httpvalidationerror.py
- src/unstructured_client/models/errors/servererror.py
- src/unstructured_client/models/errors/__init__.py
- src/unstructured_client/models/shared/validationerror.py
- src/unstructured_client/models/shared/partition_parameters.py
- src/unstructured_client/models/shared/security.py
- src/unstructured_client/models/__init__.py
- src/unstructured_client/models/errors/__init__.py
- src/unstructured_client/models/operations/__init__.py
- src/unstructured_client/models/shared/__init__.py
- docs/models/operations/partitionrequest.md
- docs/models/operations/partitionresponse.md
4 changes: 4 additions & 0 deletions Makefile
@@ -67,6 +67,10 @@ client-generate-local:
speakeasy overlay apply -s ./openapi.json -o ./overlay_client.yaml > ./openapi_client.json
speakeasy generate sdk -s ./openapi_client.json -o ./ -l python

.PHONY: patch-custom-code
patch-custom-code:
git apply _custom_code.patch

.PHONY: publish
publish:
./scripts/publish.sh
187 changes: 151 additions & 36 deletions README.md
@@ -30,9 +30,15 @@ Please refer to the [Unstructured docs](https://docs.unstructured.io/api-referen
<!-- Start SDK Installation [installation] -->
## SDK Installation

PIP
```bash
pip install unstructured-client
```

Poetry
```bash
poetry add unstructured-client
```
<!-- End SDK Installation [installation] -->

## SDK Example Usage
@@ -131,28 +137,30 @@ Some of the endpoints in this SDK support retries. If you use the SDK without an

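The retry examples in this section pass `BackoffStrategy(1, 50, 1.1, 100)`. As a rough illustration of what such a strategy implies, here is a minimal sketch that assumes the four arguments are the initial interval, maximum interval, exponent, and maximum elapsed time, all in milliseconds. `BackoffSketch` and `backoff_delays` are hypothetical names used only for this illustration; they are not part of the SDK, and the SDK's own retry loop may differ (for example by adding random jitter).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackoffSketch:
    """Hypothetical stand-in for a backoff strategy; all times in milliseconds."""

    initial_interval: float
    max_interval: float
    exponent: float
    max_elapsed_time: float


def backoff_delays(strategy: BackoffSketch, max_attempts: int) -> List[float]:
    """Return the wait before each retry, growing exponentially up to a cap."""
    delays: List[float] = []
    elapsed = 0.0
    for attempt in range(max_attempts):
        # Give up once the time already spent waiting exceeds the budget.
        if elapsed > strategy.max_elapsed_time:
            break
        delay = min(
            strategy.initial_interval * strategy.exponent ** attempt,
            strategy.max_interval,
        )
        delays.append(delay)
        elapsed += delay
    return delays


# Same numbers as the README examples, under the assumed argument meaning.
schedule = backoff_delays(BackoffSketch(1, 50, 1.1, 100), max_attempts=5)
```

Under these assumptions the waits start at 1 ms and grow by 10% per retry, never exceeding 50 ms per wait.
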
To change the default retry strategy for a single API call, simply provide a `RetryConfig` object to the call:
```python
import unstructured_client
from unstructured_client.models import operations, shared
from unstructured_client import UnstructuredClient
from unstructured_client.models import shared
from unstructured_client.utils import BackoffStrategy, RetryConfig

s = unstructured_client.UnstructuredClient()
s = UnstructuredClient(
api_key_auth="YOUR_API_KEY",
)


res = s.general.partition(request=operations.PartitionRequest(
partition_parameters=shared.PartitionParameters(
files=shared.Files(
content='0x2cC94b2FEF'.encode(),
file_name='your_file_here',
),
chunking_strategy=shared.ChunkingStrategy.BY_TITLE,
split_pdf_page_range=[
res = s.general.partition(request={
"partition_parameters": {
"files": {
"content": open("<file_path>", "rb"),
"file_name": "your_file_here",
},
"chunking_strategy": shared.ChunkingStrategy.BY_TITLE,
"split_pdf_page_range": [
1,
10,
],
strategy=shared.Strategy.HI_RES,
),
),
RetryConfig('backoff', BackoffStrategy(1, 50, 1.1, 100), False))
"strategy": shared.Strategy.HI_RES,
},
},
RetryConfig("backoff", BackoffStrategy(1, 50, 1.1, 100), False))

if res.elements is not None:
# handle response
@@ -162,29 +170,30 @@ if res.elements is not None:

If you'd like to override the default retry strategy for all operations that support retries, you can use the `retry_config` optional parameter when initializing the SDK:
```python
import unstructured_client
from unstructured_client.models import operations, shared
from unstructured_client import UnstructuredClient
from unstructured_client.models import shared
from unstructured_client.utils import BackoffStrategy, RetryConfig

s = unstructured_client.UnstructuredClient(
retry_config=RetryConfig('backoff', BackoffStrategy(1, 50, 1.1, 100), False),
s = UnstructuredClient(
retry_config=RetryConfig("backoff", BackoffStrategy(1, 50, 1.1, 100), False),
api_key_auth="YOUR_API_KEY",
)


res = s.general.partition(request=operations.PartitionRequest(
partition_parameters=shared.PartitionParameters(
files=shared.Files(
content='0x2cC94b2FEF'.encode(),
file_name='your_file_here',
),
chunking_strategy=shared.ChunkingStrategy.BY_TITLE,
split_pdf_page_range=[
res = s.general.partition(request={
"partition_parameters": {
"files": {
"content": open("<file_path>", "rb"),
"file_name": "your_file_here",
},
"chunking_strategy": shared.ChunkingStrategy.BY_TITLE,
"split_pdf_page_range": [
1,
10,
],
strategy=shared.Strategy.HI_RES,
),
))
"strategy": shared.Strategy.HI_RES,
},
})

if res.elements is not None:
# handle response
@@ -196,16 +205,81 @@ if res.elements is not None:
<!-- Start Custom HTTP Client [http-client] -->
## Custom HTTP Client

The Python SDK makes API calls using the [requests](https://pypi.org/project/requests/) HTTP library. In order to provide a convenient way to configure timeouts, cookies, proxies, custom headers, and other low-level configuration, you can initialize the SDK client with a custom `requests.Session` object.
The Python SDK makes API calls using the [httpx](https://www.python-httpx.org/) HTTP library. To configure timeouts, cookies, proxies, custom headers, and other low-level settings, you can initialize the SDK client with your own HTTP client instance.
Depending on whether you are using the sync or async version of the SDK, you can pass an instance of `HttpClient` or `AsyncHttpClient` respectively. Both are Protocols that ensure the client has the methods needed to make API calls.
This lets you wrap the client with your own logic, such as adding custom headers, logging, or error handling, or you can pass an instance of `httpx.Client` or `httpx.AsyncClient` directly.

For example, you could specify a header for every request that this SDK makes as follows:
```python
import unstructured_client
import requests
from unstructured_client import UnstructuredClient
import httpx

http_client = httpx.Client(headers={"x-custom-header": "someValue"})
s = UnstructuredClient(client=http_client)
```

http_client = requests.Session()
http_client.headers.update({'x-custom-header': 'someValue'})
s = unstructured_client.UnstructuredClient(client=http_client)
Or you could wrap the client with your own custom logic:
```python
from typing import Any, Optional, Union

from unstructured_client import UnstructuredClient
from unstructured_client.httpclient import AsyncHttpClient
import httpx

class CustomClient(AsyncHttpClient):
    client: AsyncHttpClient

    def __init__(self, client: AsyncHttpClient):
        self.client = client

    async def send(
        self,
        request: httpx.Request,
        *,
        stream: bool = False,
        auth: Union[
            httpx._types.AuthTypes, httpx._client.UseClientDefault, None
        ] = httpx.USE_CLIENT_DEFAULT,
        follow_redirects: Union[
            bool, httpx._client.UseClientDefault
        ] = httpx.USE_CLIENT_DEFAULT,
    ) -> httpx.Response:
        # Add a header to every outgoing request, then delegate to the wrapped client.
        request.headers["Client-Level-Header"] = "added by client"

        return await self.client.send(
            request, stream=stream, auth=auth, follow_redirects=follow_redirects
        )

    def build_request(
        self,
        method: str,
        url: httpx._types.URLTypes,
        *,
        content: Optional[httpx._types.RequestContent] = None,
        data: Optional[httpx._types.RequestData] = None,
        files: Optional[httpx._types.RequestFiles] = None,
        json: Optional[Any] = None,
        params: Optional[httpx._types.QueryParamTypes] = None,
        headers: Optional[httpx._types.HeaderTypes] = None,
        cookies: Optional[httpx._types.CookieTypes] = None,
        timeout: Union[
            httpx._types.TimeoutTypes, httpx._client.UseClientDefault
        ] = httpx.USE_CLIENT_DEFAULT,
        extensions: Optional[httpx._types.RequestExtensions] = None,
    ) -> httpx.Request:
        # Pass request construction straight through to the wrapped client.
        return self.client.build_request(
            method,
            url,
            content=content,
            data=data,
            files=files,
            json=json,
            params=params,
            headers=headers,
            cookies=cookies,
            timeout=timeout,
            extensions=extensions,
        )

s = UnstructuredClient(async_client=CustomClient(httpx.AsyncClient()))
```
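
Because the client types are described as Protocols, a wrapper only needs to provide the right methods; it does not have to inherit from anything. The sketch below shows that idea using only the standard library. `SupportsSend`, `LoggingClient`, and `FakeClient` are hypothetical names for illustration, and their simplified signatures are an assumption, not the SDK's actual `HttpClient` definition.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsSend(Protocol):
    """Hypothetical protocol: anything offering send() and build_request()."""

    def send(self, request): ...

    def build_request(self, method, url): ...


class FakeClient:
    """Stand-in for a real HTTP client so the sketch runs offline."""

    def build_request(self, method, url):
        return (method, url)

    def send(self, request):
        return f"sent {request[0]} {request[1]}"


class LoggingClient:
    """Wraps another client and records every request it sends."""

    def __init__(self, inner):
        self.inner = inner
        self.log = []

    def build_request(self, method, url):
        return self.inner.build_request(method, url)

    def send(self, request):
        self.log.append(request)
        return self.inner.send(request)


client = LoggingClient(FakeClient())
response = client.send(client.build_request("GET", "https://example.invalid/"))
```

`LoggingClient` satisfies `SupportsSend` structurally, so a type checker (or an `isinstance` check on a runtime-checkable protocol) accepts it without an explicit base class.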
<!-- End Custom HTTP Client [http-client] -->

@@ -216,6 +290,47 @@ s = unstructured_client.UnstructuredClient(client=http_client)
<!-- No Server Selection -->
<!-- No Authentication -->

<!-- Start File uploads [file-upload] -->
## File uploads

Certain SDK methods accept file objects as part of a request body or multi-part request. It is possible, and typically recommended, to upload files as a stream rather than reading the entire contents into memory. This avoids excessive memory consumption and possible out-of-memory crashes when working with very large files. The following example demonstrates how to attach a file stream to a request.

> [!TIP]
>
> For endpoints that handle file uploads, byte arrays can also be used. However, using streams is recommended for large files.
>

```python
from unstructured_client import UnstructuredClient
from unstructured_client.models import shared

s = UnstructuredClient(
    api_key_auth="YOUR_API_KEY",
)


res = s.general.partition(request={
    "partition_parameters": {
        "files": {
            "content": open("<file_path>", "rb"),
            "file_name": "your_file_here",
        },
        "chunking_strategy": shared.ChunkingStrategy.BY_TITLE,
        "split_pdf_page_range": [
            1,
            10,
        ],
        "strategy": shared.Strategy.HI_RES,
    },
})

if res.elements is not None:
    # handle response
    pass

```
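
The example above opens the file inline, which leaves the handle open after the call. One possible variation, sketched here, opens the stream in a `with` block so it is closed once the request has been sent. `build_partition_request` is a hypothetical helper, not an SDK function, and `io.BytesIO` stands in for a real file so the sketch runs without anything on disk.

```python
import io


def build_partition_request(file_obj, file_name):
    """Build a request dict in the shape shown above from an open binary stream."""
    return {
        "partition_parameters": {
            "files": {
                "content": file_obj,
                "file_name": file_name,
            },
        },
    }


# With a real file this would be: with open(path, "rb") as stream:
with io.BytesIO(b"%PDF-1.4 placeholder") as stream:
    request = build_partition_request(stream, "example.pdf")
    # In real use, send the request inside the block, while the stream is open:
    #     res = s.general.partition(request=request)
    first_bytes = request["partition_parameters"]["files"]["content"].read(4)
```

The request must be sent inside the `with` block; once the block exits, the stream is closed and can no longer be read.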
<!-- End File uploads [file-upload] -->

<!-- Placeholder for Future Speakeasy SDK Sections -->

### Maturity