Skip to content

Introduce API Redesign #763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 29 commits into from
Closed

Introduce API Redesign #763

wants to merge 29 commits into from

Conversation

bigmontz
Copy link
Contributor

@bigmontz bigmontz commented Jul 20, 2022

  • Query API
  • Execute API
  • Automatic Access Mode
  • Async
  • Docs

@bigmontz bigmontz force-pushed the 5.0-api-redisign branch 2 times, most recently from 676ddf3 to 14bf176 Compare July 20, 2022 12:23
@bigmontz
Copy link
Contributor Author

bigmontz commented Jul 21, 2022

Usage example:

from importlib.metadata import metadata
import neo4j
import asyncio

from neo4j.api import CLUSTER_READERS_ACCESS, CLUSTER_WRITERS_ACCESS


async def main():
    driver = neo4j.AsyncGraphDatabase.driver(
        'bolt://localhost', auth=neo4j.basic_auth("neo4j", "pass"))

    async with driver.session() as session:
        async def txfunc(tx):
            query_result = await tx.query('RETURN $n as n', n=1)
            print(query_result.records)
            print(query_result.summary.database)

        await session.execute(txfunc)

    async with driver.session() as session:
        async def txfunc(tx):
            records, summary = await tx.query(
                'RETURN $n as n',
                n=2)
            print(records)
            print(summary.database)

        await session.execute(txfunc)

    async with driver.session() as session:
        records, summary = await session.query(
                'RETURN $n as n',
                n=3,
                skip_records=True,
                cluster_member_access=CLUSTER_WRITERS_ACCESS)
        print(records)
        print(summary.database)

    async with driver.session() as session:
        records, summary = await session.query(
                'RETURN $n as n',
                n=4)
        print(records)
        print(summary.database)

    async def txfunc(tx, n):
        records, summary = await tx.query(
                'RETURN $n as n',
                n=n)
        print(records)
        print(summary.database)

    # configuration are optional
    await driver.execute(
        lambda tx: txfunc(tx, n=5),
        database="neo4j",
        cluster_member_access=CLUSTER_WRITERS_ACCESS,
        timeout=10,
        metadata={"service": "python driver playground"}
    )

    records, summary = await driver.query(
      'RETURN $n as n',
      n=6,
      cluster_member_access=CLUSTER_READERS_ACCESS)

    print(records)
    print(summary.database)

    await driver.close()


asyncio.run(main())

@@ -348,6 +348,9 @@ def acquire(
self.address, deadline, liveness_check_timeout
)

def is_direct(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this method.

  • I think the base Pool class should have it too and should either raise NotImplementedError or become a abc.ABC (abstract base class).
  • There are already places in the driver that check whether the pool is direct or not (currently using isinstance(pool, Neo4jPool) or similar) . They could also be adopted to use this method.
  • It could also be an attribute just because the lookup might be ever so slightly faster than a method call. But that might be a bit much. Yeah whatever, that one was just a quick thought.

Comment on lines 289 to 291
async def execute(self, transaction_function, *args,
cluster_member_access=CLUSTER_AUTO_ACCESS,
**kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I don't like this design. Here is the problem: if you write a wrapper function that does not control the signature of transaction_function, you are screwed, because you can't make it right in the general case because transaction_function could have a keyword argument cluster_member_access. What now?

This could be addressed like so

    async def execute(
        self, transaction_function,
        transaction_function_args=None, transaction_function_kwargs=None,
        cluster_member_access=CLUSTER_AUTO_ACCESS
    ):

which is admittedly less nice to use. There is the third option

    async def execute(
        self, transaction_function, *args,
        transaction_function_args=None, transaction_function_kwargs=None,
        cluster_member_access=CLUSTER_AUTO_ACCESS, **kwargs
    ):

But I'm not sure if that's not just more confusing than it's helping. I really don't know. I think this is the major downside of cramming all our API into easy to access functions... They become bloated, more complicated, and have to do many things at once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cluster_member_access was already being not passed to the tx function in the previous version and it is kind of useless in that context, since none of the tx methods accept this argument.

I've tried to keep usage quite simple while keep the usage quite similar to the previous tx function. The difference is now the param which will be removed is documented.

An alternative is removing the extra arguments (*args and *kwargs), this way the tx function will only receive the transaction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. Isn't transaction_function a function defined by user-level code (outside of the driver)? How can you know it doesn't have a cluster_member_access parameter?

Copy link
Contributor Author

@bigmontz bigmontz Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't know. The Session.run also have the same problem, the user couldn't have a parameter called parameters (if they set the param like session.run('return n, parameters', n=1, parameters='a')).

We could also always add this name params to the tx_func kwargs. Which is kind of the approach used by to extract metadata and timeout in read_transaction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is true about the session.run, if I was to write a wrapper, I'd just never call session.run(query, **parameters) but instead session.run(query, parameters=parameters). That way, I can be sure that my wrapper can cope with any query parameters thrown at it.

We could also always add this name params to the tx_func kwargs. Which is kind of the approach used by to extract metadata and timeout in read_transaction.

I can't follow, sorry.

Copy link
Contributor Author

@bigmontz bigmontz Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_transaction and write_transaction get timeout and metadata from the **kwargs without remove it from it. We could do it with the cluster_member_access (and database, in the driver.execute case) by put it back to the **kwargs. However, we will have the issue with the documentation of the params.

For solving this, I'm suggesting keep the signature like:

async def execute(
        self, transaction_function, *args,
        cluster_member_access=None, **kwargs
    ):

Then, in the execute implementation do something like:

# user doesn't set anything
if cluster_member_access is None:
    cluster_member_access = CLUSTER_AUTO_ACCESS
# user set something
else:
    kwargs['cluster_member_access'] = cluster_member_access

The problem is: we don't know if the user set named param cluster_member_access or not, we only can guess if they are using the default value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still another issue: the user could use a different meaning and dataset for the cluster_member_access variable.

@robsdedude
Copy link
Member

ADR (internal link) was abandoned.

driver.execute_query (#833) is it's spiritual successor.

@robsdedude robsdedude closed this Nov 1, 2022
@robsdedude robsdedude deleted the 5.0-api-redisign branch November 1, 2022 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants