Skip to content

Logical replication support #42

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jun 1, 2018
Merged

Conversation

zilder
Copy link
Collaborator

@zilder zilder commented Mar 19, 2018

Added publication/subscription feature. Typical use case:

node1 = get_new_node().init(allow_logical=True).start()
node2 = get_new_node().init().start()
node1.safe_psql('create table test (a int)')
node2.safe_psql('create table test (a int)')
pub = node1.publish('mypub')
sub = node2.subscribe(pub, 'mysub')
node1.safe_psql('insert into test values (1), (2), (3)')
sub.catchup()
node2.execute('select * from test')
> [(1,), (2,), (3,)]

@codecov-io
Copy link

codecov-io commented Mar 19, 2018

Codecov Report

Merging #42 into master will decrease coverage by 0.01%.
The diff coverage is 97.84%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #42      +/-   ##
==========================================
- Coverage   97.01%   96.99%   -0.02%     
==========================================
  Files          16       17       +1     
  Lines        1405     1529     +124     
==========================================
+ Hits         1363     1483     +120     
- Misses         42       46       +4
Impacted Files Coverage Δ
testgres/connection.py 95.08% <100%> (+0.08%) ⬆️
tests/test_simple.py 99.42% <100%> (-0.36%) ⬇️
testgres/consts.py 100% <100%> (ø) ⬆️
testgres/utils.py 95.37% <100%> (+0.13%) ⬆️
testgres/node.py 96.93% <94.73%> (+0.06%) ⬆️
testgres/pubsub.py 95.91% <95.91%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 221df4f...50e02ff. Read the comment docs.

@funbringer funbringer requested review from ildus and funbringer March 19, 2018 15:48
Copy link
Collaborator

@funbringer funbringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good, but there's a few things that definitely could be improved.

testgres/node.py Outdated
@@ -836,7 +847,8 @@ def poll_query_until(self,
expected=True,
commit=True,
raise_programming_error=True,
raise_internal_error=True):
raise_internal_error=True,
zero_rows_is_ok=False):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't zero_rows_is_ok=True effectively equal to expected=None?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Subscription.catchup() we are expecting True and at the same time it may happen that there is no rows at the moment (until statistics collected).

testgres/node.py Outdated
@@ -987,6 +1004,41 @@ def catchup(self, dbname=None, username=None):
except Exception as e:
raise_from(CatchUpException("Failed to catch up", poll_lsn), e)

def publish(self,
pubname,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change pubname to name?

testgres/node.py Outdated
dbname: database name where objects or interest are located
username: replication username
"""
return Publication(pubname, self, tables, dbname, username)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please use keyword args instead of positional args?


# connection info
conninfo = (
u"dbname={} user={} host={} port={}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could extract common conninfo-related code from _create_recovery_conf() into a function and add it to utils.py. What do you think?

query=query,
dbname=self.pub.dbname,
username=username or self.pub.username,
max_attempts=60,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO it's better to replace max_attempts=60 and zero_rows_is_ok=True with kwargs.

dbname=dbname,
username=username)

def close(self, dbname=None, username=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be called drop instead?

@zilder zilder mentioned this pull request Mar 20, 2018
@@ -462,6 +532,10 @@ def test_poll_query_until(self):
node.poll_query_until(
query='create table def()', expected=None) # returns nothing

# check 0 rows equivalent to expected=None
node.poll_query_until(
query='select * from pg_class where true = false', expected=None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pg_class should be schema-qualified.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

# create publication in database
t = "table " + ", ".join(tables) if tables else "all tables"
query = "create publication {} for {}"
node.safe_psql(query.format(name, t), dbname=dbname, username=username)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All occurrences of safe_psql should be replaced with execute for better performance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. However I had to refactor PostgresNode.execute() method a little bit. Some statements (such as CREATE/DROP SUBSCRIPTION) require non-transactional environment but execute would start transaction automatically. So I used autocommit option which allows to run that kind of queries.


class Subscription(object):
def __init__(self,
name,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to place name right after the mandatory args. Chances are we'll come up with some name generation facility, but we won't be able to make name optional without moving it, which is something I'd like to avoid.

constructing subscription objects.

Args:
name: subscription name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also noticed that some doc strings end with commas while others don't.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

query=query,
dbname=self.pub.dbname,
username=username or self.pub.username,
max_attempts=60)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I don't like the hard-coded number.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be a parameter here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, added LOGICAL_REPL_MAX_CATCHUP_ATTEMPTS parameter

After that :meth:`~.PostgresNode.publish()` and :meth:`~.PostgresNode.subscribe()`
methods may be used to setup replication. Example:

>>> from .api import get_new_node
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be changed to from testgres import get_new_node. Documentation should contain working code, even it would require remove doctest.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

testgres/api.py Outdated
@@ -28,7 +28,7 @@
... replica.catchup() # wait until changes are visible
... print(replica.execute('postgres', 'select count(*) from test'))
PostgresNode(name='...', port=..., base_dir='...')
[(3,)]
[(3L,)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this necessary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was when I tested with doctest. But now when I ran doctest it showed [(3,)] so I am confused :) Will change it back then as it shouldn't be part of this pull request anyway.

@@ -413,6 +417,7 @@ def default_conf(self,
fsync=False,
unix_sockets=True,
allow_streaming=True,
allow_logical=False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we enable this by default?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is not supported on postgres versions below 10 and there is specific message when someone's trying to enable this feature on those versions. Besides it produces extra WAL data and hence could work slightly slower.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, i see.

"""
Drop publication
"""
self.node.safe_psql(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we replace safe_psql with execute?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I overlooked it

# check 0 columns
with self.assertRaises(QueryException):
node.poll_query_until(query='select from pg_class limit 1')
node.poll_query_until(
query='select from pg_catalog.pg_class limit 1')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@funbringer
Copy link
Collaborator

The patch looks good!

@funbringer funbringer merged commit 0583873 into postgrespro:master Jun 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants