Skip to content

NH-2176 - Add test case #410

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from
Closed

NH-2176 - Add test case #410

wants to merge 7 commits into from

Conversation

hazzik
Copy link
Member

@hazzik hazzik commented Mar 4, 2015

No description provided.

@hazzik
Copy link
Member Author

hazzik commented Apr 14, 2015

Need to cleanup whitespaces

@oskarb oskarb modified the milestones: 5.0.0, 4.1.0 Nov 26, 2016
@hazzik
Copy link
Member Author

hazzik commented Apr 10, 2017

Related to #532

@hazzik
Copy link
Member Author

hazzik commented Apr 11, 2017

@fredericDelaporte I need someone with fresh eyes to look into this

@fredericDelaporte
Copy link
Member

fredericDelaporte commented Apr 14, 2017

I may be completely wrong, but I fear NHibernate tries to support an invalid programming model about transaction scope currently.

  1. The session detects ambiant transaction and enlist with them.
  2. A transaction scope in which a session is enlisted needs the session for completing the transaction (distributed or not).
  3. If the session is closed (ISession.Close()) before transaction scope disposal, the transaction scope will fail.
  4. What I see as a cursed hack causes ISession.Dispose to not actually dispose the session if it is enlisted with an ongoing transaction scope. It just set a flag for transaction completion to cause the session disposal. This allows "disposing" the session although it participates in a an ongoing transaction scope.

It looks to me that point 4 is an attempt at supporting a model that violates IDisposable pattern. We should not try supporting the disposal of a ressource which is still in use.

If we look at how this is handled in EF 6 (doc updated on October 2016), it dodges the trouble in a quite more valid way:

  1. The context to be disposed inside a transaction scope has to be provided with a SQL connection it will not handle.
  2. The context has to be flushed to db before being disposed.

That way, the context is indeed no more needed by the transaction scope for completing the transaction.

Point 1. is important for allowing other contextes nested inside the transaction scope to access the changes done on the data by a previous context while still in the ongoing transaction scope. (Otherwise they have this issue.)

I think we need to do breaking changes for having a more reasonable implementation:

  1. Cease delaying the session disposal on dispose, even when nested in a transaction scope.
  2. Cease throwing exception in connection manager when closing a connection while in a transaction scope: just pass along the call to the connection provider (or de-reference non-owned connection). Do not dispose the transaction context anymore: move it to transaction completion event.
  3. Mandates session Flush before disposal for changes to be persisted. (It can still be implicit if the transaction is committed before the session disposal; alternative: change FlushMode.Auto for auto flushing on dispose if inside a transaction scope, or add a new mode for that.)
  4. Change the enlistment: if the session has been disposed, reply Done on prepare, no-op on session.

Well ideally, we should still perform post-transaction operations required for things like the second level cache, although the session is disposed. I do not know if those operations can be moved outside session responsibility easily or not.

An additional point could be added for an alternate to point 1. of EF case:

  1. Make connection provider transaction scope aware: it will need to enlist too (at least receive transaction completed event), and will give the same connection to sessions executing in the same scope. It will no more close them if in a transaction scope, but will wait for the transaction scope completion.

@fredericDelaporte
Copy link
Member

fredericDelaporte commented Apr 23, 2017

Checking the state of Hibernate, for the JPA case, they do something similar to NHibernate delayed disposal of the session. But for the JDBC and JTA (JTA is what looks closest to TransactionScope) case, they actually close the session, but not the JdbcCoordinator if a shared transaction is ongoing. The connection is handled by the JdbcCoordinator.

I think we should replicate the JDBC handling.

@fredericDelaporte
Copy link
Member

Still more thoughts on the subject: currently on NHibernate side, it looks like people try to use the old entity mode switching capability (GetSession(EntityMode mode)) for sharing the connection among sessions. And when doing NH-3722 as a follow up to HHH-6330, instead of completely removing the functionality as Hibernate has done, we have just removed the entity mode argument and renamed it GetChildSession().

We are quite diverging from Hibernate by doing so. They do not need this because they have decoupled connexion handling from the session.

The more I look to all that, the more I think we should re-align our session/transaction/connection handling with Hibernate.

@hazzik
Copy link
Member Author

hazzik commented Apr 26, 2017

I agree

@hazzik
Copy link
Member Author

hazzik commented May 17, 2017

It seems that session.Connection.EnlistTransaction(System.Transactions.Transaction.Current); at Line 33 of AdoNetWithDistributedTransactionFactory fixes this issue.

@fredericDelaporte
Copy link
Member

It seems that session.Connection.EnlistTransaction(System.Transactions.Transaction.Current); at Line 33 of AdoNetWithDistributedTransactionFactory fixes this issue.

It will very likely fix NH-2176, but it will not fix:

  • NH-2237 - unlocking of entities occurring while next transaction already started: due to TransactionCompleted executing in DTC cases after transaction scope completion in separated and concurrent thread.
  • NH-3968 - connection already prepared during session prepare phase: due to trying to re-use the session connection for flushing from session prepare phase while this connection is itself enlisted for its own 2PC in the same transaction.

Unlikely to fix:

  • NH-2107 - connection not released on rollback. This bug lacks information for me to actually understand how it gets triggered. But I highly suspect this is bounded to the current pattern of delaying session disposal and connection release to transaction completion events.
  • NH-2238 - prepare phase failure. Again lack of informations, but looks bounded to trying to flush from session prepare phase with a connection already enlisted itself for its own 2PC.
  • NH-3023 - deadlock causing connection pool corruption: very likely same root cause than NH-2107.
  • NH-3227 - rollback causing connection pool corruption: almost duplicate of NH-3023.

That is why I consider this change is not enough to fix current approach.

@hazzik
Copy link
Member Author

hazzik commented May 17, 2017

NH-2238 is exactly the same as NH-3968

@fredericDelaporte
Copy link
Member

(Well, the time for me to consolidate my previous comment, it seems you have met NH-3968. A comment notification on #532 indicates that in my mailbox, but I do not see it online.)

@hazzik
Copy link
Member Author

hazzik commented May 17, 2017

but I do not see it online.

image

@hazzik
Copy link
Member Author

hazzik commented May 17, 2017

but I do not see it online.

I deleted it along with the changes it was mentioning

@fredericDelaporte
Copy link
Member

NH-2238 is exactly the same as NH-3968

Having the same root cause, very likely. But the former writes about a closed connection exception while the later is about a timeout trying to use it.

but I do not see it online.

I'd guessed you have deleted it of course, but sometime I prefer tell facts (not seeing it) rather than guesses (being deleted) even when obvious. (And sometime that's Github auto-loading of new comments which does not always trigger when I would have expected.)

Copy link
Member

@fredericDelaporte fredericDelaporte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The custom transaction factory is in a dead end unless we do some code changes in NHibernate.

@@ -19,6 +19,11 @@ public class DtcFailuresFixture : TestCase
{
private static readonly ILog log = LogManager.GetLogger(typeof(DtcFailuresFixture));

protected override void Configure(Configuration configuration)
{
configuration.SetProperty(Cfg.Environment.TransactionStrategy, "NHibernate.Test.NHSpecificTest.NH2176.CustomAdoNetTransactionFactory, NHibernate.Test");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing the custom factory with DTC testes.

if (Dialect is FirebirdDialect)
Assert.Ignore("Firebird driver does not support distributed transactions");
/*if (Dialect is FirebirdDialect)
Assert.Ignore("Firebird driver does not support distributed transactions");*/
Copy link
Member

@fredericDelaporte fredericDelaporte May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my local test Firebird supports them (at least does not throw), so just letting them run to test with teamcity too.

@@ -113,8 +118,8 @@ public void Can_roll_back_transaction()
[Description("Another action inside the transaction do the rollBack outside nh-session-scope.")]
public void RollbackOutsideNh()
Copy link
Member

@fredericDelaporte fredericDelaporte May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one showcase the roadblock for the custom transaction factory: session close within an ongoing transaction scope is rejected by connection manager.

The connection manager close does close any pending explicit transaction, then call disconnect which checks there are no pending transaction, explicit or ambient. The ambient check on disconnect should be removed, ambient transaction supports connection disconnect.

This need changes in NHibernate project even for testing: the connection manager is not pluggable.

// This damn thing may yield an invalid operation exception (instead of null
// or of a completed transaction) if we are between scope.Complete() and
// scope.Dispose(). This happen when having completed the scope then disposing
// the session and only after that disposing the scope.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another bug fixed in custom transaction factory: accessing Transaction.Current may throw, need to catch.

@fredericDelaporte
Copy link
Member

fredericDelaporte commented May 17, 2017

Doomed attempt, I have seen locally that the second phase of 2PC can be executed after transaction scope disposal. (Illustrated by NH-2420 test.) I need to do some new edits on NH-4011!

I may still add one more attempt, if I find a way through lock to ensure second phase is done before any new session usage. But it may be a crutch reducing risks but not eliminating them.

using (var s = sessions.OpenSession(connection))
{
Nums nums = null;
Assert.DoesNotThrow(() => nums = s.Load<Nums>(29), "Failed loading entity from second level cache.");
Copy link
Member

@fredericDelaporte fredericDelaporte May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving an explicit message instead of "caching failure" => then "failing to tear down", and letting not much clues.


using (var s = sessions.OpenSession())
using (var tx = s.BeginTransaction())
finally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding additional tear down failure.

Copy link
Member

@fredericDelaporte fredericDelaporte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That attempt seems now to work completely with SQL Server, even DTC test that was ignored as "not fixed" being green.

But relying on thread synchronization does look good to me. I would rather introduce the new factories I envision in NH-4011.

Of course, need to see if this works with all db still. And then, need to cleanup previous tests (custom factory, commits, ...)

// here. When having TransactionCompleted event, this event and the second phase
// tend to occur before reaching here. But some other NH cases demonstrate that
// TransactionCompleted may also occur "too late".
((ISessionImplementor) s).TransactionContext?.WaitOne();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test use a session property (IsConnected) that does not sync with Transaction factory. So for the test, I have added an explicit sync call. I was feeling like really required to cause IsConnected to sync.

public IDisposable FlushingFromDtcTransaction
{
get
{
if (ownConnection)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not trying to handle use supplied connection case. We would have to clone it otherwise.

@@ -24,52 +25,36 @@ public ITransaction CreateTransaction(ISessionImplementor session)

public void EnlistInDistributedTransactionIfNeeded(ISessionImplementor session)
{
// Ensure the session does not run on a thread supposed to be blocked, waiting
// for transaction completion.
session.TransactionContext?.WaitOne();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transaction completion may concurrently nullify it, but ?. handles that (correctly evaluate it only once, guaranteeing it will not try to call WaitOne on null).


private readonly ISessionImplementor _sessionImplementor;
private readonly ManualResetEvent _waitEvent = new ManualResetEvent(true);
private readonly AsyncLocal<bool> _bypassWait = new AsyncLocal<bool>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not already needing async but well, better avoid adding new thread locals.

// Dispose releases blocked threads by the way.
// Must dispose in case !ShouldCloseSessionOnDistributedTransactionCompleted, since
// we nullify session TransactionContext, causing it to have nothing still holding it.
Dispose();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was missing and is a transaction leak in NHibernate 4.1. When scopes are disposed inside session life-span, the TransactionContext was nullified on session without being disposed, causing the AmbientTransaction clone to not be disposed of explicitly.

{
_logger.Warn(
"Synchronization failure, assuming it has been concurrently disposed and do not need sync anymore.",
ex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not witnessed this, but it may happen if between the test on _isDisposed and the call to WaitOne, the transaction completion event dispose the wait event.

AmbientTransation = null;
}
_waitEvent.Set();
_waitEvent.Dispose();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my tests, disposing right after signaling does not cause an exception for thread potentially waiting. (I have run all test first without the catch around WaitOne, and do not got any failures, all tests passed.)

// here. When having TransactionCompleted event, this event and the second phase
// tend to occur before reaching here. But some other NH cases demonstrate that
// TransactionCompleted may also occur "too late".
((ISessionImplementor) s).TransactionContext?.WaitOne();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall be s.GetSessionImplementor()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not the first time... Maybe should we remove all the other alike catches which still liter the tests.

void IEnlistmentNotification.Rollback(Enlistment enlistment)
#endregion

public void TransactionCompleted(object sender, TransactionEventArgs e)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TransactionCompleted shall be incorporated to IEnlistmentNotification.Commit/IEnlistmentNotification.Rollback/IEnlistmentNotification.InDoubt

e.Transaction.TransactionCompleted -= TransactionCompleted;
// This event may execute before second phase, so we cannot try to get the success from second phase.
// Using this event is required in case the prepare phase failed and called force rollback: no second
// phase would occur for this ressource.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shall be handled on force rollback then

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation on which conditions the second phase will not be called at all lacks. At first I have put it all in second phase callbacks, just to realize thanks to some tests there were cases for which it was not good.

So the forced rollback from session volatile prepare phase is maybe only one case in which second phase do not occur. For staying on the safe side, we would better keep that handling in TransactionCompleted.

The fact the completion actions were called from in-doubt too made me wonder whether in in-doubt case, the transaction completed call be not called either. There too, for staying on the safe side, I have decided to keep that call. This transaction factory is the result of many years of fiddling with many failing case for trying to support them all. During my own fiddling with it, I have seen that many design decisions in it, lacking comments and which were looking as overly complicating the code, where indeed needed for some reasons. So I had to kept them, but at least adding comments for most of them.

Moreover, handling transaction completion actions from the prepare phase in case of force rollback may cause other issues, since the session connection state at that point is unknown: has it already processed its own prepare phase? Is it processing it concurrently? Raising completion actions there would trigger its release, and I do not know if all connection implementations would support that.

This is also why I consider we should still provide other transaction factories even if we appear to get this one working: there too many points where we still act on the connection with concurrency risks. Providing other transaction factories would allow whoever wishing it to not take those risks, at the cost of some more explicit flushes calls.

@hazzik hazzik closed this Jun 2, 2017
@hazzik hazzik deleted the NH-2176 branch June 2, 2017 10:36
@hazzik hazzik removed this from the 5.0 milestone Aug 3, 2017
@hazzik hazzik added c: Tests and removed tests labels Oct 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants