Skip to content

Expired session clean-up job causes deadlocks #838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
MichoFelipe opened this issue Jul 25, 2017 · 17 comments
Open

Expired session clean-up job causes deadlocks #838

MichoFelipe opened this issue Jul 25, 2017 · 17 comments
Assignees
Labels

Comments

@MichoFelipe
Copy link

Hi All,

We are having the following issue in our production environment.

Subject: ERROR: PreparedStatementCallback; SQL [UPDATE SPRING_SESSION SET LAST_ACCESS_TIME = ?, MAX_INACTIVE_INTERVAL = ?, PRINCIPAL_NAME = ? WHERE SESSION_ID = ?]; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysq...

Thread's Attributes Found:

DeadlockLoserDataAccessException Message: PreparedStatementCallback; SQL [UPDATE SPRING_SESSION SET LAST_ACCESS_TIME = ?, MAX_INACTIVE_INTERVAL = ?, PRINCIPAL_NAME = ? WHERE SESSION_ID = ?]; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:263)
org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:649)
org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:870)
org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:931)
org.springframework.session.jdbc.JdbcOperationsSessionRepository$2.doInTransactionWithoutResult(JdbcOperationsSessionRepository.java:318)
org.springframework.transaction.support.TransactionCallbackWithoutResult.doInTransaction(TransactionCallbackWithoutResult.java:34)
org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
org.springframework.session.jdbc.JdbcOperationsSessionRepository.save(JdbcOperationsSessionRepository.java:314)
org.springframework.session.jdbc.JdbcOperationsSessionRepository.save(JdbcOperationsSessionRepository.java:130)
org.springframework.session.web.http.SessionRepositoryFilter$SessionRepositoryRequestWrapper.commitSession(SessionRepositoryFilter.java:244)
org.springframework.session.web.http.SessionRepositoryFilter$SessionRepositoryRequestWrapper.access$100(SessionRepositoryFilter.java:214)
org.springframework.session.web.http.SessionRepositoryFilter$SessionRepositoryResponseWrapper.onResponseCommitted(SessionRepositoryFilter.java:202)
org.springframework.session.web.http.OnCommittedResponseWrapper.doOnResponseCommitted(OnCommittedResponseWrapper.java:226)
org.springframework.session.web.http.OnCommittedResponseWrapper.checkContentLength(OnCommittedResponseWrapper.java:216)
org.springframework.session.web.http.OnCommittedResponseWrapper.access$200(OnCommittedResponseWrapper.java:37)
org.springframework.session.web.http.OnCommittedResponseWrapper$SaveContextServletOutputStream.write(OnCommittedResponseWrapper.java:562)
com.fasterxml.jackson.core.json.UTF8JsonGenerator._flushBuffer(UTF8JsonGenerator.java:2033)
com.fasterxml.jackson.core.json.UTF8JsonGenerator.flush(UTF8JsonGenerator.java:1045)
com.fasterxml.jackson.databind.ObjectWriter.writeValue(ObjectWriter.java:942)
org.springframework.http.converter.json.AbstractJackson2HttpMessageConverter.writeInternal(AbstractJackson2HttpMessageConverter.java:267)
org.springframework.http.converter.AbstractGenericHttpMessageConverter.write(AbstractGenericHttpMessageConverter.java:100)
org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodProcessor.writeWithMessageConverters(AbstractMessageConverterMethodProcessor.java:232)
org.springframework.web.servlet.mvc.method.annotation.RequestResponseBodyMethodProcessor.handleReturnValue(RequestResponseBodyMethodProcessor.java:173)
org.springframework.web.method.support.HandlerMethodReturnValueHandlerComposite.handleReturnValue(HandlerMethodReturnValueHandlerComposite.java:81)
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:130)
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:963)
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:897)
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861)
javax.servlet.http.HttpServlet.service(HttpServlet.java:622)
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
javax.servlet.http.HttpServlet.service(HttpServlet.java:729)

We don't get the error in our local/DEV/QA/Non-Prod environment just in prod. env. We're replicating the data across nodes there. Could be that the reason?

Furthermore, we think that it's related to session expired.

Please, we appreciate your support :)

Thanks!

Michael

@vpavic vpavic self-assigned this Jul 26, 2017
@vpavic vpavic added the for: stack-overflow A question that's better suited to stackoverflow.com label Jul 26, 2017
@vpavic
Copy link
Contributor

vpavic commented Jul 26, 2017

Hi @MichoFelipe - can you provide more details about your environment? I assume you are using MySQL in a multi-master setup, such as Galera?

From my experience such setup can be quite problematic in high workloads when dealing with many writes (which are common with session stores).

Also this looks like a duplicate of #676.

@vpavic vpavic marked this as a duplicate of #676 Jul 26, 2017
@vpavic vpavic added the status: waiting-for-feedback We need additional information before we can continue label Jul 26, 2017
@MichoFelipe
Copy link
Author

Hi vpavic, yes We are using MySQL with Percona for data replication. The problem doesn't look to be due to the workload, as there wasn't much traffic when we got the error, seems to be related to the session expiration.
could you please explain us how the session expiration works?

@vpavic
Copy link
Contributor

vpavic commented Jul 27, 2017

The expirations are handled by JdbcOperationsSessionRepository#cleanUpExpiredSessions:

@Scheduled(cron = "${spring.session.cleanup.cron.expression:0 * * * * *}")
public void cleanUpExpiredSessions() {
	int deletedCount = this.transactionOperations.execute(transactionStatus ->
			JdbcOperationsSessionRepository.this.jdbcOperations.update(
					JdbcOperationsSessionRepository.this.deleteSessionsByLastAccessTimeQuery,
					System.currentTimeMillis()));

	if (logger.isDebugEnabled()) {
		logger.debug("Cleaned up " + deletedCount + " expired sessions");
	}
}

This is a @Scheduled operation which runs minutely by default and uses the following SQL statement to delete expired sessions:

DELETE FROM %TABLE_NAME%
WHERE MAX_INACTIVE_INTERVAL < (? - LAST_ACCESS_TIME) / 1000;

@vpavic vpavic removed the status: waiting-for-feedback We need additional information before we can continue label Sep 13, 2017
@vpavic
Copy link
Contributor

vpavic commented Sep 13, 2017

Closing as answered. If you have additional questions or feel that your original question isn't properly answered, please re-open the issue.

As a side note, we've recently introduced some optimizations of session cleanup SQL statement which might have a positive impact on reducing the possibility of deadlocks in multi-master setup like yours. See #847 and #872 for more details.

@vpavic vpavic closed this as completed Sep 13, 2017
@vanekat
Copy link

vanekat commented Sep 13, 2017

Hi @vpavic, how is that scheduled operation supposed to work on a cluster environment?
In our case we have deployed our application in 4 instances. And it looks to me that we are getting that error because the scheduled operation will be executed 4 times each minute, so as they could be running simultaneously if one instance is trying to update a record and another one is trying to delete the same record that could be causing the issue.
Reviewing the db logs the dba team identified that the update and delete statements keep following each other, showing that the application is working on the same record and hence the deadlock error is generated.

@vpavic
Copy link
Contributor

vpavic commented Sep 14, 2017

@vanekat Take a look at this comment and the @Scheduled annotation in particular - you can see that the cron expression is parameterized and can be customized using spring.session.cleanup.cron.expression property.

@vpavic
Copy link
Contributor

vpavic commented Oct 26, 2017

@MichoFelipe @vanekat It might be of your interest that there are proposed changes to Spring Session JDBC configuration too add first-class support for configuring cleanup cron via @EnableJdbcHttpSession. You can track that effort in #907.

@rochaalexandre
Copy link

Hi @vpavic, how is that scheduled operation supposed to work on a cluster environment?
In our case we have deployed our application in 4 instances. And it looks to me that we are getting that error because the scheduled operation will be executed 4 times each minute, so as they could be running simultaneously if one instance is trying to update a record and another one is trying to delete the same record that could be causing the issue.
Reviewing the db logs the dba team identified that the update and delete statements keep following each other, showing that the application is working on the same record and hence the deadlock error is generated.

Guys about this problem when running the clean job for expired session, we have solved in our project modifing the query DELETE_SESSIONS_BY_EXPIRY_TIME_QUERY from :
"DELETE FROM %TABLE_NAME% "WHERE EXPIRY_TIME < ?
to use this logic
"DELETE FROM GDIR.SPRING_SESSION s " +
" WHERE s.PRIMARY_ID IN " +
" (" +
" select PRIMARY_ID " +
" FROM " +
" (" +
" select ss.PRIMARY_ID as PRIMARY_ID from GDIR.SPRING_SESSION ss " +
" WHERE ss.EXPIRY_TIME < ? " +
" ORDER BY ss.PRIMARY_ID " +
" ) a" +
" )"``.
This follows a suggestion to the DB team, that using and order by clause in the delete will make the deletes follow the same order from multiple sessions.

@vpavic
Copy link
Contributor

vpavic commented Nov 14, 2020

While working on another issue, I've managed to reproduce this in an integration test, so I'll reopen this and classify it as a bug.

Unfortunately, the same test still fails if I apply the statement suggested in the previous comment. @alexandrececchin, can you confirm this solved the problem for you?

@vpavic vpavic reopened this Nov 14, 2020
@vpavic vpavic added in: jdbc type: bug A general bug and removed for: stack-overflow A question that's better suited to stackoverflow.com labels Nov 14, 2020
@vpavic vpavic added this to the 2.x milestone Nov 14, 2020
vpavic added a commit to vpavic/spring-session that referenced this issue Nov 14, 2020
vpavic added a commit to vpavic/spring-session that referenced this issue Nov 14, 2020
@vpavic
Copy link
Contributor

vpavic commented Nov 14, 2020

For completeness and discoverability, here are the issues that have been marked as duplicate of this one:

And to some degree also #1550.

Here's the branch (on my fork) containing the reproducer:

@rochaalexandre
Copy link

While working on another issue, I've managed to reproduce this in an integration test, so I'll reopen this and classify it as a bug.

Unfortunately, the same test still fails if I apply the statement suggested in the previous comment. @alexandrececchin, can you confirm this solved the problem for you?

Hi @vpavic,
It seems to have worked in the beginning. But after some time, we started to see the deadlock error again in our app logs. We still looking for a good solution to this. Maybe applying the distributed lock to this job is a good approach.

@vpavic
Copy link
Contributor

vpavic commented Nov 30, 2020

Hi @alexandrececchin, thanks for the confirmation.

I've experimented a bit with different approaches the last couple of days, but unfortunately haven't found a way to address this problem without making significant changes to JDBC repository implementation.

At this point, for anyone facing this issue, I'd suggest to disable the Spring Session's expired sessions clean-up job altogether (see ScheduledTaskRegistrar#CRON_DISABLED) and apply a workaround along these lines:

public void cleanUpExpiredSessions() {
	List<String> expiredSessionIds = this.jdbcOperations.queryForList(
			"SELECT SESSION_ID FROM SPRING_SESSION WHERE EXPIRY_TIME < ? LIMIT 200", String.class,
			System.currentTimeMillis());
	for (String expiredSessionId : expiredSessionIds) {
		deleteById(expiredSessionId);
	}
}

The deleteById method referenced here is SessionRepository#deleteById, so this should be a standalone component that operates on the public SessionRepository API.

Few things to consider:

  • this approach will result in n+1 SQL statements executed, where n is the number of expired sessions, however since this work will not be executed by a HTTP request serving thread but rather a background one, it shouldn't be an issue
  • the limit of fetched expired session ids should be set taking into consideration the usual peak number of expired sessions system observers within some period of time, as well as the desired frequency of the clean-up job execution
  • in a cluster of multiple running instances, the clean-up job should run only on one node at a time

I'll open an issue to consider doing a more substantial makeover of the session clean-up job.

@vpavic vpavic changed the title Deadlock found when trying to get lock with SQL [UPDATE SPRING_SESSION SET LAST_ACCESS_TIME = ?, MAX_INACTIVE_INTERVAL = ?, PRINCIPAL_NAME = ? WHERE SESSION_ID = ?]; Expired session clean-up job causes deadlocks Dec 1, 2020
@vpavic
Copy link
Contributor

vpavic commented Dec 1, 2020

I'll open an issue to consider doing a more substantial makeover of the session clean-up job.

For reference, I've created #1739.

@szbenceg
Copy link

Hi @alexandrececchin, thanks for the confirmation.

I've experimented a bit with different approaches the last couple of days, but unfortunately haven't found a way to address this problem without making significant changes to JDBC repository implementation.

At this point, for anyone facing this issue, I'd suggest to disable the Spring Session's expired sessions clean-up job altogether (see ScheduledTaskRegistrar#CRON_DISABLED) and apply a workaround along these lines:

public void cleanUpExpiredSessions() {
	List<String> expiredSessionIds = this.jdbcOperations.queryForList(
			"SELECT SESSION_ID FROM SPRING_SESSION WHERE EXPIRY_TIME < ? LIMIT 200", String.class,
			System.currentTimeMillis());
	for (String expiredSessionId : expiredSessionIds) {
		deleteById(expiredSessionId);
	}
}

The deleteById method referenced here is SessionRepository#deleteById, so this should be a standalone component that operates on the public SessionRepository API.

Few things to consider:

  • this approach will result in n+1 SQL statements executed, where n is the number of expired sessions, however since this work will not be executed by a HTTP request serving thread but rather a background one, it shouldn't be an issue
  • the limit of fetched expired session ids should be set taking into consideration the usual peak number of expired sessions system observers within some period of time, as well as the desired frequency of the clean-up job execution
  • in a cluster of multiple running instances, the clean-up job should run only on one node at a time

I'll open an issue to consider doing a more substantial makeover of the session clean-up job.

Can you explay why is it necessary to write a custom sql query? Or is it enough if only just one node deletes the session, and the others are turned off.

Can that also couse a problem if only just one node uses the built in scheduled job, or why did you write the custom query?

I ask because we face with the same deadlock issue mentioned above with Galera. Thank you for your reply in advance.

@vpavic
Copy link
Contributor

vpavic commented Aug 24, 2022

IIRC, according to different user reports we had involving deadlocks, it's not necessary to have more than one instance running bulk delete statement.

Can that also couse a problem if only just one node uses the built in scheduled job, or why did you write the custom query?

So the answer to this would be yes. But it really depends on your workload.

Or is it enough if only just one node deletes the session, and the others are turned off.

I would advise against this. Setting cron expressions to different values on each node is what we typically recommend.

@rwinch rwinch removed this from the 2.x milestone Nov 15, 2022
@blacelle
Copy link

Is spring.session.cleanup.cron.expression SpringBoot property still a way to change the default cron?

Could the default cron be adjusted to clean session at a random minute of each hour?

@kidotaka
Copy link

I have an idea, but not tested.

How about the following SQL?

WITH locked_sessions AS (
  SELECT PRIMARY_ID FROM %TABLE_NAME%
  WHERE EXPIRY_TIME < ?
  FOR UPDATE SKIP LOCKED
)
DELETE FROM %TABLE_NAME%
WHERE PRIMARY_ID IN (SELECT PRIMARY_ID FROM locked_sessions)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants