Remove spylon-kernel from all images. #1729
Conversation
@mathbunnyru When this PR is merged, I will start working on a new PR to upgrade to Spark 3.3.0.
In general, this LGTM 👍 When we get rid of leftovers, I would probably like to squash-merge this, to make it easier to see what needs to be added back in the future (if another kernel is added).
Thank you for your work to maintain these parts, and thank you for making a dedicated PR for something like this - that is great for visibility, I think, nice! ❤️ 🎉 I'm not confident about the situation overall, as I entirely lack experience with Spark, but this looks reasonable to me. I suggest:
When updating to Spark 3.3.0 with Scala 2.13, tests for the spylon kernel are failing:
[100%] FAILED tests/all-spark-notebook/test_spark_notebooks.py::test_nbconvert[local_spylon]
Ammonite-spark has support for Scala 2.13; it then runs Spark by using Ivy. Toree is another kernel for Spark and Scala; it has support for Java versions 8 and 11, but we are going to upgrade to 17 now. Something tells me that there is something wrong with the CI. I have put this PR in WIP mode so that we can find a proper solution to this problem.
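For anyone who wants to reproduce just that failing case locally, something like the following should work, assuming a checkout of this repository with the test requirements installed (the `-k` filter is my own shorthand, not the CI invocation):

```python
# Hypothetical local reproduction of the failure above; assumes a repo
# checkout with pytest and the test requirements installed. The -k
# filter narrows the run to the spylon case.
import pytest

pytest.main([
    "tests/all-spark-notebook/test_spark_notebooks.py",
    "-k", "local_spylon",
])
```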
Thanks! Btw, I think avoiding adding something else for Scala is the best course of action until it is a clear wish from an active user of these images. Adding it before that seems premature.
Yes, but when we upgrade the pyspark-notebook (Python on Spark) image, the tests also run for the all-spark-notebook. The all-spark-notebook covers PySpark, R, and Scala. Scala is provided through the unmaintained spylon-kernel, and this is the problem. Now, I don't really want to remove support for Scala in all images, but it is what the Spark tests exercise.
@consideRatio I have now updated the description for this PR. Are there still any questions?
I am a little concerned about leaving all the Spark Scala users out in the cold. Hence, I would update our documentation to mention Almond or Databricks. Thx
@Bidek56 Yes, but at the moment there are no Scala kernels that have support for Spark 3.3.X with Java 17 and Scala 2.13. That's the reason why your PR failed to build now. You can link to the Almond kernel, but it won't help users who want to run the latest Spark version. We can apply this PR, which removes Scala. And when the Almond kernel gets support for the latest Spark version, we can make a new PR with Scala support through the Almond kernel.
It was decided in prior discussions not to include Almond in this repo since it has its own Dockerfile. |
I agree with @consideRatio. When another kernel works with our images, we will try to document it or merge it into our images. Merging this one - thank you for your contribution 👍
Squash merging this to allow easier reverts if needed. |
Describe your changes
This is part 1 of 2 for updating Apache Spark to 3.3.0 with Scala 2.13 and Java version 17.
This PR will remove the spylon-kernel, which means that there will be no more Scala support.
The spylon-kernel was last updated in 2017 and is inactive.
When updating to Spark 3.3.0 with Scala 2.13, tests for the spylon kernel are failing.
[100%] FAILED tests/all-spark-notebook/test_spark_notebooks.py::test_nbconvert[local_spylon]
When we upgrade the pyspark-notebook (Python on Spark) image, the tests also run for the all-spark-notebook. The all-spark-notebook covers PySpark, R, and Scala. Scala is provided through the unmaintained spylon-kernel, and this is the reason why we need to remove it.
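Roughly, the failing test executes a Scala notebook through nbconvert inside the image. Here is a minimal sketch of that idea, not the repository's actual test code (the notebook path and kernel name are assumptions):

```python
# Illustrative sketch only, not this repo's real test: execute a
# notebook on the spylon kernel and fail if any cell errors out.
# The notebook path and kernel name are assumed for the example.
import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

nb = nbformat.read("data/local_spylon.ipynb", as_version=4)
ep = ExecutePreprocessor(timeout=600, kernel_name="spylon-kernel")
ep.preprocess(nb)  # raises CellExecutionError on a Spark/Scala mismatch
```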
The only reason we have the spylon kernel is to support notebooks that use Spark from Scala.
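For context, a spylon-kernel notebook typically starts with an `%%init_spark` cell whose body is Python-syntax launcher configuration (per the spylon-kernel README; the values below are illustrative):

```python
%%init_spark
# spylon-kernel magic: configure the Spark launcher in Python before
# the Scala interpreter starts; the values here are illustrative
launcher.master = "local[*]"
launcher.conf.spark.executor.cores = 2
```

After this cell, subsequent cells run Scala with a `spark` session available.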
However, Apache Spark has now voted to create a decoupled client (see the JIRA issue).
So for the next release, Apache Spark 3.4.0, this issue will hopefully be solved by the Spark client.
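As a hedged sketch of what that decoupled client could look like from a notebook, using the PySpark 3.4+ Spark Connect builder (the server URL is illustrative):

```python
# Sketch of the decoupled-client idea (Spark Connect): a thin client
# talks to a remote Spark server, so the kernel no longer needs a
# matching in-process JVM/Scala stack. Assumes pyspark>=3.4 with the
# connect extra installed; the endpoint URL is illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://spark-server:15002").getOrCreate()
spark.range(5).show()
```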
Issue ticket if applicable
Upgrading Spark -> 3.3, Hadoop -> 3, Scala -> 2.13, Java -> 17
Checklist (especially for first-time contributors)