Commit 4c5efcf

Merge pull request #532 from andyk/master
SPARK-715: Adds instructions for building with Maven to documentation
2 parents 3558849 + 446b801 commit 4c5efcf

4 files changed, 72 additions and 0 deletions

README.md

Lines changed: 2 additions & 0 deletions
@@ -17,6 +17,8 @@ which is packaged with it. To build Spark and its example programs, run:

     sbt/sbt package

+Spark also supports building using Maven. If you would like to build using Maven, see the [instructions for building Spark with Maven](http://spark-project.org/docs/latest/building-with-maven.html) in the Spark documentation.
+
 To run Spark, you will need to have Scala's bin directory in your `PATH`, or
 you will need to set the `SCALA_HOME` environment variable to point to where
 you've installed Scala. Scala must be accessible through one of these

docs/_layouts/global.html

Lines changed: 1 addition & 0 deletions
@@ -90,6 +90,7 @@
 <li class="dropdown">
 <a href="api.html" class="dropdown-toggle" data-toggle="dropdown">More<b class="caret"></b></a>
 <ul class="dropdown-menu">
+<li><a href="building-with-maven.html">Building Spark with Maven</a></li>
 <li><a href="configuration.html">Configuration</a></li>
 <li><a href="tuning.html">Tuning Guide</a></li>
 <li><a href="bagel-programming-guide.html">Bagel (Pregel on Spark)</a></li>

docs/building-with-maven.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+---
+layout: global
+title: Building Spark with Maven
+---
+
+* This will become a table of contents (this text will be scraped).
+{:toc}
+
+Building Spark using Maven requires Maven 3 (the build process is tested with Maven 3.0.4) and Java 1.6 or newer.
+
+Building with Maven requires that a Hadoop profile be specified explicitly at the command line; there is no default. There are two profiles to choose from: one for building against Hadoop 1 and one for Hadoop 2.
+
+For Hadoop 1 (using 0.20.205.0) use:
+
+    $ mvn -Phadoop1 clean install
+
+
+For Hadoop 2 (using 2.0.0-mr1-cdh4.1.1) use:
+
+    $ mvn -Phadoop2 clean install
+
+The Maven build uses the scala-maven-plugin, which supports incremental and continuous compilation. For example,
+
+    $ mvn -Phadoop2 scala:cc
+
+…should run continuous compilation (i.e. wait for changes). However, this has not been tested extensively.
+
+## Spark Tests in Maven ##
+
+Tests are run by default via the scalatest-maven-plugin. With this you can do things like:
+
+Skip test execution (but not compilation):
+
+    $ mvn -DskipTests -Phadoop2 clean install
+
+To run a specific test suite:
+
+    $ mvn -Phadoop2 -Dsuites=spark.repl.ReplSuite test
+
+
+## Setting up JVM Memory Usage Via Maven ##
+
+You might run into the following errors if you're using a vanilla installation of Maven:
+
+    [INFO] Compiling 203 Scala sources and 9 Java sources to /Users/andyk/Development/spark/core/target/scala-2.9.2/classes...
+    [ERROR] PermGen space -> [Help 1]
+
+    [INFO] Compiling 203 Scala sources and 9 Java sources to /Users/andyk/Development/spark/core/target/scala-2.9.2/classes...
+    [ERROR] Java heap space -> [Help 1]
+
+To fix these, you can do the following:
+
+    export MAVEN_OPTS="-Xmx1024m -XX:MaxPermSize=128M"
+
+
+## Using With IntelliJ IDEA ##
+
+This setup works fine in IntelliJ IDEA 11.1.4. After opening the project via the pom.xml file in the project root folder, you only need to activate either the hadoop1 or hadoop2 profile in the "Maven Properties" popout. We have not tried Eclipse/Scala IDE with this.
+
+## Building Spark Debian Packages ##
+
+The Maven build includes support for building a Debian package containing a 'fat-jar' which includes the repl, the examples and bagel. This can be created by specifying the deb profile:
+
+    $ mvn -Phadoop2,deb clean install
+
+The Debian package can then be found under repl/target. We added the short commit hash to the file name so that we can distinguish individual packages built for SNAPSHOT versions.
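
The profile requirement and the `MAVEN_OPTS` setting described in `docs/building-with-maven.md` could be combined in a small wrapper. The sketch below is not part of this commit, and the function names `spark_mvn_cmd` and `default_maven_opts` are hypothetical. It only prints the Maven command instead of running it, so it can be tried without a Maven installation.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Spark or of this commit.

# Print the Maven command line for a Hadoop profile. The build has no default
# profile, so anything other than hadoop1 or hadoop2 is rejected.
spark_mvn_cmd() {
    profile="$1"
    case "$profile" in
        hadoop1|hadoop2) ;;
        *)
            echo "usage: spark_mvn_cmd hadoop1|hadoop2 [maven args...]" >&2
            return 1
            ;;
    esac
    shift
    # Fall back to the goals used in the documentation when none are given.
    [ "$#" -gt 0 ] || set -- clean install
    printf '%s\n' "mvn -P$profile $*"
}

# Print the MAVEN_OPTS value to use: the caller's own setting if it is
# non-empty, otherwise the value suggested in the documentation.
default_maven_opts() {
    printf '%s\n' "${MAVEN_OPTS:--Xmx1024m -XX:MaxPermSize=128M}"
}
```

For example, `spark_mvn_cmd hadoop2` prints `mvn -Phadoop2 clean install`, while `spark_mvn_cmd hadoop3` fails with a usage message. Rejecting an unknown profile up front mirrors the documentation's point that there is no default profile.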

docs/index.md

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,8 @@ Spark uses [Simple Build Tool](https://github.com/harrah/xsbt/wiki), which is bu

     sbt/sbt package

+Spark also supports building using Maven. If you would like to build using Maven, see the [instructions for building Spark with Maven](building-with-maven.html).
+
 # Testing the Build

 Spark comes with a number of sample programs in the `examples` directory.
@@ -72,6 +74,7 @@ of `project/SparkBuild.scala`, then rebuilding Spark (`sbt/sbt clean compile`).

 **Other documents:**

+* [Building Spark With Maven](building-with-maven.html): Build Spark using the Maven build tool
 * [Configuration](configuration.html): customize Spark via its configuration system
 * [Tuning Guide](tuning.html): best practices to optimize performance and memory use
 * [Bagel](bagel-programming-guide.html): an implementation of Google's Pregel on Spark
