Download Spark and unzip it, e.g.
spark_home=D:\udu\hk\spark-1.5.1
Spark needs the Hadoop jars. Download Hadoop binaries built for Windows (Hadoop 2.6.0) from
http://www.barik.net/archive/2015/01/19/172716/
Unzip Hadoop at some location, e.g.
hadoop_home=D:\udu\hk\hadoop-2.6.0
If your java_home or hadoop_home path contains space characters (' '), you will need to convert the path to its short (8.3) form:
- Create a batch script with the following contents
@ECHO OFF
echo %~s1
- Run the batch script with the java_home path as its argument to get the short path for java_home
- Run the batch script with the hadoop_home path as its argument to get the short path for hadoop_home
set java_home=short path obtained from the command above
set hadoop_home=short path obtained from the command above
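For example, assuming the two-line script above is saved as shortpath.cmd (a hypothetical name), a session might look like this; the Java path shown is illustrative, and the short form Windows reports on your machine will differ:

```bat
rem shortpath.cmd: echoes the 8.3 short form of its first argument (%~s1)
C:\> shortpath.cmd "C:\Program Files\Java\jdk1.8.0_66"
rem Prints the short form, typically beginning with C:\PROGRA~1\...
```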
Run the following command and copy the classpath it prints; you will need it in the next step:
%HADOOP_HOME%\bin\hadoop classpath
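For reference, the command prints a single semicolon-separated line of configuration directories and jar wildcards; with the example hadoop_home above it should look roughly like the following (truncated, entries will vary by Hadoop build):

```
D:\udu\hk\hadoop-2.6.0\etc\hadoop;D:\udu\hk\hadoop-2.6.0\share\hadoop\common\lib\*;D:\udu\hk\hadoop-2.6.0\share\hadoop\common\*;...
```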
Under spark_home\conf, create a file named "spark-env.cmd" like the one below, pasting the classpath copied in the previous step after SPARK_DIST_CLASSPATH= (note HADOOP_HOME must match the location where you unzipped Hadoop):
@echo off
set HADOOP_HOME=D:\udu\hk\hadoop-2.6.0
set PATH=%HADOOP_HOME%\bin;%PATH%
set SPARK_DIST_CLASSPATH=classpath copied from the hadoop classpath command
On the command prompt:
cd %spark_home%\bin
set SPARK_CONF_DIR=%SPARK_HOME%\conf
load-spark-env.cmd
spark-shell.cmd      (starts the Spark shell)
spark-submit.cmd     (submits a Spark job; see below)
Refer to the link below to create a Spark word-count example:
http://www.robertomarchetto.com/spark_java_maven_example
To run the word-count example (a Spark job written in Java) from the above URL:
spark-submit --class org.sparkexample.WordCount --master local[2] your_spark_job_jar Any_additional_parameters_needed_by_your_job_jar
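The job class itself is defined in the article above; as a rough sketch of what a word count does, the core logic (split lines into words, tally occurrences) looks like this in plain Java. The Spark version in the article performs these same steps in a distributed way via flatMap/mapToPair/reduceByKey; the class and method names here are illustrative, not the article's exact code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of word-count logic; the actual Spark job
// distributes the same split-and-tally steps across partitions.
public class WordCountSketch {
    // Split each line on whitespace and count word occurrences.
    public static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "be");
        Map<String, Integer> counts = countWords(lines);
        System.out.println(counts.get("be")); // 3
        System.out.println(counts.get("to")); // 2
    }
}
```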
References:
http://stackoverflow.com/questions/30906412/noclassdeffounderror-com-apache-hadoop-fs-fsdatainputstream-when-execute-spark-s
https://blogs.perficient.com/multi-shoring/blog/2015/05/07/setup-local-standalone-spark-node/
http://nishutayaltech.blogspot.com/2015/04/how-to-run-apache-spark-on-windows7-in.html