Showing posts from December, 2020

Wordcount project in Spark with JAVA using Maven and Gradle

Follow these steps to set up a Wordcount project in Spark with Java using Maven and Gradle.

Download winutils.exe from https://github.com/steveloughran/winutils/tree/master/hadoop-3.0.0/bin
Move it to F:\BigData\hadoop\bin

Download pre-built Spark from http://spark.apache.org/downloads.html
Extract it and move it to F:\BigData

Set Environment Variables:
1. SPARK_HOME - F:\BigData\spark-3.0.1-bin-hadoop2.7
2. HADOOP_HOME - F:\BigData\hadoop

Add to Path:
1. %SPARK_HOME%\bin
2. %HADOOP_HOME%\bin

Confirm the installation via CMD. Enter:
spark-shell
It should print the Spark version and start the shell.

Set up the Hadoop scratch directory
Create the following folder: C:\tmp\hive
Navigate to F:\BigData\hadoop\bin
Set permissions by typing:
winutils.exe chmod -R 777 C:\tmp\hive

For the Maven project:
Create a new project: D:\Codes\Spark\First project
Create class SimpleApp.java in D:\Codes\Spark\First project\src\main\java

/* SimpleApp.java */
import org.apache.spark.api.java.*;...
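The environment setup above (SPARK_HOME, HADOOP_HOME, and winutils.exe under the Hadoop bin folder) can be sanity-checked before launching spark-shell. The following is a minimal sketch in plain Java with no Spark dependency. The class name SetupCheck and the method findProblems are hypothetical helpers introduced here for illustration and are not part of the original post. The sketch only checks that the variables are set and that the expected folders and file exist; it does not verify versions or permissions.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/* SetupCheck.java: hypothetical helper, not part of the original post. */
class SetupCheck {

    /** Returns readable problems found in env; an empty list means all checks passed. */
    static List<String> findProblems(Map<String, String> env) {
        List<String> problems = new ArrayList<>();

        // SPARK_HOME should point at the extracted Spark folder, which contains bin.
        String sparkHome = env.get("SPARK_HOME");
        if (sparkHome == null || sparkHome.isEmpty()) {
            problems.add("SPARK_HOME is not set");
        } else if (!Files.isDirectory(Paths.get(sparkHome, "bin"))) {
            problems.add("No bin folder under SPARK_HOME: " + sparkHome);
        }

        // HADOOP_HOME should contain bin\winutils.exe, as described in the steps.
        String hadoopHome = env.get("HADOOP_HOME");
        if (hadoopHome == null || hadoopHome.isEmpty()) {
            problems.add("HADOOP_HOME is not set");
        } else if (!Files.isRegularFile(Paths.get(hadoopHome, "bin", "winutils.exe"))) {
            problems.add("winutils.exe not found under HADOOP_HOME\\bin: " + hadoopHome);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Check the real environment of the current process.
        List<String> problems = findProblems(System.getenv());
        if (problems.isEmpty()) {
            System.out.println("Environment looks consistent with the steps above.");
        } else {
            problems.forEach(p -> System.out.println("Problem: " + p));
        }
    }
}
```

Run it from a new CMD window opened after setting the variables, since windows that were already open do not pick up changed environment variables. Taking the environment as a Map parameter, rather than reading System.getenv() inside the method, keeps the check easy to try with made-up values.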