PySpark: Eclipse Integration

This tutorial will guide you through configuring PySpark on Eclipse.

First you need to install Eclipse.

You need to add “pyspark.zip” and “py4j-0.10.7-src.zip” to “Libraries” for the Python Interpreter. Both archives ship with Spark in its “python/lib” folder.

Next you need to configure the environment variables for PySpark.
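
As a rough sketch of which variables are usually involved, the helper below builds them from a Spark install directory. The function name `pyspark_env` and the path `/opt/spark` are assumptions made for illustration; the two zip names are the ones added to the interpreter above.

```python
import os


def pyspark_env(spark_home, python_exe="python3"):
    """Return the environment variables PySpark typically needs.

    spark_home is wherever Spark is unpacked (an assumption; adjust it).
    """
    lib = os.path.join(spark_home, "python", "lib")
    return {
        "SPARK_HOME": spark_home,
        # Same archives that were added to the interpreter's Libraries.
        "PYTHONPATH": os.pathsep.join([
            os.path.join(lib, "pyspark.zip"),
            os.path.join(lib, "py4j-0.10.7-src.zip"),
        ]),
        # Interpreter used by the Spark workers.
        "PYSPARK_PYTHON": python_exe,
    }


if __name__ == "__main__":
    # "/opt/spark" is a placeholder path, not something Spark requires.
    for name, value in pyspark_env("/opt/spark").items():
        print(f"{name}={value}")
```

The printed name/value pairs can then be entered wherever you keep environment variables for your Eclipse Python runs.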

Test that it works!

from pyspark.sql import SparkSession


def init_spark():
    # Reuse an existing session if there is one, otherwise start a new one.
    spark = SparkSession.builder.appName("HelloWorld").getOrCreate()
    sc = spark.sparkContext
    return spark, sc


if __name__ == '__main__':
    spark, sc = init_spark()
    nums = sc.parallelize([1, 2, 3, 4])
    # Squares each element; prints [1, 4, 9, 16].
    print(nums.map(lambda x: x * x).collect())
    spark.stop()

Eclipse Installation

In this tutorial I will show you how to install Eclipse on Ubuntu 16.04.

Install JDK 8

sudo apt-get install openjdk-8-jdk

Download the Eclipse Oxygen installer.

tar -xzvf eclipse-inst-linux64.tar.gz
~/eclipse-installer/eclipse-inst

Install Eclipse

Eclipse Desktop Shortcut

cd ~/Desktop
touch eclipse.desktop
chmod u+x eclipse.desktop
nano eclipse.desktop

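#Add the below to the file. Desktop entries do not expand "~",
#so use absolute paths (replace <your-user> with your user name).

[Desktop Entry]
Type=Application
Name=Eclipse
Icon=/home/<your-user>/eclipse/java-oxygen/eclipse/icon.xpm
Exec=/home/<your-user>/eclipse/java-oxygen/eclipse/eclipse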
Terminal=false
Categories=Development;IDE;Java;
StartupWMClass=Eclipse
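
Because a desktop entry is not run through a shell, “~” is not expanded in “Icon” and “Exec”. If you would rather not type the absolute paths by hand, a small script can generate the entry for you. This is only a sketch: the function name `desktop_entry` is made up for this example, and the install directory is the one used in this tutorial.

```python
import os


def desktop_entry(eclipse_dir):
    """Build the text of eclipse.desktop with absolute Icon/Exec paths."""
    eclipse_dir = os.path.abspath(os.path.expanduser(eclipse_dir))
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        "Name=Eclipse",
        f"Icon={os.path.join(eclipse_dir, 'icon.xpm')}",
        f"Exec={os.path.join(eclipse_dir, 'eclipse')}",
        "Terminal=false",
        "Categories=Development;IDE;Java;",
        "StartupWMClass=Eclipse",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Install location used in this tutorial; change it if yours differs.
    print(desktop_entry("~/eclipse/java-oxygen/eclipse"), end="")
```

Redirect the output into “~/Desktop/eclipse.desktop” and make the file executable as shown above.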

Eclipse/Maven: Jacoco Integration

This tutorial will guide you through configuring Jacoco in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “EclEmma”.

Next, click Install, read and accept the license agreement, and let the installation finish. You will then need to restart Eclipse.

Once Eclipse opens again you can edit the “Code Coverage” settings from “Window/Preferences”.

You can now run “Code Coverage” from Eclipse by right-clicking your project. In my case the result is empty because I have not written any unit tests yet :(.

 

Pom.xml

Build

<build>
	<plugins>
		<plugin>
			<groupId>org.jacoco</groupId>
			<artifactId>jacoco-maven-plugin</artifactId>
			<version>0.8.1</version>
			<configuration>
				<!-- Path to the output file for execution data. (Used in initialize 
					phase) ${project.build.directory} already resolves to target -->
				<destFile>${project.build.directory}/coverage-reports/jacoco-unit.exec</destFile>
				<!-- File with execution data. (Used in package phase) -->
				<dataFile>${project.build.directory}/coverage-reports/jacoco-unit.exec</dataFile>
				<excludes>
				</excludes>
			</configuration>
			<executions>
				<execution>
					<id>jacoco-initialization</id>
					<phase>initialize</phase>
					<goals>
						<!-- https://www.eclemma.org/jacoco/trunk/doc/prepare-agent-mojo.html -->
						<goal>prepare-agent</goal>
					</goals>
				</execution>
				<execution>
					<id>jacoco-site</id>
					<phase>package</phase>
					<goals>
						<!-- https://www.eclemma.org/jacoco/trunk/doc/report-mojo.html -->
						<goal>report</goal>
					</goals>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>
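
Besides the HTML pages, the “report” goal also writes an XML version of the report (typically “target/site/jacoco/jacoco.xml”, assuming the default output directory). As a minimal sketch, the overall line coverage can be read from it like this; the function name `line_coverage` and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET


def line_coverage(report_xml):
    """Return overall line coverage (0.0 to 1.0) from JaCoCo XML text."""
    root = ET.fromstring(report_xml)
    # Report-wide totals are <counter> elements directly under <report>.
    for counter in root.findall("counter"):
        if counter.get("type") == "LINE":
            missed = int(counter.get("missed"))
            covered = int(counter.get("covered"))
            total = missed + covered
            return covered / total if total else 0.0
    raise ValueError("no LINE counter found in report")


if __name__ == "__main__":
    # Tiny hand-written sample, not output from a real build.
    sample = """<report name="demo">
      <counter type="INSTRUCTION" missed="30" covered="70"/>
      <counter type="LINE" missed="5" covered="15"/>
    </report>"""
    print(f"line coverage: {line_coverage(sample):.0%}")  # line coverage: 75%
```

For a real report, read the file contents and pass them to the function instead of the sample string.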

 

 

 

Eclipse/Maven: FindBugs/SpotBugs Integration

This tutorial will guide you through configuring FindBugs/SpotBugs in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “SpotBugs”.

Next, click Install, read and accept the license agreement, and let the installation finish. You will then need to restart Eclipse.

Once Eclipse opens again, right-click the project(s) you want to activate FindBugs/SpotBugs for and click “Properties”. Click “SpotBugs” and then make the following changes.

Now you can run SpotBugs by right-clicking your project and selecting “SpotBugs”, then “Find Bugs”.

Pom.xml

Reporting

<reporting>
	<plugins>
		<plugin>
			<groupId>com.github.spotbugs</groupId>
			<artifactId>spotbugs-maven-plugin</artifactId>
			<version>3.1.3</version>
		</plugin>
	</plugins>
</reporting>

Build

<build>
	<plugins>
		<plugin>
			<groupId>com.github.spotbugs</groupId>
			<artifactId>spotbugs-maven-plugin</artifactId>
			<version>3.1.3</version>
			<dependencies>
				<dependency>
					<groupId>com.github.spotbugs</groupId>
					<artifactId>spotbugs</artifactId>
					<version>3.1.3</version>
				</dependency>
			</dependencies>
			<configuration>
				<effort>Max</effort>
				<threshold>Low</threshold>
				<failOnError>true</failOnError>
				<plugins>
					<plugin>
						<groupId>com.h3xstream.findsecbugs</groupId>
						<artifactId>findsecbugs-plugin</artifactId>
						<version>LATEST</version>
					</plugin>
				</plugins>
			</configuration>
		</plugin>
	</plugins>
</build>

Maven Commands

mvn spotbugs:spotbugs

#Generates the report site
mvn site
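
The “spotbugs:spotbugs” goal also leaves an XML result file in the build directory (typically “target/spotbugsXml.xml”). If you want a quick summary without opening the site, a sketch like the following counts the findings per category. The function name `bugs_by_category` and the sample data are made up for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def bugs_by_category(spotbugs_xml):
    """Count BugInstance elements per category in SpotBugs XML text."""
    root = ET.fromstring(spotbugs_xml)
    return Counter(
        bug.get("category", "UNKNOWN") for bug in root.iter("BugInstance")
    )


if __name__ == "__main__":
    # Hand-written sample with made-up findings, not real SpotBugs output.
    sample = """<BugCollection>
      <BugInstance type="NP_NULL_ON_SOME_PATH" priority="1" category="CORRECTNESS"/>
      <BugInstance type="SQL_INJECTION_JDBC" priority="1" category="SECURITY"/>
      <BugInstance type="DM_DEFAULT_ENCODING" priority="2" category="I18N"/>
      <BugInstance type="HARD_CODE_PASSWORD" priority="2" category="SECURITY"/>
    </BugCollection>"""
    for category, count in bugs_by_category(sample).most_common():
        print(f"{category}: {count}")
```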

Eclipse/Maven: PMD Integration

This tutorial will guide you through configuring PMD in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “PMD”.

Next, click Install, read and accept the license agreement, and let the installation finish. You will then need to restart Eclipse.

Once Eclipse opens again, right-click the project(s) you want to activate PMD for and click “Properties”. Click “PMD” and then click “Enable PMD for this project”. You will need to create a rule set. To do that go here.

Pom.xml

Reporting

You will need both reporting plugins in your project. “maven-jxr-plugin” generates the source cross-reference (xRef) that the PMD report links to; without it the PMD plugin warns that it cannot find the xRef.

<reporting>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-pmd-plugin</artifactId>
			<version>3.9.0</version>
		</plugin>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-jxr-plugin</artifactId>
			<version>2.5</version>
		</plugin>
	</plugins>
</reporting>

Build

You will need to configure the following to use the “mvn pmd:???” commands.

<build>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-pmd-plugin</artifactId>
			<version>3.9.0</version>
			<configuration>
				<failOnViolation>true</failOnViolation>
				<verbose>true</verbose>
				<targetJdk>1.8</targetJdk>
				<includeTests>false</includeTests>
				<excludes>
				</excludes>
				<excludeRoots>
					<excludeRoot>target/generated-sources/stubs</excludeRoot>
				</excludeRoots>
			</configuration>
			<executions>
				<execution>
					<phase>test</phase>
					<goals>
						<goal>pmd</goal>
						<goal>cpd</goal>
						<goal>cpd-check</goal>
						<goal>check</goal>
					</goals>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>

Maven Commands

mvn pmd:check
mvn pmd:pmd

#cpd checks for copy paste issues

mvn pmd:cpd-check
mvn pmd:cpd

#Generates the report site
mvn site

Eclipse/Maven: CheckStyle Integration

This tutorial will guide you through configuring CheckStyle in your Maven application and installing the Eclipse plugin.

First, open the Eclipse Marketplace and search for “Checkstyle”.

Next, click Install, read and accept the license agreement, and let the installation finish. You will then need to restart Eclipse.

Once Eclipse opens again, right-click the project(s) you want to activate CheckStyle for and activate it. There are also properties you can configure through Eclipse’s preferences, and I suggest you go through them. You can also customize the checkstyle configuration or write your own. Up to you.

Pom.xml

Build

When you run “mvn checkstyle:check” it will run the checks and fail the build if you have any issues.

<build>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-checkstyle-plugin</artifactId>
			<version>3.0.0</version>
			<executions>
				<execution>
					<id>validate</id>
					<phase>validate</phase>
					<configuration>
						<encoding>UTF-8</encoding>
						<consoleOutput>true</consoleOutput>
						<failsOnError>true</failsOnError>
						<linkXRef>false</linkXRef>
					</configuration>
					<goals>
						<goal>check</goal>
					</goals>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>

Reporting

With the following configuration you can generate an HTML report by running “mvn checkstyle:checkstyle”.

<reporting>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-checkstyle-plugin</artifactId>
			<version>3.0.0</version>
			<reportSets>
				<reportSet>
					<reports>
						<report>checkstyle</report>
					</reports>
				</reportSet>
			</reportSets>
		</plugin>
	</plugins>
</reporting>

Python IDE Installation for Eclipse

This tutorial will guide you through configuring Eclipse for Python. Ensure you have followed the tutorial on installing Eclipse first.

You need to install PyDev. Open Eclipse and click Help–>Install New Software. In “Work with” enter “http://pydev.org/updates” and click Add. Follow the prompts and you are done for now.

You also need to ensure you have installed Python 3.6. You can get it from here. You should add the environment variable “PYTHON_HOME” and point it at the Python directory. I would also add “%PYTHON_HOME%\;%PYTHON_HOME%\Scripts” to your “path” environment variable.

I would also install PIP at this time, just to make sure you have everything you will need. Download “get-pip.py” and put it into the “Scripts” folder of your Python 3.6 installation. Navigate into your Scripts folder and then run “python get-pip.py”.
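
To confirm that the interpreter Eclipse will use is recent enough and can see pip, you can run a quick check such as the one below. It is just a sketch; the function name `check_python` is made up for this example, and the minimum version matches the Python 3.6 mentioned above.

```python
import importlib.util
import sys


def check_python(min_version=(3, 6)):
    """Return a list of problems; an empty list means the setup looks fine."""
    problems = []
    if sys.version_info < min_version:
        found = ".".join(map(str, sys.version_info[:3]))
        wanted = ".".join(map(str, min_version))
        problems.append(f"Python {found} is older than {wanted}")
    if importlib.util.find_spec("pip") is None:
        problems.append("pip is not importable by this interpreter")
    return problems


if __name__ == "__main__":
    issues = check_python()
    print("\n".join(issues) if issues else f"OK: {sys.executable}")
```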

Once PyDev is installed and you have restarted Eclipse, open the PyDev perspective. Go to Window–>Perspective–>Open Perspective–>Other. Then select PyDev and click Ok.

Optional:

TypeScript IDE:

Used for React. It’s really handy. Open Eclipse and click Help–>Install New Software. In “Work with” enter “http://oss.opensagres.fr/typescript.ide/1.1.0/” and click Add. Make sure you select “Embed Node.js” and “TypeScript IDE”, then follow the prompts and you are done for now.

HTML Editor:

Used for HTML files. Click Help–>Eclipse Marketplace. Search for “HTML Editor”, then click Install. After it is installed and Eclipse has restarted, click Window–>Preferences–>Web. Under “HTML Files” change the encoding to “ISO 10646/Unicode(UTF-8)”. Under “Editor” add “div”. You can get more info and configuration from here.

Java IDE Installation for Eclipse

This tutorial will guide you through configuring Eclipse for Java. Ensure you have followed the tutorial on installing Eclipse first.

You should open the Java and Debug perspectives. To open the Java perspective go to “Window”–>“Open Perspective”–>“Java”. To open “Debug” go to “Window”–>“Open Perspective”–>“Other”, select “Debug” and hit “Ok”.

You should also install the Maven integration. Go to “Help”–>“Install New Software”. In “Work with” type “http://m2eclipse.sonatype.org/sites/m2e” and hit “Add”. Then enter the name “Maven2” or whatever name you want and hit “Ok”. Then check “Maven Integration for Eclipse” and hit “Next”. Hit “Next” again on “Install Details”, accept the license agreement and hit “Finish”. You will need to restart.

If you want you can also open the “Project Explorer”, “Markers” and “Problems” views from “Window”–>“Show View”–>“Other”.

FindBugs is also a nice to have and I recommend having it :). Go to “Help”–>“Install New Software”. In “Work with” type “http://findbugs.cs.umd.edu/eclipse” and hit “Add”. Then enter the name “FindBugs” or whatever name you want and hit “Ok”. Then check “FindBugs” and hit “Next”. Hit “Next” again on “Install Details”, accept the license agreement and hit “Finish”. You will need to restart.

You should open the “FindBugs” perspective as well. To do that go to “Window”–>“Open Perspective”–>“Other”, select “FindBugs” and hit “Ok”.

Don’t forget to lock Eclipse to the launcher if you want.
