1. Download Spark 1.3.1 pre-built for Hadoop 2.6 (spark-1.3.1-hadoop-2.6).
2. Download the source code: https://github.com/sujee81/SparkApps
3. Import the "spark-load-from-db" project.
4. Modify pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.sparkexpert</groupId>
    <artifactId>spark-load-from-db</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <spark.version>1.3.1</spark.version>
        <mysql.version>5.1.25</mysql.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>${mysql.version}</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                    <compilerArgument>-Xlint:all</compilerArgument>
                    <showWarnings>true</showWarnings>
                    <showDeprecation>true</showDeprecation>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
5. Build and Run.
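The steps above wire up Spark SQL and the MySQL JDBC driver; the loading itself is done through the Spark 1.3.x data-source API. Below is a minimal sketch of what such a main class looks like. It is not the exact class from the SparkApps repository: the class name, database URL, credentials, and the "users" table are placeholder assumptions, so point them at your own MySQL instance before running. It needs a running MySQL server and the dependencies from the pom.xml above on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

// Hypothetical main class illustrating the Spark 1.3.x JDBC data source.
public class LoadFromDb {
    public static void main(String[] args) {
        // Local mode is convenient for trying the example out.
        SparkConf conf = new SparkConf()
                .setAppName("spark-load-from-db")
                .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // JDBC options; the URL, credentials, and table name are placeholders.
        Map<String, String> options = new HashMap<String, String>();
        options.put("driver", "com.mysql.jdbc.Driver");
        options.put("url", "jdbc:mysql://localhost:3306/testdb?user=root&password=secret");
        options.put("dbtable", "users");

        // Spark 1.3.x data-source API: load the table as a DataFrame.
        DataFrame df = sqlContext.load("jdbc", options);
        df.printSchema();
        df.show();

        sc.stop();
    }
}
```

Note that `sqlContext.load("jdbc", ...)` is the 1.3-era entry point; later Spark versions replaced it with the `read().jdbc(...)` builder, so this sketch is specific to the version used in this tutorial.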