Bayu Dwiyan Satria - Apache Spark Libraries


Apache Spark has as its architectural foundation the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines that is maintained in a fault-tolerant way. The DataFrame API was released as an abstraction on top of the RDD, followed by the Dataset API. In Spark 1.x the RDD was the primary application programming interface (API), but as of Spark 2.x use of the Dataset API is encouraged, even though the RDD API is not deprecated. The RDD technology still underlies the Dataset API.
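
To make the distinction concrete, here is a minimal, self-contained Java sketch (the class name and sample values are invented for illustration) that counts even numbers first through the RDD API and then through the Dataset API; it uses only standard Apache Spark classes and does not depend on this repository's code.

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

import java.util.Arrays;
import java.util.List;

public class RddVersusDataset {
    public static void main(String[] args) {
        // Local session for illustration; a real deployment would point at a cluster master.
        SparkSession spark = SparkSession.builder()
                .appName("rdd-versus-dataset")
                .master("local[*]")
                .getOrCreate();

        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);

        // RDD API: low-level functional transformations over distributed items.
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
        JavaRDD<Integer> rdd = jsc.parallelize(numbers);
        long evensViaRdd = rdd.filter(n -> n % 2 == 0).count();

        // Dataset API: the same logic, planned through Spark SQL's Catalyst optimizer.
        Dataset<Integer> dataset = spark.createDataset(numbers, Encoders.INT());
        long evensViaDataset = dataset.filter((FilterFunction<Integer>) n -> n % 2 == 0).count();

        System.out.println(evensViaRdd + " " + evensViaDataset); // both print 3
        spark.stop();
    }
}

Both counts come out the same; the difference is that the Dataset version is planned by Spark SQL, while the RDD version runs the lambda as written.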

Key          Description
Name         Bayu Dwiyan Satria - Apache Spark Libraries
Description  Bayu Dwiyan Satria libraries for the Apache Spark environment.
Site         Site Page

Table of Contents

- Dependencies
- Installation
- Development
- Contributing
- License

Dependencies

For more information see: The Central Repository.

Maven Central:

Name              Group                      Artifact                    Version
JUnit             junit                      junit                       4.12
SLF4j             org.slf4j                  slf4j-simple                1.7.25
SLF4j             org.slf4j                  slf4j-api                   1.7.25
HamCrest          org.hamcrest               hamcrest-core               1.3
HamCrest          org.hamcrest               hamcrest-library            1.3
BayuDwiyanSatria  com.bayudwiyansatria       core                        1.1.8
BayuDwiyanSatria  com.bayudwiyansatria       ml                          1.0
Scala             org.scala-lang             scala-library               2.11.12
Log4j             org.apache.logging.log4j   log4j-api                   2.11.12
ApacheSpark       org.apache.spark           spark-network-common_2.11   2.6.5
ApacheSpark       org.apache.spark           spark-catalyst_2.11         2.6.5
ApacheSpark       org.apache.spark           spark-launcher_2.11         2.6.5
ApacheSpark       org.apache.spark           spark-mllib_2.11            2.6.5
ApacheSpark       org.apache.spark           spark-yarn_2.11             2.6.5
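
As a hint at what the spark-mllib_2.11 artifact listed above pulls in, here is a small, self-contained Java sketch using MLlib's local linear-algebra types; the class name and values are illustrative only and nothing in it depends on this repository's own code.

import org.apache.spark.ml.linalg.Vector;
import org.apache.spark.ml.linalg.Vectors;

public class MllibVectorSketch {
    public static void main(String[] args) {
        // Local vectors from MLlib's linear-algebra package; no SparkSession is needed for these.
        Vector dense = Vectors.dense(1.0, 0.0, 3.0);
        Vector sparse = Vectors.sparse(3, new int[]{0, 2}, new double[]{1.0, 3.0});

        // Both representations describe the same vector, so the squared distance is 0.0.
        System.out.println(Vectors.sqdist(dense, sparse));
    }
}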

Installation

Install the dependencies:

Maven:

Configure the following dependency in the pom file:

<dependency>
  <groupId>com.bayudwiyansatria</groupId>
  <artifactId>apache-spark</artifactId>
  <version>${bayudwiyansatria.apache-spark.version}</version>
</dependency>

Gradle:

Configure the following dependency in the build.gradle file:

implementation 'com.bayudwiyansatria:apache-spark:${bayudwiyansatria.apache-spark.version}'

SBT

libraryDependencies += "com.bayudwiyansatria" % "apache-spark" % "${bayudwiyansatria.apache-spark.version}"

Important! This pulls the artifact from The Central Repository into your local repository. Be sure to swap out ${bayudwiyansatria.apache-spark.version} with the actual version of the Apache Spark Libraries.

For more information see: The Central Repository.

Development

- Release 1.0: September 2019
- Release 1.1: November 2019

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

Looking to contribute to our code but need some help? There are a few ways to get information.

License

MIT | BAYU DWIYAN SATRIA

Copyright © 2017 - 2019 Bayu Dwiyan Satria. All Rights Reserved.
