  1. Apache Spark™ - Unified Engine for large-scale data analytics

    Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.

  2. Quick Start - Spark 4.0.1 Documentation

    Spark’s shell provides a simple way to learn the API, as well as a powerful tool to analyze data interactively. It is available in either Scala (which runs on the Java VM and is thus a good way …

  3. Downloads - Apache Spark

    Spark Docker images are available from Docker Hub under the accounts of both The Apache Software Foundation and Official Images. Note that these images contain non-ASF software …

  4. Overview - Spark 4.0.1 Documentation

    If you’d like to build Spark from source, visit Building Spark. Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS), and it should run on any platform that runs a …

  5. PySpark Overview — PySpark 4.0.1 documentation - Apache Spark

    Spark Connect is a client-server architecture within Apache Spark that enables remote connectivity to Spark clusters from any application. PySpark provides the client for the Spark …

  6. RDD Programming Guide - Spark 4.0.1 Documentation

    Spark supports two types of shared variables: broadcast variables, which can be used to cache a value in memory on all nodes, and accumulators, which are variables that are only “added” to, …

  7. Documentation | Apache Spark

    Apache Spark™ Documentation. Setup instructions, programming guides, and other documentation are available for each stable version of Spark below: Spark …

  8. Configuration - Spark 4.0.1 Documentation

    Spark provides three locations to configure the system: Spark properties control most application parameters and can be set by using a SparkConf object, or through Java system properties. …

  9. Examples - Apache Spark

    Spark allows you to perform DataFrame operations with programmatic APIs, write SQL, perform streaming analyses, and do machine learning. Spark saves you from learning multiple …

  10. Spark Streaming - Spark 4.0.1 Documentation - Apache Spark

    Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams. Data can be ingested from many sources …