An array is a fixed-size data structure that supports O(1) access to any element.

Recommendations

The vector package is what you should reach for if you only need single dimensional arrays, a beginner friendly introduction can be found here. If you need multidimensional arrays, use array, repa, or accelerate. If you need arrays of bytes specifically, use bytestring.

Notes on performance:

  • You should almost always use the unboxed version of the data structure unless you need lazily computed content. ByteString is already unboxed.

  • Benchmarks of various sequence-like data structured can be found here.

  • If you want speed, use accelerate. It generates parallel SIMD-vectorised code for multicore CPUs and CUDA GPUs.

    • For a lighter-weight alternative to accelerate, with still pretty good performance, use repa.
edit description
or press Ctrl+Enter to savemarkdown supported
#
vector (Hackage)
other
move item up move item down edit item info delete item
Summary edit summary

The de facto standard array type used in Haskell. Fast. Used by many libraries.

Note that it's not quite like C++ vector. (For instance, it doesn't support fast append.)

A tutorial can be found at https://haskell-lang.org/library/vector.

Summary quit editing summary
Prosedit prosquit editing pros
  • Supports list-like operations and has an extensive API.
    move trait up move trait down edit trait delete trait
  • Well-optimised; supports stream fusion (when you create an array and then consume it, sometimes it won't actually have to allocate any memory for the array).
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Consedit consquit editing cons
  • Doesn't support multidimensional arrays.
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Ecosystemedit ecosystem
Ecosystemquit editing ecosystemor press Ctrl+Enter to savemarkdown supported
Notes
collapse notesedit notes

Links

Imports

-- ordinary vectors
import qualified Data.Vector as V

-- mutable vectors
import qualified Data.Vector.Mutable as VM

-- unboxed vectors
import qualified Data.Vector.Unboxed as U

-- mutable unboxed vectors
import qualified Data.Vector.Unboxed.Mutable as UM

Gotchas

Unlike in C++, growing a vector by one element isn't a constant-time operation (in C++ it is most of the time).

collapse notesedit notes
#
bytestring (Hackage)
other
move item up move item down edit item info delete item
Summary edit summary

Arrays of bytes. (You could use Vector Word8 instead, but most applications don't.)

Useful whenever you need, well, arrays of bytes. Can be faster than Vector (at least it was true in 2011).

Summary quit editing summary
Prosedit prosquit editing pros
  • Widely used.
    move trait up move trait down edit trait delete trait
  • Can be faster than Vectors.
    move trait up move trait down edit trait delete trait
  • Can be infinite (as lazy bytestrings).
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Consedit consquit editing cons
  • There's no mutable interface for bytestrings (apart from convering them to CStrings and then using pointers).
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Ecosystemedit ecosystem
Ecosystemquit editing ecosystemor press Ctrl+Enter to savemarkdown supported
Notes
collapse notesedit notes

<notes are empty>

add something!

#
other
move item up move item down edit item info delete item
Summary edit summary

Data.Array.Accelerate defines an embedded array language for computations for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations, such as maps, reductions, and permutations. These computations may then be online compiled and executed on a range of architectures.

A short tutorial on how to implement the Mandelbrot set in Accelerate is available here.

A simple example:

As a simple example, consider the computation of a dot product of two vectors of floating point numbers:

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)

Except for the type, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance - for example, using Data.Array.Accelerate.LLVM.PTX it may be on-the-fly off-loaded to the GPU.

Larger example: LULESH

LULESH-accelerate is in implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is simplified and hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.

The Accelerate implementation remains high-level while maintaining excellent performance compared to the reference C+OpenMP (> 2x faster) and CUDA (same performance) implementations.

LULESH mesh

Summary quit editing summary
Prosedit prosquit editing pros
  • Can generate code for both multicore CPUs as well as GPUs, from the same source program
    move trait up move trait down edit trait delete trait
  • Includes an array fusion and optimisation framework (similar to vector)
    move trait up move trait down edit trait delete trait
  • Generated code for tight numeric loops often much better than that generated by GHC
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Consedit consquit editing cons
  • Requires external dependencies (LLVM, CUDA)
    move trait up move trait down edit trait delete trait
  • Currently not well tested on Windows
    move trait up move trait down edit trait delete trait
  • Implemented as an embedded language, it can be a bit trickier to get the hang of initially, compared to a native library such as vector or repa.
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Ecosystemedit ecosystem
  • accelerate-llvm-native: Backend supporting parallel execution on multicore CPUs.

  • accelerate-llvm-ptx: Backend supporting parallel execution on CUDA-capable NVIDIA GPUs. Requires a GPU with compute capability 2.0 or greater. See this table of supported GPUs.

  • accelerate-examples: Computational kernels and applications showcasing the use of Accelerate as well as a regression test suite, supporting function and performance testing. Examples include:

    • An implementation of the Canny edge detection algorithm
    • An interactive Mandelbrot set generator
    • A particle-based simulation of stable fluid flows
    • An n-body simulation of gravitational attraction between solid particles
    • An implementation of the PageRank algorithm
    • A simple interactive ray tracer
    • A cellular automata simulation
    • A "password recovery" tool, for dictionary lookup of MD5 hashes
  • accelerate-io: Fast conversions between Accelerate arrays and other array formats (including vector and repa).

  • accelerate-fft: Discrete Fourier transforms, with FFI bindings to optimised implementations.

  • accelerate-bignum: Fixed-width large integer arithmetic.

  • accelerate-blas: Linear systems, matrix decompositions, and other numerical computations for use in Accelerate. Most operations are implemented efficiently via FFI calls to BLAS and LAPACK

  • colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL).

  • gloss-accelerate: Generate gloss pictures from Accelerate.

  • gloss-raster-accelerate: Parallel rendering of raster images and animations.

  • lens-accelerate: Lens operators for Accelerate types.

  • linear-accelerate: Linear vector spaces in Accelerate.

  • mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers.

Ecosystemquit editing ecosystemor press Ctrl+Enter to savemarkdown supported
Notes
collapse notesedit notes

Links

Imports

-- The core library
import Data.Array.Accelerate as A

-- To generate parallel CPU code
import Data.Array.Accelerate.LLVM.CPU as CPU

-- To generate parallel code for CUDA GPUs
import Data.Array.Accelerate.LLVM.PTX as PTX
collapse notesedit notes
#
other
move item up move item down edit item info delete item
Summary edit summary

Fast, auto-parallelizable, multidimensional, kinda-complicated arrays. Well-suited for processing images and other matrix-like things.

Summary quit editing summary
Prosedit prosquit editing pros
  • Like vector, does fusion – when you apply several operations to an array, it doesn't necessarily create intermediate arrays.
    move trait up move trait down edit trait delete trait
  • Auto-parallelizable – pass +RTS -N to your program and it'll probably become faster. (You can control the way parallelization happens.)
    move trait up move trait down edit trait delete trait
  • Supports advanced operations with multidimensional arrays (slicing, reshaping, transposition, etc).
    move trait up move trait down edit trait delete trait
  • Overall performance is almost C-like.
    move trait up move trait down edit trait delete trait
  • Elements in the array are saved in a flat structure, even when the array is multidimensional
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Consedit consquit editing cons
  • Harder to use than other libraries. Uses lots of type synonyms, polymorphism, and generally looks complicated.
    move trait up move trait down edit trait delete trait
  • Beside looking complicated, it easily can cause segmentation faults at runtime
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Ecosystemedit ecosystem
Ecosystemquit editing ecosystemor press Ctrl+Enter to savemarkdown supported
#
array (Hackage)
other
move item up move item down edit item info delete item
Summary edit summary

This library is in the Haskell report, but it's used less often than vector. You might want to use it if you're on a new machine and don't have vector installed yet, or if you want multidimensional arrays (or arrays with indexing that doesn't start from 0).

Summary quit editing summary
Prosedit prosquit editing pros
  • Bundled with GHC.
    move trait up move trait down edit trait delete trait
  • Supports multidimensional arrays.
    move trait up move trait down edit trait delete trait
  • Keys do not have to be integers.
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Consedit consquit editing cons
  • The API is pretty minimal.
    move trait up move trait down edit trait delete trait

press Ctrl+Enter or Enter to addmarkdown supportededit off
Ecosystemedit ecosystem
Ecosystemquit editing ecosystemor press Ctrl+Enter to savemarkdown supported
Notes
collapse notesedit notes

Types and modules

First of all, an array index can be of any type belonging to the Ix typeclass – this includes Int, Integer, Bool, Char, and tuples (which enables multidimensional arrays).

There are immutable and mutable arrays. Immutable:

  • Array i e – arrays (with index type i and element type e)
  • UArray – unboxed arrays (faster and take less space, but all elements have to be evaluated)

Mutable:

  • IOArray – arrays in the IO monad
  • IOUArray – unboxed arrays in the IO monad
  • STArray – arrays in the ST monad
  • STUArray – unboxed arrays in the ST monad
  • StorableArray – arrays that are compatible with C

They live in their corresponding modules (Data.Array, Data.Array.Unboxed, Data.Array.IO, Data.Array.ST, Data.Array.Storable).

There are also modules that provide generic interfaces for arrays – Data.Array.MArray for mutable arrays and Data.Array.IArray for immutable arrays. They're reexported by other modules so you don't have to import them implicitly, with one exception: Data.Array doesn't reexport Data.Array.IArray.

Unsafe operations

Data.Array.Base provides some unsafe operations that don't perform any checks (and therefore are faster). The documentation for it is unavailable, but you can look at the source.

Here are the unsafe operations for immutable arrays (a is the type of the array):

-- Unsafe version of (!)
unsafeAt         :: Ix i => a i e -> Int -> e
-- Unsafe version of (//)
unsafeReplace    :: Ix i => a i e -> [(Int, e)] -> a i e

unsafeArray      :: Ix i => (i,i) -> [(Int, e)] -> a i e
unsafeAccum      :: Ix i => (e -> e' -> e) -> a i e -> [(Int, e')] -> a i e
unsafeAccumArray :: Ix i => (e -> e' -> e) -> e -> (i,i) -> [(Int, e')] -> a i e

And here are the ones for mutable arrays:

unsafeRead      :: Ix i => a i e -> Int -> m e
unsafeWrite     :: Ix i => a i e -> Int -> e -> m ()

unsafeNewArray_ :: Ix i => (i,i) -> m (a i e)

Beware: all of these use zero-based indexing. If your array starts with 1, for instance, you still have to use 0 as the index to access the 1st element.

collapse notesedit notes