An array is a fixed-size data structure that supports O(1) access to any element.
The vector package is what you should reach for if you only need single-dimensional arrays; a beginner-friendly introduction can be found here. If you need multidimensional arrays, use array, repa, or accelerate. If you specifically need arrays of bytes, use bytestring.
Notes on performance:
You should almost always use the unboxed version of the data structure unless you need lazily computed content.
ByteString is already unboxed.
Benchmarks of various sequence-like data structures can be found here.
If you want speed, use accelerate. It generates parallel SIMD-vectorised code for multicore CPUs and CUDA GPUs.
For a lighter-weight alternative to accelerate, with still pretty good performance, use repa.
The de facto standard array type used in Haskell. Fast. Used by many libraries.
Note that it's not quite like C++'s vector. (For instance, it doesn't support fast append.)
A tutorial can be found at https://haskell-lang.org/library/vector.
Supports list-like operations and has an extensive API.
Well-optimised; supports stream fusion (when you create an array and then consume it, sometimes it won't actually have to allocate any memory for the array).
vector-algorithms (sorting, binary search), vector-random (generate vectors with random values), hybrid-vectors (mix boxed and unboxed elements), patches-vector (diff 2 vectors)
vector-th-unbox (a helper for defining unboxed vector instances)
vector-instances (orphan instances), cereal-vector, vector-binary-instances
```haskell
-- ordinary vectors
import qualified Data.Vector as V
-- mutable vectors
import qualified Data.Vector.Mutable as VM
-- unboxed vectors
import qualified Data.Vector.Unboxed as U
-- mutable unboxed vectors
import qualified Data.Vector.Unboxed.Mutable as UM
```
Unlike in C++, growing a vector by one element isn't a constant-time operation: snoc has to copy the whole vector, so it's O(n). (In C++, push_back is amortized O(1).)
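As a quick sketch of the API, using the unboxed variant recommended above (function choices here are illustrative, not the only way to do this):

```haskell
import qualified Data.Vector.Unboxed as U

main :: IO ()
main = do
  -- an unboxed vector of the first ten even numbers: [0,2,4,...,18]
  let v = U.generate 10 (* 2) :: U.Vector Int
  -- map and sum fuse: no intermediate vector is allocated
  print (U.sum (U.map (+ 1) v))  -- 100
```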
Arrays of bytes. (You could use Vector Word8 instead, but most applications don't.)
Useful whenever you need, well, arrays of bytes. Can be faster than Vector (at least that was true in 2011).
Parsing: most parsing libraries can parse bytestrings, but attoparsec is the most commonly used, and Data.Attoparsec.Zepto is even faster for some kinds of parsing. There are also binary-parser and scanner (which I haven't tried).
Fast builders: bytestring-tree-builder, fast-builder, buffer-builder
Encoding: base64-bytestring, base32-bytestring, base16-bytestring (hex encoding)
Show: bytestring-show, readable
Streaming: pipes-bytestring, quiver-bytestring
Data structures: rope (for manipulating very long strings efficiently, i.e. the rope data structure), bytestring-trie (trees of bytestrings with common prefixes, i.e. the trie data structure)
Various: bytestring-delta (diffing bytestrings / producing patches and applying them to bytestrings), bytestring-mmap (accessing files/devices as bytestrings, see mmap), stringsearch (splitting/search/replacement), bytestring-plain (compact representation for short bytestrings, claimed to be better than ByteString)
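Since attoparsec is the most common choice for parsing bytestrings, here's a minimal sketch of how it's typically used (the `size` parser is a made-up example):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.ByteString.Char8 (Parser, parseOnly, decimal, char)

-- a parser for dimensions like "1920x1080"
size :: Parser (Int, Int)
size = do
  w <- decimal
  _ <- char 'x'
  h <- decimal
  return (w, h)

main :: IO ()
main = print (parseOnly size "1920x1080")  -- Right (1920,1080)
```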
Data.Array.Accelerate defines an embedded array language for high-performance computing in Haskell. Computations on multi-dimensional, regular arrays are expressed in the form of parameterised collective operations, such as maps, reductions, and permutations. These computations may then be online-compiled and executed on a range of architectures.
A short tutorial on how to implement the Mandelbrot set in Accelerate is available here.
As a simple example, consider the computation of a dot product of two vectors of floating-point numbers:
```haskell
dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = fold (+) 0 (zipWith (*) xs ys)
```
Except for the types, this code is almost the same as the corresponding Haskell code on lists of floats. The types indicate that the computation may be online-compiled for performance; for example, using Data.Array.Accelerate.LLVM.PTX it may be off-loaded to the GPU on the fly.
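Running dotp with the multicore CPU backend might look like this (a sketch assuming the accelerate and accelerate-llvm-native packages; module names may differ between versions):

```haskell
import Data.Array.Accelerate as A
import Data.Array.Accelerate.LLVM.Native as CPU  -- from accelerate-llvm-native

dotp :: Acc (Vector Float) -> Acc (Vector Float) -> Acc (Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  let xs = fromList (Z :. 3) [1, 2, 3] :: Vector Float
      ys = fromList (Z :. 3) [4, 5, 6] :: Vector Float
  -- run compiles the computation and executes it on the CPU;
  -- the resulting scalar contains 1*4 + 2*5 + 3*6 = 32
  print (CPU.run (dotp (use xs) (use ys)))
```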
Larger example: LULESH
LULESH-accelerate is an implementation of the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics (LULESH) mini-app. LULESH represents a typical hydrodynamics code such as ALE3D, but is simplified and hard-coded to solve the Sedov blast problem on an unstructured hexahedron mesh.
The Accelerate implementation remains high-level while maintaining excellent performance compared to the reference C+OpenMP (> 2x faster) and CUDA (same performance) implementations.
Can generate code for both multicore CPUs as well as GPUs, from the same source program
Includes an array fusion and optimisation framework (similar to vector)
Generated code for tight numeric loops often much better than that generated by GHC
accelerate-llvm-native: Backend supporting parallel execution on multicore CPUs.
accelerate-llvm-ptx: Backend supporting parallel execution on CUDA-capable NVIDIA GPUs. Requires a GPU with compute capability 2.0 or greater. See this table of supported GPUs.
accelerate-examples: Computational kernels and applications showcasing the use of Accelerate as well as a regression test suite, supporting function and performance testing. Examples include:
- An implementation of the Canny edge detection algorithm
- An interactive Mandelbrot set generator
- A particle-based simulation of stable fluid flows
- An n-body simulation of gravitational attraction between solid particles
- An implementation of the PageRank algorithm
- A simple interactive ray tracer
- A cellular automata simulation
- A "password recovery" tool, for dictionary lookup of MD5 hashes
accelerate-io: Fast conversions between Accelerate arrays and other array formats (including vector and repa).
accelerate-fft: Discrete Fourier transforms, with FFI bindings to optimised implementations.
accelerate-bignum: Fixed-width large integer arithmetic.
accelerate-blas: Linear systems, matrix decompositions, and other numerical computations for use in Accelerate. Most operations are implemented efficiently via FFI calls to BLAS and LAPACK.
colour-accelerate: Colour representations in Accelerate (RGB, sRGB, HSV, and HSL).
gloss-accelerate: Generate gloss pictures from Accelerate.
gloss-raster-accelerate: Parallel rendering of raster images and animations.
lens-accelerate: Lens operators for Accelerate types.
linear-accelerate: Linear vector spaces in Accelerate.
mwc-random-accelerate: Generate Accelerate arrays filled with high quality pseudorandom numbers.
- An introduction to Accelerate from Parallel and Concurrent Programming in Haskell
- Tutorial for generating a Mandelbrot fractal
- accelerate-examples – a package with example programs
```haskell
-- The core library
import Data.Array.Accelerate as A
-- To generate parallel CPU code (accelerate-llvm-native)
import Data.Array.Accelerate.LLVM.Native as CPU
-- To generate parallel code for CUDA GPUs (accelerate-llvm-ptx)
import Data.Array.Accelerate.LLVM.PTX as PTX
```
Fast, auto-parallelizable, multidimensional, kinda-complicated arrays. Well-suited for processing images and other matrix-like things.
Like vector, does fusion – when you apply several operations to an array, it doesn't necessarily create intermediate arrays.
Auto-parallelizable – pass +RTS -N to your program and it'll probably become faster. (You can control the way parallelization happens.)
Supports advanced operations with multidimensional arrays (slicing, reshaping, transposition, etc).
Overall performance is almost C-like.
Elements in the array are stored in a flat structure, even when the array is multidimensional.
Algorithms: repa-algorithms (DFT, FFT, randomness, and some other algorithms), repa-linear-algebra, hmatrix-repa, repa-fftw (bindings for fftw to do FFT/DFT)
Working with files: repa-io (read/write arrays as text, binary, or BMP), repa-devil (process images with repa), repa-sndfile (read/write sound files), JuicyPixels-repa (images)
- an introduction to Repa from Parallel and Concurrent Programming in Haskell
- repa-examples – a package with examples
import qualified Data.Array.Repa as R
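A minimal sketch of typical repa use: build an unboxed array, apply a delayed map, and force it (computeP evaluates in parallel when the program is run with +RTS -N):

```haskell
import qualified Data.Array.Repa as R
import Data.Array.Repa (Z (..), (:.) (..))

main :: IO ()
main = do
  -- a one-dimensional unboxed array of ten elements
  let arr = R.fromListUnboxed (Z :. 10) [0 .. 9] :: R.Array R.U R.DIM1 Int
  -- R.map is delayed; computeP forces the array in parallel
  doubled <- R.computeP (R.map (* 2) arr) :: IO (R.Array R.U R.DIM1 Int)
  print (R.sumAllS doubled)  -- 90
```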
This library is in the Haskell report, but it's used less often than vector. You might want to use it if you're on a new machine and don't have vector installed yet, or if you want multidimensional arrays (or arrays with indexing that doesn't start from 0).
Types and modules
First of all, an array index can be of any type belonging to the Ix typeclass – this includes Int and Char, as well as tuples (which enable multidimensional arrays).
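For example, a pair-indexed array gives you a two-dimensional matrix, and the indices don't have to start from 0:

```haskell
import Data.Array (Array, listArray, (!), bounds)

-- a 3x3 matrix indexed by pairs, with indices starting from (1,1)
matrix :: Array (Int, Int) Int
matrix = listArray ((1, 1), (3, 3)) [1 .. 9]

main :: IO ()
main = do
  print (matrix ! (2, 3))  -- row 2, column 3: 6
  print (bounds matrix)    -- ((1,1),(3,3))
```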
There are immutable and mutable arrays.

Immutable:

- Array i e – arrays (with index type i and element type e)
- UArray i e – unboxed arrays (faster and take less space, but all elements have to be evaluated)

Mutable:

- IOArray – arrays in the IO monad
- IOUArray – unboxed arrays in the IO monad
- STArray – arrays in the ST monad
- STUArray – unboxed arrays in the ST monad
- StorableArray – arrays that are compatible with C
They live in their corresponding modules (Data.Array, Data.Array.Unboxed, Data.Array.IO, Data.Array.ST, and Data.Array.Storable).
There are also modules that provide generic interfaces for arrays – Data.Array.MArray for mutable arrays and Data.Array.IArray for immutable arrays. They're reexported by other modules so you don't have to import them explicitly, with one exception: Data.Array doesn't reexport Data.Array.IArray.
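As a sketch of the mutable API: build an unboxed STUArray in the ST monad and freeze it into an immutable UArray (the `squares` example is made up for illustration):

```haskell
import Data.Array.ST (runSTUArray, newArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- fill a mutable unboxed array with squares, then freeze it
squares :: UArray Int Int
squares = runSTUArray $ do
  arr <- newArray (1, 5) 0
  mapM_ (\i -> writeArray arr i (i * i)) [1 .. 5]
  return arr

main :: IO ()
main = print (elems squares)  -- [1,4,9,16,25]
```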
Data.Array.Base provides some unsafe operations that don't perform any checks (and therefore are faster). The documentation for it is unavailable, but you can look at the source.
Here are the unsafe operations for immutable arrays (a is the type of the array):
```haskell
-- Unsafe version of (!)
unsafeAt :: Ix i => a i e -> Int -> e

-- Unsafe version of (//)
unsafeReplace :: Ix i => a i e -> [(Int, e)] -> a i e

unsafeArray :: Ix i => (i, i) -> [(Int, e)] -> a i e
unsafeAccum :: Ix i => (e -> e' -> e) -> a i e -> [(Int, e')] -> a i e
unsafeAccumArray :: Ix i => (e -> e' -> e) -> e -> (i, i) -> [(Int, e')] -> a i e
```
And here are the ones for mutable arrays:
```haskell
unsafeRead :: Ix i => a i e -> Int -> m e
unsafeWrite :: Ix i => a i e -> Int -> e -> m ()
unsafeNewArray_ :: Ix i => (i, i) -> m (a i e)
```
Beware: all of these use zero-based indexing. If your array's indices start at 1, for instance, you still have to use 0 to access the first element.
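For instance, a small sketch using unsafeAt from Data.Array.Base:

```haskell
import Data.Array (listArray)
import Data.Array.Base (unsafeAt)

main :: IO ()
main = do
  -- an array indexed from 1 to 3
  let arr = listArray (1, 3) "abc"
  -- unsafeAt takes a zero-based position, regardless of the bounds,
  -- and performs no bounds checking
  print (unsafeAt arr 0)  -- 'a', i.e. the element at index 1
```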