## Data structuresBasics

General purpose data structure implementations, including sets, maps, sequences, and more. For fixed size arrays, see the "Arrays" page.

# Recommendations

The containers and unordered-containers packages are the de facto data structure libraries for Haskell. Both of these libraries contain tuned implementations of persistent data structures.

Sequences: `Sequence`

Sets: `Set`

, `HashSet`

, `IntSet`

Maps: `Map`

, `HashMap`

, `IntMap`

*Note: The maps above have 'lazy' and 'strict' implementations, in most cases you'll want the strict version, see the notes in the containers and unordered-containers sections below for more info.*

**Performance:**

The `containers`

and `unordered-containers`

packages both have the Big-O complexity of each operation documented. Benchmark results comparing different implementations can be be found at the links below.

Sequences benchmarks: https://github.com/haskell-perf/sequences

Set benchmarks: https://github.com/haskell-perf/sets

Dictionary benchmarks: https://github.com/haskell-perf/dictionaries

# Specialized data structures

- Heterogeneous lists: HList
- Tries: bytestring-trie
- Graphs: fgl, algebraic-graphs
- Difference lists: dlist
- Sized lists: slist
- Implicit treaps: treap

# Data science and machine learning

If you're doing data science (numerical methods, machine learning, etc.) then see datahaskell.org's "current environment" page for more information on data structures and algorithms suited for this problem space.

# Concurrency

stm-containers is a great library for situations where you need to access a map/set from several threads but avoid doing global locking.

**Summary**

Efficient general-purpose implementations of various immutable ordered container types. This package includes: `Map`

s, `Set`

s, `Tree`

s, `Graph`

s, and `Seq`

s.

*Strict or Lazy map?* If:

- you will eventually need all the values stored, or
- the stored values don't represent large virtual data structures to be lazily computed

then you should use the strict version of each of the container type (`Data.IntMap.Strict`

, `Data.Map.Strict`

).

Tutorial: https://haskell-containers.readthedocs.io.

**Summary**

An efficient finite map from (byte)strings to values.

If all you need to do is associate strings to values you should use either `Map`

from containers or `HashMap`

from unordered-containers. You should only use this if you need the extra capabilities that a trie can provide (structure sharing for highly redundant keys, submap of all keys with a given prefix, extracting minimum and maximum keys, etc).

**Summary**

A list-like type supporting O(1) append, useful for efficient logging and pretty printing.

**Example**

Using DList as the state type when printing a tree with the Writer monad from dlist docs:

```
import Control.Monad.Writer
import Data.DList
data Tree a = Leaf a | Branch (Tree a) (Tree a)
flatten_writer :: Tree x -> DList x
flatten_writer = snd . runWriter . flatten
where
flatten (Leaf x) = tell (singleton x)
flatten (Branch x y) = flatten x >> flatten y
```

**Summary**

This package introduces sized list data type — Slist. The data type has the following shape:

```
data Slist a = Slist
{ sList :: [a]
, sSize :: Size
}
```

As you can see along with the familiar list, it contains Size field that represents the size of the structure. Slists can be finite or infinite, and this is expressed with Size.

```
data Size
= Size Int
| Infinity
```

This representation of the list gives some additional advantages. Getting the length of the list is the "free" operation (runs in O(1)). This property helps to improve the performance for a bunch of functions like take, drop, at, etc. But also it doesn't actually add any overhead on the existing functions.

**Summary**

The `threap`

package implements a tree-like data structure called *implicit treap*. This
data structure implements interface similar to random-access arrays, but with
fast (logarithmic time complexity)
`insert`

/`delete`

/`split`

/`merge`

/`take`

/`drop`

/`rotate`

operations. In addition,
*treap* allows you to specify and measure values of any monoids on a segment,
like a sum of elements or minimal element on some contiguous part of the array.

**When to use?**

Use this package when you want the following operations to be fast:

- Access elements by index.
- Insert elements by index.
- Delete elements by index.
- Calculate monoidal operation (like sum, product, min, etc.) of all elements between two indices.
- Call slicing operations like
`take`

or`drop`

or`split`

.