# Data wire formats

This section describes the data wire format of standard Gel types.

## Sets and array<>

Set and array values are represented as the following structure:

```c
struct SetOrArrayValue {
    // Number of dimensions, currently must
    // always be 0 or 1. 0 indicates an empty set or array.
    int32       ndims;

    // Reserved.
    int32       reserved0;

    // Reserved.
    int32       reserved1;

    // Dimension data.
    Dimension   dimensions[ndims];

    // Element data, the number of elements
    // in this array is the sum of dimension sizes:
    // sum((d.upper - d.lower + 1) for d in dimensions)
    Element     elements[];
};

struct Dimension {
    // Upper dimension bound, inclusive,
    // number of elements in the dimension
    // relative to the lower bound.
    int32       upper;

    // Lower dimension bound, always 1.
    int32       lower;
};

struct Element {
    // Encoded element data length in bytes.
    int32       length;

    // Element data.
    uint8       data[length];
};
```

Note: zero-length arrays (and sets) are represented as a 12-byte value where `ndims` equals zero, regardless of the shape in the type descriptor.
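As a minimal sketch of the layout above, the following Python helpers write and read a one-dimensional value. The names `encode_array` and `decode_array` are hypothetical, element payloads are treated as opaque bytes, and the header fields are assumed to be big-endian like the scalar integer encodings described later in this document.

```python
import struct

def encode_array(elements: list) -> bytes:
    """Encode already-serialized element payloads as a SetOrArrayValue."""
    if not elements:
        # Empty set or array: ndims = 0, two reserved fields, nothing else.
        return struct.pack(">iii", 0, 0, 0)
    out = [
        struct.pack(">iii", 1, 0, 0),          # ndims, reserved0, reserved1
        struct.pack(">ii", len(elements), 1),  # Dimension: upper, lower
    ]
    for data in elements:
        out.append(struct.pack(">i", len(data)) + data)  # Element
    return b"".join(out)

def decode_array(buf: bytes) -> list:
    """Decode a SetOrArrayValue into a list of raw element payloads."""
    ndims, _reserved0, _reserved1 = struct.unpack_from(">iii", buf, 0)
    if ndims == 0:
        return []
    if ndims != 1:
        raise ValueError("ndims must be 0 or 1")
    upper, lower = struct.unpack_from(">ii", buf, 12)
    count = upper - lower + 1
    offset = 20  # 12-byte header plus one 8-byte Dimension
    elements = []
    for _ in range(count):
        (length,) = struct.unpack_from(">i", buf, offset)
        offset += 4
        elements.append(buf[offset:offset + length])
        offset += length
    return elements
```

Each returned payload still has to be decoded with the codec of the element type.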

Sets of arrays are a special case. Every array within a set is wrapped in an Envelope. The full structure follows:

```c
struct SetOfArrayValue {
    // Number of dimensions, currently must
    // always be 0 or 1. 0 indicates an empty set.
    int32 ndims;

    // Reserved.
    int32 reserved0;

    // Reserved.
    int32 reserved1;

    // Dimension data. Same layout as above.
    Dimension dimensions[ndims];

    // Envelope data, the number of elements
    // in this array is the sum of dimension sizes:
    // sum((d.upper - d.lower + 1) for d in dimensions)
    Envelope elements[];
};

struct Envelope {
    // Encoded envelope element length in bytes.
    int32 length;

    // Number of elements, currently must
    // always be 1.
    int32 nelems;

    // Reserved.
    int32 reserved;

    // Element data. Same layout as above.
    Element element[nelems];
};
```

## tuple<>, namedtuple<>, and object<>

Tuple, namedtuple and object values are represented as the following structure:

```c
struct TupleOrNamedTupleOrObjectValue {
    // Number of elements
    int32       nelems;

    // Element data.
    Element     elements[nelems];
};

struct Element {
    // Reserved.
    int32       reserved;

    // Encoded element data length in bytes.
    int32       length;

    // Element data.
    uint8       data[length];
};
```

Note that for objects, `Element.length` can be set to `-1`, which means an empty set.
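A minimal Python sketch of a reader for this structure follows. The function name `decode_tuple` is hypothetical, fields are returned as raw payloads, and the integers are assumed to be big-endian. A length of `-1` is mapped to `None` to stand for the empty set.

```python
import struct

def decode_tuple(buf: bytes) -> list:
    """Split a tuple, namedtuple, or object value into raw field payloads."""
    (nelems,) = struct.unpack_from(">i", buf, 0)
    offset = 4
    fields = []
    for _ in range(nelems):
        _reserved, length = struct.unpack_from(">ii", buf, offset)
        offset += 8
        if length == -1:
            # Objects only: no data bytes follow, the field is an empty set.
            fields.append(None)
            continue
        fields.append(buf[offset:offset + length])
        offset += length
    return fields
```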

## Sparse Objects

Sparse object values are represented as the following structure:

```c
struct SparseObjectValue {
    // Number of elements
    int32       nelems;

    // Element data.
    Element     elements[nelems];
};

struct Element {
    // Index of the element in the input shape.
    int32       index;

    // Encoded element data length in bytes.
    int32       length;

    // Element data.
    uint8       data[length];
};
```
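The sketch below serializes a sparse object from a mapping of shape index to already-encoded payload. The name `encode_sparse_object` is hypothetical, big-endian integers are assumed, and emitting elements in ascending index order is a choice made for this example, not a requirement stated above.

```python
import struct

def encode_sparse_object(fields: dict) -> bytes:
    """Encode {shape_index: payload_bytes} as a SparseObjectValue."""
    out = [struct.pack(">i", len(fields))]  # nelems
    for index, data in sorted(fields.items()):
        out.append(struct.pack(">ii", index, len(data)) + data)
    return b"".join(out)
```

Only the fields present in the mapping are written, which is what distinguishes this layout from the dense object encoding.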

## Ranges

Range values are represented as the following structure:

```c
struct Range {
    // A bit mask of range definition.
    uint8<RangeFlag> flags;

    // Lower boundary data.
    Boundary         lower;

    // Upper boundary data.
    Boundary         upper;
};

struct Boundary {
    // Encoded boundary data length in bytes.
    int32       length;

    // Boundary data.
    uint8       data[length];
};

enum RangeFlag {
    // Empty range.
    EMPTY   = 0x0001;

    // Included lower boundary.
    LB_INC  = 0x0002;

    // Included upper boundary.
    UB_INC  = 0x0004;

    // Infinity (excluded) lower boundary.
    LB_INF  = 0x0008;

    // Infinity (excluded) upper boundary.
    UB_INF  = 0x0010;
};
```
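To illustrate how the flag bits combine, here is a small Python sketch that interprets the `flags` byte only. The helper `describe_flags` and its interval-notation output are hypothetical; boundary data is not parsed here.

```python
import enum

class RangeFlag(enum.IntFlag):
    EMPTY = 0x01
    LB_INC = 0x02
    UB_INC = 0x04
    LB_INF = 0x08
    UB_INF = 0x10

def describe_flags(flags_byte: int) -> str:
    """Render the shape of a range in interval notation from its flags."""
    flags = RangeFlag(flags_byte)
    if RangeFlag.EMPTY in flags:
        return "empty"
    left = "[" if RangeFlag.LB_INC in flags else "("
    right = "]" if RangeFlag.UB_INC in flags else ")"
    lower = "-inf" if RangeFlag.LB_INF in flags else "lower"
    upper = "inf" if RangeFlag.UB_INF in flags else "upper"
    return f"{left}{lower}, {upper}{right}"
```

For instance, a flags byte of `0x02` describes a range that includes its lower boundary and excludes its upper boundary.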

## std::uuid

[`std::uuid`](https://docs.geldata.com/reference/stdlib/uuid.md#type::std::uuid) values are represented as a sequence of 16 unsigned byte values.

For example, the UUID value `b9545c35-1fe7-485f-a6ea-f8ead251abd3` is represented as:

```c
0xb9 0x54 0x5c 0x35 0x1f 0xe7 0x48 0x5f
0xa6 0xea 0xf8 0xea 0xd2 0x51 0xab 0xd3
```
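In Python, the standard `uuid` module exposes the same 16 bytes through the `bytes` attribute, so a minimal round trip could look like this (variable names are illustrative):

```python
import uuid

value = uuid.UUID("b9545c35-1fe7-485f-a6ea-f8ead251abd3")
wire = value.bytes                 # 16 bytes, in the order shown above
decoded = uuid.UUID(bytes=wire)    # back to a UUID object
```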

## std::str

[`std::str`](https://docs.geldata.com/reference/stdlib/string.md#type::std::str) values are represented as a UTF-8 encoded byte string. For example, the `str` value `'Hello! 🙂'` is encoded as:

```c
0x48 0x65 0x6c 0x6c 0x6f 0x21 0x20 0xf0 0x9f 0x99 0x82
```
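A short Python sketch of the same example follows. It also shows that the encoded length is counted in bytes, not in characters: the emoji occupies four bytes. Variable names are illustrative.

```python
text = "Hello! 🙂"
wire = text.encode("utf-8")     # 11 bytes for 8 characters
decoded = wire.decode("utf-8")
```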

## std::bytes

[`std::bytes`](https://docs.geldata.com/reference/stdlib/bytes.md#type::std::bytes) values are represented as is: the raw bytes are sent without any transformation.

## std::int16

[`std::int16`](https://docs.geldata.com/reference/stdlib/numbers.md#type::std::int16) values are represented as two bytes, most significant byte first.

For example, the `int16` value `6556` is represented as:

```c
0x19 0x9c
```

## std::int32

[`std::int32`](https://docs.geldata.com/reference/stdlib/numbers.md#type::std::int32) values are represented as four bytes, most significant byte first.

For example, the `int32` value `655665` is represented as:

```c
0x00 0x0a 0x01 0x31
```

## std::int64

[`std::int64`](https://docs.geldata.com/reference/stdlib/numbers.md#type::std::int64) values are represented as eight bytes, most significant byte first.

For example, the `int64` value `123456789987654321` is represented as:

```c
0x01 0xb6 0x9b 0x4b 0xe0 0x52 0xfa 0xb1
```
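The fixed-width integer types differ only in width. As a sketch, Python's standard `struct` module produces these encodings with the big-endian format codes `>h`, `>i`, and `>q`; the values below are the `int16`, `int32`, and `int64` examples from this document, and the variable names are illustrative.

```python
import struct

int16_wire = struct.pack(">h", 6556)
int32_wire = struct.pack(">i", 655665)
int64_wire = struct.pack(">q", 123456789987654321)

# Decoding uses the same format code.
(int64_value,) = struct.unpack(">q", int64_wire)
```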

## std::float32

[`std::float32`](https://docs.geldata.com/reference/stdlib/numbers.md#type::std::float32) values are represented as an IEEE 754-2008 binary 32-bit value, most significant byte first.

For example, the `float32` value `-15.625` is represented as:

```c
0xc1 0x7a 0x00 0x00
```

## std::float64

[`std::float64`](https://docs.geldata.com/reference/stdlib/numbers.md#type::std::float64) values are represented as an IEEE 754-2008 binary 64-bit value, most significant byte first.

For example, the `float64` value `-15.625` is represented as:

```c
0xc0 0x2f 0x40 0x00 0x00 0x00 0x00 0x00
```
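Both floating-point examples can be reproduced with Python's `struct` module, using `>f` for the 32-bit form and `>d` for the 64-bit form. This is a sketch with illustrative variable names; `-15.625` is exactly representable in both widths, so the round trip is lossless.

```python
import struct

float32_wire = struct.pack(">f", -15.625)
float64_wire = struct.pack(">d", -15.625)

(float32_value,) = struct.unpack(">f", float32_wire)
```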

## std::decimal

[`std::decimal`](https://docs.geldata.com/reference/stdlib/numbers.md#type::std::decimal) values are represented as the following structure:

```c
struct Decimal {
    // Number of digits in digits[], can be 0.
    uint16               ndigits;

    // Weight of first digit.
    int16                weight;

    // Sign of the value
    uint16<DecimalSign>  sign;

    // Value display scale.
    uint16               dscale;

    // base-10000 digits.
    uint16               digits[ndigits];
};

enum DecimalSign {
    // Positive value.
    POS     = 0x0000;

    // Negative value.
    NEG     = 0x4000;
};
```

Decimal values are represented as a sequence of base-10000 *digits*. The first digit is assumed to be multiplied by 10000 raised to the power of *weight*, i.e. there might be up to *weight* + 1 digits before the decimal point. Trailing zeros may be absent. It is possible to have negative weight.

*dscale*, or display scale, is the nominal precision expressed as the number of base-10 digits after the decimal point. It is always non-negative. *dscale* may be more than the number of physically present fractional digits, implying significant trailing zeroes. The fractional digits physically present in the *digits* array are padded with trailing zeros to the next multiple of four decimal digits, so the integer and fractional parts always occupy distinct base-10000 digits.

For example, the decimal value `-15000.6250000` is represented as:

```c
// ndigits
0x00 0x04

// weight
0x00 0x01

// sign
0x40 0x00

// dscale
0x00 0x07

// digits
0x00 0x01 0x13 0x88 0x18 0x6a 0x00 0x00
```
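The following Python sketch decodes this structure into a `decimal.Decimal`. The function name `decode_decimal` is hypothetical, and only the two sign values listed above are handled. Digit `i` is weighted by 10000 raised to the power `weight - i`; the code accumulates all digits into one integer and then rescales it to exactly `dscale` fractional digits, using integer arithmetic so that no precision is lost.

```python
import struct
from decimal import Decimal

def decode_decimal(buf: bytes) -> Decimal:
    """Decode a Decimal wire value, preserving its display scale."""
    ndigits, weight, sign, dscale = struct.unpack_from(">HhHH", buf, 0)
    digits = struct.unpack_from(f">{ndigits}H", buf, 8)

    # Concatenate the base-10000 digits into a single integer.
    unscaled = 0
    for d in digits:
        unscaled = unscaled * 10000 + d
    # Power of ten that the last digit on the wire is multiplied by.
    exponent = 4 * (weight + 1 - ndigits)

    # Rescale so that the integer carries exactly dscale fractional digits.
    shift = exponent + dscale
    if shift >= 0:
        unscaled *= 10 ** shift
    else:
        unscaled //= 10 ** (-shift)  # drops padding zeros beyond dscale

    sign_bit = 1 if sign == 0x4000 else 0
    coefficient = tuple(int(c) for c in str(unscaled))
    return Decimal((sign_bit, coefficient, -dscale))
```

Applied to the example bytes above, the sketch yields a value whose string form keeps all seven fractional digits.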

## std::bool

[`std::bool`](https://docs.geldata.com/reference/stdlib/bool.md#type::std::bool) values are represented as an int8 with only two valid values: `0x01` for `true` and `0x00` for `false`.

## std::datetime

[`std::datetime`](https://docs.geldata.com/reference/stdlib/datetime.md#type::std::datetime) values are represented as a 64-bit integer, most significant byte first. The value is the number of *microseconds* between the encoded datetime and January 1st 2000, 00:00 UTC. A Unix timestamp can be converted into a Gel `datetime` value using this formula:

```c
edb_datetime = (unix_ts - 946684800) * 1000000
```

For example, the `datetime` value `'2019-05-06T12:00+00:00'` is encoded as:

```c
0x00 0x02 0x2b 0x35 0x9b 0xc4 0x10 0x00
```

See the [client libraries](https://docs.geldata.com/reference/using/datetime.md#ref-bindings-datetime) section for more info about how to handle different precision when encoding data.
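A minimal Python sketch of the conversion follows, assuming a timezone-aware `datetime` as input. The helper names `unix_to_wire` and `encode_datetime` and the constant `UNIX_TS_OF_2000` are hypothetical. The offset 946684800 is the Unix timestamp of January 1st 2000, 00:00 UTC; it is subtracted so that the result counts microseconds from that date.

```python
import struct
from datetime import datetime, timezone

# Unix timestamp of 2000-01-01T00:00:00Z, the epoch used on the wire.
UNIX_TS_OF_2000 = 946684800

def unix_to_wire(unix_ts: int) -> bytes:
    """Encode a whole-second Unix timestamp as a datetime wire value."""
    return struct.pack(">q", (unix_ts - UNIX_TS_OF_2000) * 1_000_000)

def encode_datetime(dt: datetime) -> bytes:
    """Encode a timezone-aware datetime using integer arithmetic only."""
    delta = dt - datetime(2000, 1, 1, tzinfo=timezone.utc)
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    return struct.pack(">q", micros)

noon = datetime(2019, 5, 6, 12, 0, tzinfo=timezone.utc)
wire = encode_datetime(noon)
```

Working on the `timedelta` components instead of a floating-point timestamp avoids rounding errors at microsecond precision.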

## cal::local_datetime

[`cal::local_datetime`](https://docs.geldata.com/reference/stdlib/datetime.md#type::cal::local_datetime) values are represented as a 64-bit integer, most significant byte first. The value is the number of *microseconds* between the encoded datetime and January 1st 2000, 00:00.

For example, the `local_datetime` value `'2019-05-06T12:00'` is encoded as:

```c
0x00 0x02 0x2b 0x35 0x9b 0xc4 0x10 0x00
```

See the [client libraries](https://docs.geldata.com/reference/using/datetime.md#ref-bindings-datetime) section for more info about how to handle different precision when encoding data.

## cal::local_date

[`cal::local_date`](https://docs.geldata.com/reference/stdlib/datetime.md#type::cal::local_date) values are represented as a 32-bit integer, most significant byte first. The value is the number of *days* between the encoded date and January 1st 2000.

For example, the `local_date` value `'2019-05-06'` is encoded as:

```c
0x00 0x00 0x1b 0x99
```
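As a sketch, the day count can be computed in Python by subtracting two `date` objects. The helper names `encode_local_date` and `decode_local_date` are hypothetical.

```python
import struct
from datetime import date, timedelta

EPOCH = date(2000, 1, 1)

def encode_local_date(d: date) -> bytes:
    """Encode a date as a big-endian int32 day offset from 2000-01-01."""
    return struct.pack(">i", (d - EPOCH).days)

def decode_local_date(buf: bytes) -> date:
    (days,) = struct.unpack(">i", buf)
    return EPOCH + timedelta(days=days)
```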

## cal::local_time

[`cal::local_time`](https://docs.geldata.com/reference/stdlib/datetime.md#type::cal::local_time) values are represented as a 64-bit integer, most significant byte first. The value is the number of *microseconds* since midnight.

For example, the `local_time` value `'12:10'` is encoded as:

```c
0x00 0x00 0x00 0x0a 0x32 0xae 0xf6 0x00
```

See the [client libraries](https://docs.geldata.com/reference/using/datetime.md#ref-bindings-datetime) section for more info about how to handle different precision when encoding data.
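A minimal Python sketch that computes the microsecond count from a `datetime.time` value; the helper name `encode_local_time` is hypothetical.

```python
import struct
from datetime import time

def encode_local_time(t: time) -> bytes:
    """Encode a time of day as big-endian int64 microseconds since midnight."""
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return struct.pack(">q", seconds * 1_000_000 + t.microsecond)
```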

## std::duration

[`std::duration`](https://docs.geldata.com/reference/stdlib/datetime.md#type::std::duration) values are represented as the following structure:

```c
struct Duration {
    int64   microseconds;

    // deprecated, is always 0
    int32   days;

    // deprecated, is always 0
    int32   months;
};
```

For example, the `duration` value `'48 hours 45 minutes 7.6 seconds'` is encoded as:

```c
// microseconds
0x00 0x00 0x00 0x28 0xdd 0x11 0x72 0x80

// days
0x00 0x00 0x00 0x00

// months
0x00 0x00 0x00 0x00
```

See the [client libraries](https://docs.geldata.com/reference/using/datetime.md#ref-bindings-datetime) section for more info about how to handle different precision when encoding data.
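The sketch below encodes a Python `timedelta` as a `Duration`. The helper name `encode_duration` is hypothetical. The whole interval is folded into the `microseconds` field, and the two deprecated fields are written as zero.

```python
import struct
from datetime import timedelta

def encode_duration(td: timedelta) -> bytes:
    """Encode a timedelta as int64 microseconds plus two zero int32 fields."""
    micros = (td.days * 86400 + td.seconds) * 1_000_000 + td.microseconds
    return struct.pack(">qii", micros, 0, 0)

wire = encode_duration(timedelta(hours=48, minutes=45, seconds=7, milliseconds=600))
```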

## cal::relative_duration

[`cal::relative_duration`](https://docs.geldata.com/reference/stdlib/datetime.md#type::cal::relative_duration) values are represented as the following structure:

```c
struct RelativeDuration {
    int64   microseconds;
    int32   days;
    int32   months;
};
```

For example, the `cal::relative_duration` value `'2 years 7 months 16 days 48 hours 45 minutes 7.6 seconds'` is encoded as:

```c
// microseconds
0x00 0x00 0x00 0x28 0xdd 0x11 0x72 0x80

// days
0x00 0x00 0x00 0x10

// months
0x00 0x00 0x00 0x1f
```

See the [client libraries](https://docs.geldata.com/reference/using/datetime.md#ref-bindings-datetime) section for more info about how to handle different precision when encoding data.
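Unlike `std::duration`, all three fields carry data here. A minimal Python sketch of a decoder, with the hypothetical name `decode_relative_duration`, shows how the example bytes map back to the components; years are not stored separately and are recovered from the month count.

```python
import struct

def decode_relative_duration(buf: bytes) -> tuple:
    """Return (months, days, microseconds) from a 16-byte wire value."""
    microseconds, days, months = struct.unpack(">qii", buf)
    return months, days, microseconds

wire = bytes.fromhex("00000028dd117280" "00000010" "0000001f")
months, days, microseconds = decode_relative_duration(wire)
years, remaining_months = divmod(months, 12)
```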

## cal::date_duration

[`cal::date_duration`](https://docs.geldata.com/reference/stdlib/datetime.md#type::cal::date_duration) values are represented as the following structure:

```c
struct DateDuration {
    int64   reserved;
    int32   days;
    int32   months;
};
```

For example, the `cal::date_duration` value `'1 years 2 days'` is encoded as:

```c
// reserved
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

// days
0x00 0x00 0x00 0x02

// months
0x00 0x00 0x00 0x0c
```

## std::json

[`std::json`](https://docs.geldata.com/reference/stdlib/json.md#type::std::json) values are represented as the following structure:

```c
struct JSON {
    uint8   format;
    uint8   jsondata[];
};
```

*format* is currently always `1`, and *jsondata* is a UTF-8 encoded JSON string.
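A minimal Python sketch of this framing, using the standard `json` module. The helper names `encode_json` and `decode_json` are hypothetical, and rejecting any format byte other than `1` is a choice made for this example.

```python
import json

def encode_json(value) -> bytes:
    """Prefix the UTF-8 JSON text with the format byte 1."""
    return b"\x01" + json.dumps(value).encode("utf-8")

def decode_json(buf: bytes):
    """Check the format byte and parse the remaining bytes as JSON."""
    if buf[0] != 1:
        raise ValueError(f"unsupported JSON format: {buf[0]}")
    return json.loads(buf[1:].decode("utf-8"))
```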

## std::bigint

[`std::bigint`](https://docs.geldata.com/reference/stdlib/numbers.md#type::std::bigint) values are represented as the following structure:

```c
struct BigInt {
    // Number of digits in digits[], can be 0.
    uint16               ndigits;

    // Weight of first digit.
    int16                weight;

    // Sign of the value
    uint16<BigIntSign>   sign;

    // Reserved value, must be zero
    uint16               reserved;

    // base-10000 digits.
    uint16               digits[ndigits];
};

enum BigIntSign {
    // Positive value.
    POS     = 0x0000;

    // Negative value.
    NEG     = 0x4000;
};
```

BigInt values are represented as a sequence of base-10000 *digits*. The first digit is assumed to be multiplied by 10000 raised to the power of *weight*, i.e. there might be up to *weight* + 1 digits. Trailing zeros may be absent.

For example, the bigint value `-15000` is represented as:

```c
// ndigits
0x00 0x02

// weight
0x00 0x01

// sign
0x40 0x00

// reserved
0x00 0x00

// digits
0x00 0x01 0x13 0x88
```
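As a sketch of the encoding direction, the following Python function splits an integer into base-10000 digits. The name `encode_bigint` is hypothetical. Stripping trailing zero digits relies on the statement above that they may be absent; whether a given peer expects them to be stripped is not specified here.

```python
import struct

def encode_bigint(value: int) -> bytes:
    """Encode a Python int using the BigInt wire layout."""
    sign = 0x4000 if value < 0 else 0x0000
    magnitude = abs(value)

    digits = []
    while magnitude:
        magnitude, digit = divmod(magnitude, 10000)
        digits.append(digit)
    digits.reverse()  # most significant base-10000 digit first

    # The weight is fixed before any trailing zero digits are dropped.
    weight = len(digits) - 1 if digits else 0
    while digits and digits[-1] == 0:
        digits.pop()

    return struct.pack(
        f">HhHH{len(digits)}H", len(digits), weight, sign, 0, *digits
    )
```

For `10000` this produces a single digit `1` with a weight of `1`, since the zero digit that would follow it is dropped.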

## cfg::memory

[`cfg::memory`](https://docs.geldata.com/reference/stdlib/cfg.md#type::cfg::memory) values are represented as a number of *bytes* encoded as a 64-bit integer, most significant byte first.

For example, the `cfg::memory` value `123MiB` is represented as:

```c
0x00 0x00 0x00 0x00 0x07 0xb0 0x00 0x00
```

