arquero

Arquero API Reference

Standard Functions
Aggregate Functions
- any, bins
- count, distinct, valid, invalid
- max, min, sum, product
- mean, average, mode, median, quantile
- stdev, stdevp, variance, variancep
- corr, covariance, covariancep
- array_agg, array_agg_distinct, object_agg, map_agg, entries_agg
Window Functions
- row_number, rank, avg_rank, dense_rank
- percent_rank, cume_dist, ntile
- lag, lead, first_value, last_value, nth_value
- fill_down, fill_up

Standard Functions

Standard library of table expression functions. The op object exports these as standard JavaScript functions that behave the same whether invoked inside or outside a table expression context.

Array Functions

# op.compact(array) · Source

Returns a new compacted array with invalid values (null, undefined, NaN) removed.

array: The array to compact.

Examples

op.compact([1, null, 2, undefined, NaN, 3]) // [ 1, 2, 3 ]

# op.concat(…values) · Source

Merges two or more arrays in sequence, returning a new array.

values: The arrays to merge.

# op.join(array[, delimiter]) · Source

Creates and returns a new string by concatenating all of the elements in an array (or an array-like object), separated by commas or a specified delimiter string. If the array has only one item, then that item will be returned without using the delimiter.

array: The input array value.
join: The delimiter string (default ',').

# op.includes(array, value[, index]) · Source

Determines whether an array includes a certain value among its entries, returning true or false as appropriate.

array: The input array value.
value: The value to search for.
index: The integer index to start searching from (default 0).

# op.indexof(sequence, value) · Source

Returns the first index at which a given value can be found in the sequence (array or string), or -1 if it is not present.

sequence: The input array or string value.
value: The value to search for.

# op.lastindexof(sequence, value) · Source

Returns the last index at which a given value can be found in the sequence (array or string), or -1 if it is not present.

sequence: The input array or string value.
value: The value to search for.

# op.length(sequence) · Source

Returns the length of the input sequence (array or string).

sequence: The input array or string value.

# op.pluck(array, property) · Source

Returns a new array in which the given property has been extracted for each element in the input array.

array: The input array value.
property: The property name string to extract. Nested properties are not supported: the input "a.b" will indicates a property with that exact name, not a nested property "b" of the object "a".

# op.slice(sequence[, start, end]) · Source

Returns a copy of a portion of the input sequence (array or string) selected from start to end (end not included) where start and end represent the index of items in the sequence.

sequence: The input array or string value.
start: The starting integer index to copy from (inclusive, default 0).
end: The ending integer index to copy from (exclusive, default sequence.length).

# op.reverse(array) · Source

Returns a new array with the element order reversed: the first array element becomes the last, and the last array element becomes the first. The input array is unchanged.

array: The input array value.

# op.sequence([start,] stop[, step]) · Source

Returns an array containing an arithmetic sequence from the start value to the stop value, in step increments. If step is positive, the last element is the largest start + i * step less than stop; if step is negative, the last element is the smallest start + i * step greater than stop. If the returned array would contain an infinite number of values, an empty range is returned.

start: The starting value of the sequence (default 0).
stop: The stopping value of the sequence. The stop value is exclusive; it is not included in the result.
step: The step increment between sequence values (default 1).

Date Functions

# op.now() · Source

Returns the current time as the number of milliseconds elapsed since January 1, 1970 00:00:00 UTC.

# op.timestamp(date) · Source

Returns the timestamp for a date as the number of milliseconds elapsed since January 1, 1970 00:00:00 UTC.

date: The input Date value.

# op.datetime([year, month, date, hours, minutes, seconds, milliseconds]) · Source

Creates and returns a new Date value. If no arguments are provided, the current date and time are used.

year: The year.
month The (zero-based) month (default 0).
date The date within the month (default 1).
hours: The hour within the day (default 0).
minutes: The minute within the hour (default 0).
seconds: The second within the minute (default 0).
milliseconds: The milliseconds within the second (default 0).

# op.year(date) · Source

Returns the year of the specified date according to local time.

date: The input Date or timestamp value.

# op.quarter(date) · Source

Returns the zero-based quarter of the specified date according to local time.

date: The input Date or timestamp value.

# op.month(date) · Source

Returns the zero-based month of the specified date according to local time. A value of 0 indicates January, 1 indicates February, and so on.

date: The input Date or timestamp value.

# op.week(date[, firstday]) · Source

Returns the week number of the year (0-53) for the specified date according to local time. By default, Sunday is used as the first day of the week. All days in a new year preceding the first Sunday are considered to be in week 0.

date: The input Date or timestamp value.
firstday: Number of first day of the week (default 0 for Sunday, 1 for Monday and so on).

# op.date(date) · Source

Returns the date (day of month) of the specified date according to local time.

date: The input Date or timestamp value.

# op.dayofyear(date) · Source

Returns the day of the year (1-366) of the specified date according to local time.

date: The input Date or timestamp value.

# op.dayofweek(date) · Source

Returns the Sunday-based day of the week (0-6) of the specified date according to local time. A value of 0 indicates Sunday, 1 indicates Monday, and so on.

date: The input Date or timestamp value.

# op.hours(date) · Source

Returns the hour of the day for the specified date according to local time.

date: The input Date or timestamp value.

# op.minutes(date) · Source

Returns the minute of the hour for the specified date according to local time.

date: The input Date or timestamp value.

# op.seconds(date) · Source

Returns the seconds of the minute for the specified date according to local time.

date: The input Date or timestamp value.

# op.milliseconds(date) · Source

Returns the milliseconds of the second for the specified date according to local time.

date: The input Date or timestamp value.

# op.utcdatetime([year, month, date, hours, minutes, seconds, milliseconds]) · Source

Creates and returns a new Date value using Coordinated Universal Time (UTC). If no arguments are provided, the current date and time are used.

year: The year.
month The (zero-based) month (default 0).
date The date within the month (default 1).
hours: The hour within the day (default 0).
minutes: The minute within the hour (default 0).
seconds: The second within the minute (default 0).
milliseconds: The milliseconds within the second (default 0).

# op.utcyear(date) · Source

Returns the year of the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.utcquarter(date) · Source

Returns the zero-based quarter of the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.utcmonth(date) · Source

Returns the zero-based month of the specified date according to Coordinated Universal Time (UTC). A value of 0 indicates January, 1 indicates February, and so on.

date: The input Date or timestamp value.

# op.utcweek(date[, firstday]) · Source

Returns the week number of the year (0-53) for the specified date according to Coordinated Universal Time (UTC). By default, Sunday is used as the first day of the week. All days in a new year preceding the first Sunday are considered to be in week 0.

date: The input Date or timestamp value.
firstday: Number of first day of the week (default 0 for Sunday, 1 for Monday and so on).

# op.utcdate(date) · Source

Returns the date (day of month) of the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.utcdayofyear(date) · Source

Returns the day of the year (1-366) of the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.utcdayofweek(date) · Source

Returns the Sunday-based day of the week (0-6) of the specified date according to Coordinated Universal Time (UTC). A value of 0 indicates Sunday, 1 indicates Monday, and so on.

date: The input Date or timestamp value.

# op.utchours(date) · Source

Returns the hour of the day for the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.utcminutes(date) · Source

Returns the minute of the hour for the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.utcseconds(date) · Source

Returns the seconds of the minute for the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.utcmilliseconds(date) · Source

Returns the milliseconds of the second for the specified date according to Coordinated Universal Time (UTC).

date: The input Date or timestamp value.

# op.format_date(date[, shorten]) · Source

Returns an ISO 8601 formatted string for the given date in local timezone. The resulting string is compatible with parse_date and JavaScript’s built-in Date.parse.

date: The input Date or timestamp value.
shorten: A boolean flag (default false) indicating if the formatted string should be shortened if possible. For example, the local date 2001-01-01 will shorten from "2001-01-01T00:00:00.000" to "2001-01-01T00:00".

# op.format_utcdate(date[, shorten]) · Source

Returns an ISO 8601 formatted string for the given date in Coordinated Universal Time (UTC). The resulting string is compatible with parse_date and JavaScript’s built-in Date.parse.

date: The input Date or timestamp value.
shorten: A boolean flag (default false) indicating if the formatted string should be shortened if possible. For example, the UTC date 2001-01-01 will shorten from "2001-01-01T00:00:00.000Z" to "2001-01-01".

JSON Functions

Functions for parsing and generating strings formatted using JavaScript Object Notation (JSON).

# op.parse_json(value) · Source

Parses a string value in JSON format, constructing the JavaScript value or object described by the string.

value: The input string value.

# op.to_json(value) · Source

Converts a JavaScript object or value to a JSON string.

value: The value to convert to a JSON string.

Math Functions

# op.bin(value, min, max, step[, offset]) · Source

Truncate a value to a bin boundary. Useful for creating equal-width histograms. Values outside the [min, max] range will be mapped to -Infinity (value < min) or +Infinity (value > max).

value: The number value to bin.
min: The minimum bin boundary value.
max: The maximum bin boundary value.
step: The step size between bin boundaries.
offset: Offset in steps (default 0) by which to adjust the returned bin value. An offset of 1 returns the next boundary.

# op.random() · Source

Return a random floating point number between 0 (inclusive) and 1 (exclusive). By default uses Math.random. Use the seed method to instead use a seeded random number generator.

# op.is_nan(value) · Source

Tests if the input value is not a number (NaN); equivalent to Number.isNaN. The method will return true only if the input value is an actual numeric NaN value; it will return false for other types (booleans, strings, etc.).

value: The value to test.

Examples

op.is_nan(NaN) // true
op.is_nan(0/0) // true
op.is_nan(op.sqrt(-1)) // true

op.is_nan('foo') // false
op.is_nan(+'foo') // true, coerce to number first

op.is_nan(true) // false
op.is_nan(+true) // false, booleans coerce to numbers

op.is_nan(undefined) // false
op.is_nan(+undefined) // true, coerce to number first

op.is_nan(null) // false
op.is_nan(+null) // false, null coerces to zero

# op.is_finite(value) · Source

Tests if the input value is finite; equivalent to Number.isFinite.

value: The value to test.

# op.abs(value) · Source

Returns the absolute value of the input value; equivalent to Math.abs.

value: The input number value.

# op.cbrt(value) · Source

Returns the cube root value of the input value; equivalent to Math.cbrt.

value: The input number value.

# op.ceil(value) · Source

Returns the ceiling of the input value, the nearest integer equal to or greater than the input; equivalent to Math.ceil.

value: The input number value.

# op.clz32(value) · Source

Returns the number of leading zero bits in the 32-bit binary representation of a number value; equivalent to Math.clz32.

value: The input number value.

# op.exp(value) · Source

Returns e^value, where e is Euler’s number, the base of the natural logarithm; equivalent to Math.exp.

value: The input number value.

# op.expm1(value) · Source

Returns e^value - 1, where e is Euler’s number, the base of the natural logarithm; equivalent to Math.expm1.

value: The input number value.

# op.floor(value) · Source

Returns the floor of the input value, the nearest integer equal to or less than the input; equivalent to Math.floor.

value: The input number value.

# op.fround(value) · Source

Returns the nearest 32-bit single precision float representation of the input number value; equivalent to Math.fround. Useful for translating between 64-bit Number values and values from a Float32Array.

value: The input number value.

# op.greatest(…values) · Source

Returns the greatest (maximum) value among the input values; equivalent to Math.max. This is not an aggregate function, see op.max to compute a maximum value across multiple rows.

values: Zero or more input values.

# op.least(…values) · Source

Returns the least (minimum) value among the input values; equivalent to Math.min. This is not an aggregate function, see op.min to compute a minimum value across multiple rows.

values: Zero or more input values.

# op.log(value) · Source

Returns the natural logarithm (base e) of a number value; equivalent to Math.log.

value: The input number value.

# op.log10(value) · Source

Returns the base 10 logarithm of a number value; equivalent to Math.log10.

value: The input number value.

# op.log1p(value) · Source

Returns the natural logarithm (base e) of 1 + a number value; equivalent to Math.log1p.

value: The input number value.

# op.log2(value) · Source

Returns the base 2 logarithm of a number value; equivalent to Math.log2.

value: The input number value.

# op.pow(base, exponent) · Source

Returns the base raised to the exponent power, that is, base^exponent; equivalent to Math.pow.

base: The base number value.
exponent: The exponent number value.

# op.round(value) · Source

Returns the value of a number rounded to the nearest integer; ; equivalent to Math.round.

value: The input number value.

# op.sign(value) · Source

Returns either a positive or negative +/- 1, indicating the sign of the input value; equivalent to Math.sign.

value: The input number value.

# op.sqrt(value) · Source

Returns the square root of the input value; equivalent to Math.sqrt.

value: The input number value.

# op.trunc(value) · Source

Returns the integer part of a number by removing any fractional digits; equivalent to Math.trunc.

value: The input number value.

# op.degrees(radians) · Source

Converts the input radians value to degrees.

value: The input number value in radians.

# op.radians(radians) · Source

Converts the input degrees value to radians.

value: The input number value in degrees.

# op.acos(value) · Source

Returns the arc-cosine (in radians) of a number value; equivalent to Math.acos.

value: The input number value.

# op.acosh(value) · Source

Returns the hyperbolic arc-cosine of a number value; equivalent to Math.acosh.

value: The input number value.

# op.asin(value) · Source

Returns the arc-sine (in radians) of a number value; equivalent to Math.asin.

value: The input number value.

# op.asinh(value) · Source

Returns the hyperbolic arc-sine of a number value; equivalent to Math.asinh.

value: The input number value.

# op.atan(value) · Source

Returns the arc-tangent (in radians) of a number value; equivalent to Math.atan.

value: The input number value.

# op.atan2(y, x) · Source

Returns the angle in the plane (in radians) between the positive x-axis and the ray from (0, 0) to the point (x, y); ; equivalent to Math.atan2.

y: The y coordinate of the point.
x: The x coordinate of the point.

# op.atanh(value) · Source

Returns the hyperbolic arc-tangent of a number value; equivalent to Math.atanh.

value: The input number value.

# op.cos(value) · Source

Returns the cosine (in radians) of a number value; equivalent to Math.cos.

value: The input number value.

# op.cosh(value) · Source

Returns the hyperbolic cosine of a number value; equivalent to Math.cosh.

value: The input number value.

# op.sin(value) · Source

Returns the sine (in radians) of a number value; equivalent to Math.sin.

value: The input number value.

# op.sinh(value) · Source

Returns the hyperbolic sine of a number value; equivalent to Math.sinh.

value: The input number value.

# op.tan(value) · Source

Returns the tangent (in radians) of a number value; equivalent to Math.tan.

value: The input number value.

# op.tanh(value) · Source

Returns the hyperbolic tangent of a number value; equivalent to Math.tanh.

value: The input number value.

Object Functions

# op.equal(a, b) · Source

Compare two values for equality, using join semantics in which null !== null. If the inputs are object-valued, a deep equality check of array entries or object key-value pairs is performed. The method is helpful within custom join condition expressions.

a: The first input to compare.
b: The second input to compare.

# op.has(object, key) · Source

Returns a boolean indicating whether the object has the specified key as its own property (as opposed to inheriting it). If the object is a Map or Set instance, the has method will be invoked directly on the object, otherwise Object.hasOwnProperty is used.

object: The object, Map, or Set to test for property membership.
property: The string property name to test for.

# op.keys(object) · Source

Returns an array of a given object’s own enumerable property names. If the object is a Map instance, the keys method will be invoked directly on the object, otherwise Object.keys is used.

object: The input object or Map value.

# op.values(object) · Source

Returns an array of a given object’s own enumerable property values. If the object is a Map or Set instance, the values method will be invoked directly on the object, otherwise Object.values is used.

object: The input object, Map, or Set value.

# op.entries(object) · Source

Returns an array of a given object’s own enumerable string-keyed property [key, value] pairs. If the object is a Map or Set instance, the entries method will be invoked directly on the object, otherwise Object.entries is used.

object: The input object, Map, or Set value.

# op.object(entries) · Source

Returns a new object given an iterable entries argument of [key, value] pairs. This method is Arquero’s version of the standard Object.fromEntries method.

entries: An iterable collection of [key, value] pairs, such as an array of two-element arrays or a Map.

# op.recode(value, map[, fallback]) · Source

Recodes an input value to an alternative value, based on a provided value map object. If a fallback value is specified, it will be returned when the input value is not found in the map; otherwise, the input value is returned unchanged.

value: The value to recode. The value must be safely coercible to a string for lookup against the value map.
map: An object or Map with input values for keys and recoded values for values. If a non-Map object, only the object’s own properties will be considered; inherited properties on the prototype chain are ignored.
fallback: An optional fallback value to use if the input value is not found in the value map. If a fallback is not specified, the input value will be returned unchanged when not found in the map.

Examples

// recode values in a derive statement
table.derive({ val: d => op.recode(d.val, { 'opt:a': 'A', 'opt:b': 'B' }) })

// define value map externally, bind as parameter
const map = { 'opt:a': 'A', 'opt:b': 'B' };
table
  .params({ map })
  .derive({ val: (d, $) => op.recode(d.val, $.map, '?') })

// using a Map object, bind as parameter
const map = new Map().set('opt:a', 'A').set('opt:b', 'B');
table
  .params({ map })
  .derive({ val: (d, $) => op.recode(d.val, $.map, '?') })

# op.row_object([…columns]) · Source

Generate a new object containing the data for the current table row. The new object maps from column name keys to table values for the current row. The optional columns list indicates which columns to include in the object; if unspecified, all columns are included by default.

This method can only be invoked within a single-table expression. Calling this method in a multi-table expression (such as for a join) results in an error. An error will also result if any provided column names are specified using dynamic lookups of table column values.

columns: A list of column names or indices to include in the object.

Examples

aq.table({ a: [1, 3], b: [2, 4] })
  .derive({ row: op.row_object() })
  .get('row', 0); // { a: 1, b: 2 }

// rollup a table into an array of row objects
table.rollup({ rows: d => op.array_agg(op.row_object()) })

String Functions

# op.parse_date(value) · Source

Parses a string value and returns a Date instance. Beware: this method uses JavaScript’s Date.parse() functionality, which is inconsistently implemented across browsers. That said, ISO 8601 formatted strings such as those produced by format_date and format_utcdate should be supported across platforms. Note that “bare” ISO date strings such as "2001-01-01" are interpreted by JavaScript as indicating midnight of that day in Coordinated Universal Time (UTC), not local time. To indicate the local timezone, an ISO string can include additional time components and no Z suffix: "2001-01-01T00:00".

value: The input value.

# op.parse_float(value) · Source

Parses a string value and returns a floating point number.

value: The input value.

# op.parse_int(value[, radix]) · Source

Parses a string value and returns an integer of the specified radix (the base in mathematical numeral systems).

value: The input value.
radix: An integer between 2 and 36 that represents the radix (the base in mathematical numeral systems) of the string. Be careful: this does not default to 10! If radix is undefined, 0, or unspecified, JavaScript assumes the following: If the input string begins with "0x" or "0X" (a zero, followed by lowercase or uppercase X), the radix is assumed to be 16 and the rest of the string is parsed as a hexidecimal number. If the input string begins with "0" (a zero), the radix is assumed to be 8 (octal) or 10 (decimal). Exactly which radix is chosen is implementation-dependent. If the input string begins with any other value, the radix is 10 (decimal).

# op.endswith(value, search[, length]) · Source

Determines whether a string value ends with the characters of a specified search string, returning true or false as appropriate.

value: The input string value.
search: The search string to test for.
length: If provided, used as the length of value (default value.length).

# op.match(value, regexp[, index]) · Source

Retrieves the result of matching a string value against a regular expression regexp. If no index is specified, returns an array whose contents depend on the presence or absence of the regular expression global (g) flag, or null if no matches are found. If the g flag is used, all results matching the complete regular expression will be returned, but capturing groups will not. If the g flag is not used, only the first complete match and its related capturing groups are returned.

If specified, the index looks up a value of the resulting match. If index is a number, the corresponding index of the result array is returned. If index is a string, the value of the corresponding named capture group is returned, or null if there is no such group.

value: The input string value.
regexp: The regular expression to match against.

Examples

// returns ['1', '2', '3']
op.match('1 2 3', /\d+/g)

// returns '2' (index into match array)
op.match('1 2 3', /\d+/g, 1)

// returns '3' (index of capture group)
op.match('1 2 3', /\d+ \d+ (\d+)/, 1)

// returns '2' (named capture group)
op.match('1 2 3', /\d+ (?<digit>\d+)/, 'digit')

# op.normalize(value[, form]) · Source

Returns the Unicode normalization form of the string value.

value: The input string to normalize.
form: The Unicode normalization form, one of 'NFC' (default, canonical decomposition, followed by canonical composition), 'NFD' (canonical decomposition), 'NFKC' (compatibility decomposition, followed by canonical composition), or 'NFKD' (compatibility decomposition).

# op.padend(value, length[, fill]) · Source

Pad a string value with a given fill string (applied from the end of value and repeated, if needed) so that the resulting string reaches a given length.

value: The input string to pad.
length: The length of the resulting string once the value string has been padded. If the length is lower than value.length, the value string will be returned as-is.
fill: The string to pad the value string with (default ''). If fill is too long to stay within the target length, it will be truncated: for left-to-right languages the left-most part and for right-to-left languages the right-most will be applied.

# op.padstart(value, length[, fill]) · Source

Pad a string value with a given fill string (applied from the start of value and repeated, if needed) so that the resulting string reaches a given length.

value: The input string to pad.
length: The length of the resulting string once the value string has been padded. If the length is lower than value.length, the value string will be returned as-is.
fill: The string to pad the value string with (default ''). If fill is too long to stay within the target length, it will be truncated: for left-to-right languages the left-most part and for right-to-left languages the right-most will be applied.

# op.lower(value) · Source

Returns the string value converted to lower case.

value: The input string value.

# op.upper(value) · Source

Returns the string value converted to upper case.

value: The input string value.

# op.repeat(value, number) · Source

Returns a new string which contains the specified number of copies of the value string concatenated together.

value: The input string to repeat.
number: An integer between 0 and +Infinity, indicating the number of times to repeat the string.

# op.replace(value, pattern, replacement) · Source

Returns a new string with some or all matches of a pattern replaced by a replacement. The pattern can be a string or a regular expression, and the replacement must be a string. If pattern is a string, only the first occurrence will be replaced; to make multiple replacements, use a regular expression pattern with a g (global) flag.

value: The input string value.
pattern: The pattern string or regular expression to replace.
replacement: The replacement string to use.

# op.split(value, separator[, limit]) · Source

Divides a string value into an ordered list of substrings based on a separator pattern, puts these substrings into an array, and returns the array.

value: The input string value.
separator: A string or regular expression pattern describing where each split should occur.
limit: An integer specifying a limit on the number of substrings to be included in the array.

# op.startswith(value, search[, position]) · Source

Determines whether a string value starts with the characters of a specified search string, returning true or false as appropriate.

value: The input string value.
search: The search string to test for.
position: The position in the value string at which to begin searching (default 0).

# op.substring(value[, start, end]) · Source

Returns the part of the string value between the start and end indexes, or to the end of the string.

value: The input string value.
start: The index of the first character to include in the returned substring (default 0).
end: The index of the first character to exclude from the returned substring (default value.length).

# op.trim(value) · Source

Returns a new string with whitespace removed from both ends of the input value string. Whitespace in this context is all the whitespace characters (space, tab, no-break space, etc.) and all the line terminator characters (LF, CR, etc.).

value: The input string value to trim.

Aggregate Functions

Aggregate table expression functions for summarizing values. If invoked outside a table expression context, column (field) inputs must be column name strings, and the operator will return a corresponding table expression.

# op.any(field) · Source

Aggregate function returning an arbitrary observed value (typically the first encountered).

field: The data column or derived field.

# op.bins(field[, maxbins, nice, minstep, step]) · Source

Aggregate function for calculating a binning scheme in terms of the minimum bin boundary, maximum bin boundary, and step size.

field: The data column or derived field.
maxbins: The maximum number of allowed bins (default 15).
nice: Boolean flag (default true) indicating if the bin min and max should snap to “nice” human-friendly values such as multiples of 10.
minstep: The minimum allowed step size between bins.
step: The exact step size to use between bins. If specified, the maxbins and minstep arguments are ignored.

# op.count() · Source

Aggregate function to count the number of records (rows).

# op.distinct(field) · Source

Aggregate function to count the number of distinct values.

field: The data column or derived field.

# op.valid(field) · Source

Aggregate function to count the number of valid values. Invalid values are null, undefined, or NaN.

field: The data column or derived field.

# op.invalid(field) · Source

Aggregate function to count the number of invalid values. Invalid values are null, undefined, or NaN.

field: The data column or derived field.

# op.max(field) · Source

Aggregate function for the maximum value. For a non-aggregate version, see op.greatest.

field: The data column or derived field.

# op.min(field) · Source

Aggregate function for the minimum value. For a non-aggregate version, see op.least.

field: The data column or derived field.

# op.sum(field) · Source

Aggregate function to sum values.

field: The data column or derived field.

# op.product(field) · Source

Aggregate function to multiply values.

field: The data column or derived field.

# op.mean(field) · Source

Aggregate function for the mean (average) value. This operator is a synonym for average.

field: The data column or derived field.

# op.average(field) · Source

Aggregate function for the average (mean) value. This operator is a synonym for mean.

field: The data column or derived field.

# op.mode(field) · Source

Aggregate function to determine the mode (most frequent) value.

field: The data column or derived field.

# op.median(field) · Source

Aggregate function for the median value. This operation is a shorthand for the quantile value at p = 0.5.

field: The data column or derived field.

# op.quantile(field, p) · Source

Aggregate function to compute the quantile boundary of a data field for a probability threshold. The median is the value of quantile at p = 0.5.

field: The data column or derived field.
p: The probability threshold.

# op.stdev(field) · Source

Aggregate function for the sample standard deviation.

field: The data column or derived field.

# op.stdevp(field) · Source

Aggregate function for the population standard deviation.

field: The data column or derived field.

# op.variance(field) · Source

Aggregate function for the sample variance.

field: The data column or derived field.

# op.variancep(field) · Source

Aggregate function for the population variance.

# op.corr(field1, field2) · Source

Aggregate function for the product-moment correlation between two variables. To instead compute a rank correlation, compute the average ranks for each variable and then apply this function to the result.

field1: The first data column or derived field.
field2: The second data column or derived field.

# op.covariance(field1, field2) · Source

Aggregate function for the sample covariance between two variables.

field1: The first data column or derived field.
field2: The second data column or derived field.

# op.covariancep(field1, field2) · Source

Aggregate function for the population covariance between two variables.

field1: The first data column or derived field.
field2: The second data column or derived field.

# op.array_agg(field) · Source

Aggregate function to collect an array of field values. The resulting aggregate is an array (one per group) containing all observed values. The order of values is sensitive to any orderby criteria.

field: The data column or derived field.

Examples

aq.table({ v: [1, 2, 3, 1] })
  .rollup({ a: op.array_agg('v') }) // a: [ [1, 2, 3, 1] ]

# op.array_agg_distinct(field) · Source

Aggregate function to collect an array of distinct (unique) field values. The resulting aggregate is an array (one per group) containing all unique values. The order of values is sensitive to any orderby criteria.

field: The data column or derived field.

Examples

aq.table({ v: [1, 2, 3, 1] })
  .rollup({ a: op.array_agg_distinct('v') }) // a: [ [1, 2, 3] ]

# op.object_agg(key, value) · Source

Aggregate function to create an object given input key and value fields. The resulting aggregate is an object (one per group) with keys and values defined by the input fields. For any keys that occur multiple times in a group, the most recently observed value is used. The order in which keys and values are observed is sensitive to any orderby criteria.

key: The object key field, should be a string or string-coercible value.
value The object value field.

Examples

aq.table({ k: ['a', 'b', 'a'], v: [1, 2, 3] })
  .rollup({ o: op.object_agg('k', 'v') }) // o: [ { a: 3, b: 2 } ]

# op.map_agg(key, value) · Source

Aggregate function to create a Map given input key and value fields. The resulting aggregate is a Map (one per group) with keys and values defined by the input fields. For any keys that occur multiple times in a group, the most recently observed value is used. The order in which keys and values are observed is sensitive to any orderby criteria.

key: The key field.
value The value field.

Examples

aq.table({ k: ['a', 'b', 'a'], v: [1, 2, 3] })
  .rollup({ m: op.map_agg('k', 'v') }) // m: [ new Map([['a', 3], ['b', 2]]) ]

# op.entries_agg(key, value) · Source

Aggregate function to create an array in the style of Object.entries given input key and value fields. The resulting aggregate is an array (one per group) with [key, value] arrays defined by the input fields, and may include duplicate keys. The order of entries is sensitive to any orderby criteria.

key: The key field.
value The value field.

Examples

aq.table({ k: ['a', 'b', 'a'], v: [1, 2, 3] })
  .rollup({ e: op.entries_agg('k', 'v') }) // e: [ [['a', 1], ['b', 2], ['a', 3]] ]

Window Functions

Window table expression functions applicable over ordered table rows. If invoked outside a table expression context, column (field) inputs must be column name strings, and the operator will return a corresponding table expression.

# op.row_number() · Source

Window function to assign consecutive row numbers, starting from 1.

# op.rank() · Source

Window function to assign a rank to each value in a group, starting from 1. Peer values are assigned the same rank. Subsequent ranks reflect the number of prior values: if the first two values tie for rank 1, the third value is assigned rank 3.

# op.avg_rank() · Source

Window function to assign a fractional (average) rank to each value in a group, starting from 1. Peer values are assigned the average of their indices: if the first two values tie, both will be assigned rank 1.5.

# op.dense_rank() · Source

Window function to assign a dense rank to each value in a group, starting from 1. Peer values are assigned the same rank. Subsequent ranks do not reflect the number of prior values: if the first two values tie for rank 1, the third value is assigned rank 2.

# op.percent_rank() · Source

Window function to assign a percentage rank to each value in a group. The percent is calculated as (rank - 1) / (group_size - 1).

# op.cume_dist() · Source

Window function to assign a cumulative distribution value between 0 and 1 to each value in a group.

# op.ntile(num) · Source

Window function to assign a quantile (e.g., percentile) value to each value in a group. Accepts an integer parameter indicating the number of buckets to use (e.g., 100 for percentiles, 5 for quintiles).

num: The number of buckets for ntile calculation.

# op.lag(field[, offset, defaultValue]) · Source

Window function to assign a value that precedes the current value by a specified number of positions. If no such value exists, returns a default value instead.

field: The data column or derived field.
offset: The lag offset (default 1) from the current value.
defaultValue: The default value (default undefined).

# op.lead(field[, offset, defaultValue]) · Source

Window function to assign a value that follows the current value by a specified number of positions. If no such value exists, returns a default value instead.

field: The data column or derived field.
offset: The lead offset (default 1) from the current value.
defaultValue: The default value (default undefined).

# op.first_value(field) · Source

Window function to assign the first value in a sliding window frame.

field: The data column or derived field.

# op.last_value(field) · Source

Window function to assign the last value in a sliding window frame.

field: The data column or derived field.

# op.nth_value(field[, nth]) · Source

Window function to assign the nth value in a sliding window frame (counting from 1), or undefined if no such value exists.

field: The data column or derived field.
nth: The nth position, starting from 1.

# op.fill_down(field[, defaultValue]) · Source

Window function to fill in missing values with preceding values. Returns the value at the current window position if it is valid (not null, undefined, or NaN), otherwise returns the first preceding valid value. If no such value exists, returns the default value.

field: The data column or derived field.
defaultValue: The default value (default undefined).

# op.fill_up(field[, defaultValue]) · Source

Window function to fill in missing values with subsequent values. Returns the value at the current window position if it is valid (not null, undefined, or NaN), otherwise returns the first subsequent valid value. If no such value exists, returns the default value.

field: The data column or derived field.
defaultValue: The default value (default undefined).

This site is open source. Improve this page.