group

Use the "group" directive to collect rows according to shared criteria. Use this directives sub-directives to perform convenient operations on each group.

Grouping rows in a worksheet based on common criteria is a powerful data processing technique. Users may group rows of data per specified criteria. Once rows are grouped, other operations can be applied to the group as an added convenience.

For example, consider a worksheet with columns "Account ID" and "Date". Use this directive to group the rows in this worksheet.

pebblestream:from(Accounts)
pebblestream:group("Account ID", "Date")
📘

Grouping Feature

All rows with matching "Account ID" and "Date" values are put next to each other. This often results in easier data processing.

📘

Grouping parameters are optional!

If no parameters are provided to the pebblestream:group() directive, it will behave as if all rows in the worksheet belong to a single group.

coalesce

Here is the sub-directive coalesce declaration.

pebblestream:coalesce()

After group calculations are performed, coalesce each group into a single row.

📘

If the pebblestream:coalesce directive is not specified, it is applied by default.

🚧

Coalesce is not compatible with ratio, balance, or merge sub-directives.

expand

Here is the sub-directive expand declaration

pebblestream:expand()

This maintains the original rows of the grouped worksheet and assigns the grouped result to each row. Use this to relate each row to the group's computed value.

For instance, to compute the relative size of a column's value to each member of the group, use the pebblestream:min directive with expand, then divide each row's column value by the group's minimum.

pebblestream:from("Customer Orders")
pebblestream:group("customer id")
pebblestream:min("order amount", "minimum order")
pebblestream:expand()

📘

The pebblestream:expand() directive facilitates matching of group results with the original rows of the group. This reduces the need for complicated INDEX-MATCHES.

sum

Use this directive to get the sum of the values in a column for rows grouped by criteria.

pebblestream:from(Amounts)
pebblestream:group("Last", "First")
pebblestream:sum("Amount"[, "Result"])

The "Amount" rows are summed across all rows with the same "Last" and "First" values. If "Result" is provided, a new column called "Result" is created with the value. Otherwise, the "Amount" column is updated with the sum.

Sum Example

Sum up all transfers for the each person in the Transactions worksheet.

FirstLastTransfer
JohnJones-60
AdamJones186
CathyAlbertson48794
JohnJones545
CathyAlbertson8777

Here is the directive that accomplishes this task.

pebblestream:from(Transactions)
pebblestream:group("Last", "First")
pebblestream:sum("Transfer")
pebblestream:coalesce()

This is the resulting worksheet.

FirstLastTransfer
JohnJones485
AdamJones186
CathyAlbertson57571

If the sub-directive "expand" is used instead of coalesce, the following is the resulting worksheet:

FirstLastTransfer
JohnJones485
JohnJones485
CathyAlbertson48794
CathyAlbertson48794
AdamJones186
📘

Computation Tip

It can be computationally convenient to not coalesce group results.

mult

Mult Example

pebblestream:mult("Factor"[, "Result"])

The "Factor" rows are multiplied together. If "Result" is provided, a new column called "Result" is created with the value. Otherwise, the "Factor" column is updated.

Discount IDAccount IDDiscount
110.9
220.8
310.95

Apply the following directive declaration:

pebblestream:from(Discounts)
pebblestream:group("Account ID")
pebblestream:mult("Discount")
pebblestream:coalesce()

To obtain the resulting worksheet.

Account IDDiscount
10.855
20.8
📘

Computation Tip

It can be computationally convenient to not coalesce group results.

min

The pebblestream:min() directive returns the minimum value for each group.

pebblestream:from(Discounts)
pebblestream:group("Account ID")
pebblestream:min("Discount")
pebblestream:coalesce()

max

The pebblestream:max() directive returns the maximum value for each group.

pebblestream:from(Discounts)
pebblestream:group("Account ID")
pebblestream:max("Discount")
pebblestream:coalesce()

average

The pebblestream:average directive returns the average value for each group.

pebblestream:from(Discounts)
pebblestream:group("Account ID")
pebblestream:average("Discount")
pebblestream:coalesce()

equal

The equal sub-directive group's rows that have equal values in the provided column.

pebblestream:equal("Amount"[, "Result"])

The "Amount" rows are checked for whether they are all equal. If "Result" is provided, a new column called "Result" is created with the value. Otherwise, the "Amount" column is updated.
Requires group.

Equal Example

Consider this worksheet Logins

Account IDAccount User# IP Address
12127.0.38.9
11127.0.5.8
11127.0.5.8
12127.0.48.2
21127.0.59.12

User the following directive declaration:

pebblestream:from(Logins)
pebblestream:group("Account ID", "Account User")
pebblestream:equal("IP Address", "Same IP?")
pebblestream:coalesce()

to obtain the worksheet Same IP

Account IDAccount UserSame IP?
12FALSE
11TRUE
21TRUE

ratio

The "ratio" sub-directive performs a typical operation amongst accountants and financial analysts. Given a group of numbers, determine the relative weight of each number in the group.

pebblestream:ratio("Amount"[, "Result"])

The "Amount" rows are summed together and the ratio of each row is calculated based on the total of the group's values. If "Result" is provided, a new column called "Result" is created with the row's group ratio. Otherwise, the "Amount" column is updated.

❗️

Important

The ratio directive cannot be used with coalesce.

Ratio Example

Consider the following worksheet of Transactions

FirstLastValue
JohnJones3
AdamJones5
JohnJones1
CathyAlbertson-1
CathyAlbertson1
JohnJones0

Using the following directive declaration

pebblestream:from(Transactions)
pebblestream:group("Last", "First")
pebblestream:ratio("Value", "Ratio")

will produce the following Ratio worksheet when placed in a comment in the header row.

FirstRatio
John0.75
John0.25
John0.0
Adam1.0
Cathy0
Cathy0

regex

❗️

This function is not supported at this time due to inconsistencies in behavior accross platforms. A future implementation of this directive is not guaranteed to be backwards compatible!

Checks whether every row in "Name" matches the provided regular expression. If "Result" is provided, a new column called "Result" is created with the result. Otherwise, the "Name" column is updated.

pebblestream:regex("Name", "[0-9]*MyRegex.*"[, "Result"])

Regex Example

Consider the worksheet Transactions,

Account IDAccount User #AmountTags
12894home verified
1145981office verified
1115home
125958verified home
21255verified mobile

Determine whether every transaction for a given user was verified using the following directive declaration:

pebblestream:from(Transactions)
pebblestream:group("Account ID", "Account User #")
pebblestream:regex("Tags", ".*\\bverified\\b.*", "User always verified?")

Placing this declaration in the comment section of the header section of the Verification Status worksheet produces the result below.

Account IDAccount User #AmountUser Always Verified?
12894TRUE
125958TRUE
1145981FALSE
1115FALSE
21255TRUE

balance

The "balance" sub-directive performs a typical operation amongst accountants and financial analysts. Given a total amount segmented into sub-amounts by a series of sharing percentages totaling 1, ensure that the sum of the rounded shares is equal to the rounded original total amount.

This directive requires three parameters: the first is the column header name of the allocated amount, the second is the name of the column that will receive the rounded and balanced amount, and the final parameter is the name of the column that will receive the adjustment added to the floored amount to achieve the balanced amount.

balanced = floor(allocation) + adjustment

Here is an example:

pebblestream:balance(“allocation”, “balanced”,”adjustment”)
❗️

Important

The balance directive cannot be used with coalesce.

Balance Example

yield

Calculates the yield of cash flows using the Newton-Raphson method. It takes five parameters, namely:

cash flow range: the header name in the from worksheet that contains the inflow (positive) and out flows (negative) payments.

period fractions: the header name containing the associated fractions of periods that associate with each payment. Note that the yield is calculated as if the payment was received at the end of the period.

guess: the initial yield guess

frequency: the number of periods in a year

output header name: the name of the column to place the computed yield in the target worksheet

For each group of cash flow payments provided in the cash flow column a single yield will be computed.

pebblestream:yield("cashflow","period fracs",guess,frequency,"computed-yield")

Example

npv

Given an interest rate, a series of payments, a series of corresponding period fractions, and a frequency, compute the net present values.

cash flow range: the header name in the from worksheet that contains the inflow (positive) and out flows (negative) payments.

period fractions: the header name containing the associated fractions of periods that associate with each payment. Note that the yield is calculated as if the payment was received at the end of the period.

interest rate: the rate of return

frequency: the number of periods in a year

output header name: the name of the column to place the computed npvs in the target worksheet

For each group of cash flow payments provided in the cash flow column a corresponding series of net present values will be computed.

pebblestream:npv("cashflow","period fracs",rate,frequency,"computed-npvs")

Example

pivot

Use the "pivot" sub-directive to reshape entity-attribute-value (EAV) data into a standard tabular worksheet. This is a sub-directive of group.

Data is often stored or received in a "triples" format — where each row represents a single attribute of an entity, with columns for the entity identifier, the attribute name (category), and the attribute value. The pivot directive transforms this vertical arrangement into a horizontal, tabular layout where each attribute becomes its own column.

This is especially useful when integrating with key-value stores, RDF triples, or any system that emits data in an entity-attribute-value pattern.

Declaration

pebblestream:pivot("Category Column", "Category Value", "Value Column", type[, default])

Parameters

ParameterRequiredDescription
Category ColumnYesThe name of the column containing attribute names (e.g., "category").
Category ValueYesThe specific attribute value to extract into a new column (e.g., "first name"). This becomes the header of the new column.
Value ColumnYesThe name of the column containing the data values (e.g., "value").
typeYesThe data type of the resulting column. Supported types: text, integer, number, boolean, date.
defaultNoA default value to use when the attribute's value is blank or the attribute is missing for a given entity. If omitted, blank values remain blank.
📘

Requires group

The pivot directive is a sub-directive of group. The group directive defines the entity identifier — the column(s) whose unique values determine each row in the pivoted output.

📘

Multiple pivot directives

Declare one pivot directive per attribute you want to extract. Each pivot call produces one new column in the resulting worksheet.

Example

Pivot Example

Consider the worksheet Triples, which stores person records in entity-attribute-value format.

idcategoryvalue
2age
1first nameJason
1age21
1married?FALSE
2first nameSandra
1last nameBourne
2last nameMonroe
2married?TRUE
3age234
3last nameNkrumah
4first nameKwesi
3first name
4married?TRUE
4age53

Note that the rows are in no particular order, some entities have blank values (id 2's age, id 3's first name), and some entities are missing attributes entirely (id 4 has no last name, id 3 has no married? attribute).

Apply the following directive declaration:

pebblestream:from("Triples")
pebblestream:group("id")
pebblestream:pivot("category", "first name", "value", text)
pebblestream:pivot("category", "age", "value", integer, 34)
pebblestream:pivot("category", "married?", "value", boolean, false)
pebblestream:pivot("category", "last name", "value", text)

To obtain the Pivoted worksheet:

idfirst nameagemarried?last name
2Sandra34TRUEMonroe
1Jason21FALSEBourne
3234FALSENkrumah
4Kwesi53TRUE
📘

How defaults work

When a default value is specified, it is used in two situations: when the category row exists but its value is blank (id 2's age was blank and received the default of 34), and when the category row is missing entirely for a given entity (id 3 had no "married?" row and received the default of false). When no default is specified, blank or missing values remain blank (id 3's first name, id 4's last name).

📘

Type enforcement

The type parameter ensures each pivoted column contains consistently typed data. This is important because values stored in EAV format are often untyped. Specifying integer, number, boolean, text, or date allows the Pebble Stream runtime to correctly interpret and validate values in the resulting column.

Coalesce behavior

The pivot directive implicitly coalesces grouped rows. Each entity identified by the group key produces exactly one row in the output — there is no need to add a separate pebblestream:coalesce() directive.

❗️

Important

The pivot directive cannot be used with the expand sub-directive. Each group always collapses to a single row.

Combining pivot with other group sub-directives

The pivot directive may be combined with other group sub-directives such as sum, min, max, and average to perform aggregations on the pivoted columns within the same directive declaration.

When to use pivot

The pivot directive is ideal when:

  • Data arrives from a key-value store, triple store, or EAV database schema.
  • An upstream system emits configuration or metadata as attribute-value pairs.
  • You need to normalize irregularly structured data into a standard tabular format for downstream pebble processing.

Example


group