About: This article introduces the Aggregate node, a Compile node within Construct.
Location: Node panel
Feature Overview
The Aggregate node allows users to aggregate data into summary statistics. The following aggregates can be taken:
Min: Minimum value for a field
Max: Maximum value for a field
Sum: Sum of all non-missing values for a field
Mean: Average of all non-missing values for a field
Count: Count of all non-missing values for a field
Count Distinct: Number of distinct values for a field
First: The first value encountered in a field
Standard Deviation: The standard deviation of a numeric field
Aggregating your data can be a useful way to get a summary of an entire dataset, flatten a file (i.e. condense records down to one row per field category), bring data into a chart node, explore your data as you prepare it, or add data as a table to a report.
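To make the aggregates listed above concrete, here is a minimal Python sketch that computes each one for a single field. It is an illustration only and not how Construct implements the node. The function name summarize, the sample values, the use of None to mark a missing value, the choice of the sample standard deviation, and the choice to skip missing values when taking First are all assumptions made for the example.

```python
import statistics

def summarize(values):
    """Compute the listed aggregates for one field (hypothetical helper).

    None marks a missing value; this is an assumption for the sketch.
    """
    present = [v for v in values if v is not None]
    return {
        "min": min(present),
        "max": max(present),
        "sum": sum(present),                    # non-missing values only
        "mean": statistics.mean(present),       # non-missing values only
        "count": len(present),                  # number of non-missing values
        "count_distinct": len(set(present)),    # number of distinct values
        "first": present[0],                    # assumption: first non-missing value
        "std_dev": statistics.stdev(present),   # assumption: sample standard deviation
    }

# Hypothetical sale amounts with one missing entry
sale_amounts = [100, 250, None, 100, 300]
summary = summarize(sale_amounts)
```

With these sample values the missing entry is ignored, so Count is 4 while the field has 5 records, and Count Distinct is 3 because 100 appears twice.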
Configuring an Aggregate Node
Aggregates or summary statistics can be taken for the entire dataset, which produces an output with just one row of data. Alternatively, the Aggregate node can be configured to group the data by the categories that exist within one or more fields and then produce an aggregate for each category. This produces an output with as many rows of data as there are unique values of the Aggregate By variable. If multiple Aggregate By variables are selected, the output will contain one row per combination of unique values for all Aggregate By variables.
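The relationship between the Aggregate By selection and the number of output rows can be sketched in Python as follows. This is a rough illustration, not Construct's implementation: the function aggregate_by, the field names region, product, and units, and the sample rows are hypothetical. The sketch also assumes that only combinations that actually occur in the data produce a row.

```python
from collections import defaultdict

def aggregate_by(rows, by_fields, value_field):
    """Sum value_field for each combination of values in by_fields.

    Hypothetical helper: an empty by_fields list summarizes the whole dataset.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in by_fields)
        values = groups[key]              # creates the group even if all values are missing
        if row[value_field] is not None:  # skip missing values
            values.append(row[value_field])
    return {key: sum(values) for key, values in groups.items()}

# Hypothetical sample data
rows = [
    {"region": "East", "product": "A", "units": 5},
    {"region": "East", "product": "B", "units": 3},
    {"region": "West", "product": "A", "units": 7},
    {"region": "East", "product": "A", "units": 2},
]

overall = aggregate_by(rows, [], "units")                     # one row for the entire dataset
by_region = aggregate_by(rows, ["region"], "units")           # one row per unique region
by_both = aggregate_by(rows, ["region", "product"], "units")  # one row per region-product pair
```

Here overall holds a single entry, by_region holds two (East and West), and by_both holds three, one for each region-product combination present in the sample rows.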
Example
Consider a dataset that records, for several employees, the amount of each sale made and the month in which the sale occurred.
The Aggregate node can be used to summarize this dataset into several more useful outputs.
Option 1: Aggregate By the Month and take the Sum of the Sale Amount to produce an output with one row per month containing the sum of all sales for that month.
Option 2: Aggregate By the Employee_Name and take the Sum of the Sale Amount to produce an output with one row per employee containing the sum of all sales that person has made.
Option 3: Aggregate By both the Month and Employee_Name and take the Sum of the Sale Amount to produce an output with one row per employee-month combination containing the sum of all sales that person has made in that month. Additionally, taking the Count of the Sale Amount adds the number of sales each employee made during each month to the output.
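The three options can be mirrored in a short Python sketch. The sales records below are invented for illustration and are not the dataset from this article. The helper sum_and_count and the field name Sale_Amount (written with an underscore) are also assumptions, and the code only approximates what the Aggregate node would output.

```python
from collections import defaultdict

def sum_and_count(rows, by_fields, value_field="Sale_Amount"):
    """Return the Sum and Count of value_field per Aggregate By combination."""
    totals = defaultdict(lambda: {"sum": 0, "count": 0})
    for row in rows:
        key = tuple(row[f] for f in by_fields)
        totals[key]["sum"] += row[value_field]
        totals[key]["count"] += 1
    return dict(totals)

# Hypothetical sales records (no missing values)
sales = [
    {"Employee_Name": "Ana", "Month": "Jan", "Sale_Amount": 100},
    {"Employee_Name": "Ana", "Month": "Jan", "Sale_Amount": 50},
    {"Employee_Name": "Ben", "Month": "Jan", "Sale_Amount": 200},
    {"Employee_Name": "Ana", "Month": "Feb", "Sale_Amount": 75},
    {"Employee_Name": "Ben", "Month": "Feb", "Sale_Amount": 125},
    {"Employee_Name": "Ben", "Month": "Feb", "Sale_Amount": 25},
]

option_1 = sum_and_count(sales, ["Month"])                   # one row per month
option_2 = sum_and_count(sales, ["Employee_Name"])           # one row per employee
option_3 = sum_and_count(sales, ["Month", "Employee_Name"])  # one row per month-employee pair
```

For these sample records, Option 1 yields two rows, Option 2 yields two rows, and Option 3 yields four rows, each carrying both the total sale amount and the number of sales.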