Aggregator
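The Aggregator class performs aggregation. Grouping and aggregation are implemented through two interfaces, Aggregate and GroupBy. Predefined implementations exist for the standard aggregates (maximum, minimum, sum, average, count and so on), and user-defined aggregates can be added. The GroupBy interface is meant to be implemented by the developer (an anonymous class is convenient); it defines how the query aggregates, that is, how the input data is split into groups for calculation.
Aggregator uses a map to associate aggregate state with each group, and this map is returned as the result of the aggregation. The aggregator can use either an ordered or an unordered map (a TreeMap or a HashMap). An ordered map returns the results in ascending order of the grouping value. For example, consider the following table: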
class Quote
{
    @Indexable
    public long date;
    public float open;
    public float close;
    public float low;
    public float high;
    public int volume;
};
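Now, to run a query such as "the standard deviation of the difference between the low and high price of IBM, by month, since 1990", we would write code like the following: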
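// Requires java.util.Calendar, java.util.Date, java.util.GregorianCalendar and java.util.Map.
Cursor<Quote> cursor = new Cursor<Quote>(con, Quote.class, "date");
// new Date(1990, 0, 1) would denote the year 3890, so the start date is built with a calendar.
long since = new GregorianCalendar(1990, Calendar.JANUARY, 1).getTimeInMillis();
if (cursor.search(Operation.GreaterOrEquals, since))
{
    Map<Object,Aggregator.Aggregate> result = Aggregator.<Quote>aggregate(cursor,
        new Aggregator.GroupBy<Quote>()
        {
            // One standard-deviation aggregate is created per group.
            public Aggregator.Aggregate getAggregate()
            {
                return new Aggregator.DevAggregate();
            }
            // Grouping key: month of the year (0 = January).
            public Object getKey(Quote quote)
            {
                return (new Date(quote.date)).getMonth();
            }
            // Aggregated value: spread between the high and low price.
            public Object getValue(Quote quote)
            {
                return quote.high - quote.low;
            }
            // Include every quote in the aggregation.
            public Aggregator.FilterResult filter(Quote quote)
            {
                return Aggregator.FilterResult.Use;
            }
        }, true);
    for (Map.Entry<Object,Aggregator.Aggregate> pair : result.entrySet())
    {
        System.out.println("Group " + pair.getKey() + "->" + pair.getValue().result());
    }
}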
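To see what this query computes without a database connection, the same grouping can be approximated with plain JDK collections. The sketch below is not the Aggregator implementation: the class MonthlySpreadSketch, its helpers stdDevByMonth and millis, and the sample data are all made up for illustration, and it assumes a population standard deviation, which this page does not specify for DevAggregate.

```java
import java.util.ArrayList;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in; not part of the Aggregator API.
class MonthlySpreadSketch {
    // Minimal copy of the Quote fields that the query uses.
    static class Quote {
        final long date;
        final float low;
        final float high;
        Quote(long date, float low, float high) {
            this.date = date;
            this.low = low;
            this.high = high;
        }
    }

    // Groups quotes by month of year (0 = January) and returns the
    // population standard deviation of (high - low) for each group.
    static Map<Integer, Double> stdDevByMonth(List<Quote> quotes) {
        Map<Integer, List<Double>> groups = new TreeMap<>();
        Calendar cal = new GregorianCalendar();
        for (Quote q : quotes) {
            cal.setTimeInMillis(q.date);
            int month = cal.get(Calendar.MONTH);
            groups.computeIfAbsent(month, k -> new ArrayList<>())
                  .add((double) (q.high - q.low));
        }
        Map<Integer, Double> result = new TreeMap<>();
        for (Map.Entry<Integer, List<Double>> e : groups.entrySet()) {
            List<Double> values = e.getValue();
            double mean = 0;
            for (double v : values) mean += v;
            mean /= values.size();
            double variance = 0;
            for (double v : values) variance += (v - mean) * (v - mean);
            variance /= values.size();
            result.put(e.getKey(), Math.sqrt(variance));
        }
        return result;
    }

    // Milliseconds since the epoch for a calendar date in the default time zone.
    static long millis(int year, int month, int day) {
        return new GregorianCalendar(year, month, day).getTimeInMillis();
    }

    public static void main(String[] args) {
        List<Quote> quotes = new ArrayList<>();
        quotes.add(new Quote(millis(1990, Calendar.JANUARY, 2), 10f, 12f));  // spread 2
        quotes.add(new Quote(millis(1990, Calendar.JANUARY, 3), 10f, 14f));  // spread 4
        quotes.add(new Quote(millis(1990, Calendar.FEBRUARY, 1), 20f, 25f)); // spread 5
        for (Map.Entry<Integer, Double> e : stdDevByMonth(quotes).entrySet()) {
            System.out.println("Group " + e.getKey() + "->" + e.getValue());
        }
    }
}
```

Because the key is the month of the year, quotes from January 1990 and January 1991 would land in the same group; a key that combines year and month would be needed to keep them apart.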
Class definition
public class Aggregator
{
    ...
    public enum FilterResult
    {
        Use,
        Skip,
        Stop
    };
    public interface Aggregate<T> {…}
    public interface GroupBy<T> {…}
    public static <T> Map<Object,Aggregate> ... {…}
    public static void merge(Map<Object,Aggregate> dst, Map<Object,Aggregate> src) {…}
    public static class TopAggregate implements Aggregate<Comparable> {…}
    public static class MaxAggregate implements Aggregate<Comparable> {…}
    public static class MinAggregate implements Aggregate<Comparable> {…}
    public static class RealSumAggregate implements Aggregate<Number> {…}
    public static class IntegerSumAggregate implements Aggregate<Number> {…}
    public static class AvgAggregate implements Aggregate<Number> {…}
    public static class PrdAggregate implements Aggregate<Number> {…}
    public static class VarAggregate implements Aggregate<Number> {…}
    public static class DevAggregate extends VarAggregate {…}
    public static class CountAggregate implements Aggregate {…}
    public static class DistinctCountAggregate implements Aggregate {…}
    public static class RepeatCountAggregate implements Aggregate {…}
    public static class ApproxDistinctCountAggregate implements Aggregate {…}
    public static class FirstAggregate implements Aggregate {…}
    public static class LastAggregate implements Aggregate {…}
    public static class CompoundAggregate implements Aggregate {…}
};
Methods
FilterResult
enum FilterResult
Enumeration constants used to control the filtering of query results:
public enum FilterResult
{
    Use,
    Skip,
    Stop
};
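This page does not spell out what each constant does. A plausible reading, taken here purely as an assumption, is that Use includes the current object, Skip leaves it out and continues, and Stop ends the scan. The sketch below illustrates that reading with a local copy of the enum; FilterSketch and scan are hypothetical names, not library code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in; mirrors the enum above but is not the library code.
class FilterSketch {
    enum FilterResult { Use, Skip, Stop }

    // Applies the filter to each item: Use keeps it, Skip drops it,
    // Stop ends the scan (assumed semantics, not confirmed by the source).
    static <T> List<T> scan(Iterable<T> items, Function<T, FilterResult> filter) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            FilterResult r = filter.apply(item);
            if (r == FilterResult.Stop) break;
            if (r == FilterResult.Use) kept.add(item);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> volumes = Arrays.asList(5, 0, 7, -1, 9);
        // Skip zero volumes and stop at the first negative value.
        List<Integer> kept = scan(volumes, v ->
            v < 0 ? FilterResult.Stop : v == 0 ? FilterResult.Skip : FilterResult.Use);
        System.out.println(kept);
    }
}
```

Under these assumptions the value 9 is never examined, because the scan ends as soon as the filter returns Stop for -1.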
Aggregate
Aggregate<T>: implemented by all standard aggregates; it can also be used to define custom aggregates
GroupBy
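GroupBy<T>: implemented by the developer to define how a query is aggregated, that is, which aggregate is created for each group, how the grouping key and the aggregated value are derived from an object, and how objects are filtered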
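aggregate
<T> Map<Object,Aggregate> aggregate(Iterable<T> iterable, GroupBy<T> groupBy, boolean orderByKey): performs the aggregation and returns the map of groups to aggregate states;
Parameters:
- iterable: the collection of objects to aggregate
- groupBy: the aggregation operation
- orderByKey: whether an ordered map (TreeMap) should be used for grouping (in that case the grouping keys must support comparison)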
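The effect of orderByKey can be illustrated with plain JDK maps. The following is a minimal sketch, not the library's code: OrderByKeySketch and countByKey are hypothetical names, and a simple count stands in for the aggregate state.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in; not part of the Aggregator API.
class OrderByKeySketch {
    // Counts occurrences per key. With orderByKey set, a TreeMap keeps the
    // groups in ascending key order, so keys must be mutually comparable.
    static Map<Object, Integer> countByKey(Iterable<?> keys, boolean orderByKey) {
        Map<Object, Integer> groups;
        if (orderByKey) {
            groups = new TreeMap<Object, Integer>();
        } else {
            groups = new HashMap<Object, Integer>();
        }
        for (Object key : keys) {
            groups.merge(key, 1, Integer::sum);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("mar", "jan", "feb", "jan");
        // Ordered: iteration follows the natural ordering of the keys.
        System.out.println(countByKey(keys, true));
        // Unordered: same groups and counts, iteration order unspecified.
        System.out.println(countByKey(keys, false));
    }
}
```

With a TreeMap, a key type that does not implement Comparable fails at run time with a ClassCastException, which is why the grouping keys need a comparison operation when orderByKey is true.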
merge
void merge(Map<Object,Aggregate> dst, Map<Object,Aggregate> src): merges two aggregation results. This method combines the aggregate states in dst with the aggregate states in src;
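One way to picture the merge is as a per-key fold of src into dst, for instance when partial results were computed over separate ranges of the data. The sketch below rests on assumptions: MergeSketch and its Sum class are hypothetical stand-ins for the library types, and copying groups that exist only in src into dst is a guess at the behaviour rather than something this page states.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; not part of the Aggregator API.
class MergeSketch {
    // Hypothetical aggregate state: a running sum.
    static class Sum {
        long total;
        Sum(long total) { this.total = total; }
        void merge(Sum other) { total += other.total; }
        long result() { return total; }
    }

    // Folds every group of src into dst: states are combined for keys present
    // in both maps, and groups found only in src are copied (an assumption).
    static void merge(Map<Object, Sum> dst, Map<Object, Sum> src) {
        for (Map.Entry<Object, Sum> e : src.entrySet()) {
            Sum existing = dst.get(e.getKey());
            if (existing == null) {
                dst.put(e.getKey(), new Sum(e.getValue().total));
            } else {
                existing.merge(e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<Object, Sum> dst = new HashMap<>();
        dst.put("a", new Sum(1));
        dst.put("b", new Sum(2));
        Map<Object, Sum> src = new HashMap<>();
        src.put("b", new Sum(3));
        src.put("c", new Sum(4));
        merge(dst, src);
        for (Map.Entry<Object, Sum> e : dst.entrySet()) {
            System.out.println("Group " + e.getKey() + "->" + e.getValue().result());
        }
    }
}
```

Only dst is modified in this sketch; src is left untouched so it can be merged elsewhere or discarded.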