聚合分组语法格式
{ $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... } }
_id 字段是必须要的,如果不指定字段进行分组则用 null,表示不分组的统计;
分组内存使用限制是100M,默认情况下如果超过了限制100M则会出现错误。如果想对超过100M的大数据进行处理,可以使用 allowDiskUse 选项来进行分组时写到磁盘临时文件中处理。
聚合查询:https://docs.mongodb.com/manual/reference/method/db.collection.aggregate/#db.collection.aggregate
分组内存使用超过限制时错误:
{ "message" : "Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", "stack" : "MongoError: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in." + "at queryCallback (/tmp/.mount_nosqlbO7RhZG/app/resources/app.asar/node_modules/mongodb-core/lib/cursor.js:223:25)" + "at /tmp/.mount_nosqlbO7RhZG/app/resources/app.asar/node_modules/mongodb-core/lib/connection/pool.js:541:18" + "at _combinedTickCallback (internal/process/next_tick.js:131:7)" + "at process._tickCallback (internal/process/next_tick.js:180:9)", "name" : "MongoError", "ok" : 0, "errmsg" : "Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", "code" : 16945, "codeName" : "Location16945" }
解决错误,使用使用 allowDiskUse 选项来进行分组聚合查询:
db.stocks.aggregate( [ { $project : { cusip: 1, date: 1, price: 1, _id: 0 } }, { $sort : { cusip : 1, date: 1 } } ], { allowDiskUse: true } );
聚合分组案例
准备测试数据:
db.sales.insert([ { "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-03-01T08:00:00Z") }, { "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-03-01T09:00:00Z") }, { "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-03-15T09:00:00Z") }, { "_id" : 4, "item" : "xyz", "price" : 5, "quantity" : 20, "date" : ISODate("2014-04-04T11:21:39.736Z") }, { "_id" : 5, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-04-04T21:23:13.331Z") } ]);
不指定字段进行分组
db.sales.aggregate( [ { $group : { _id : null, minPrice:{$min:"$price"}, //最小值 maxPrice:{$max:"$price"}, //最大值 avgPrice:{ $avg: "$price" }, //平均值 totalPrice: { $sum: { $multiply: [ "$price", "$quantity" ] } }, //总价 $sum(price * quantity) lastItem:{$last:"$item"}, //取最后一条 count: { $sum: 1 } // 总条目数 } } ] );
多字段分组统计
db.sales.aggregate( [ { $group : { _id : {"商品":"$item", "价格":"$price"}, "总条目数":{$sum:1} } } ] ); // item,price 多字段分组
根据日期年、月、日进行分组统计
db.sales.aggregate( [ { $group : { _id : { year: { $year: "$date" }, month: { $month: "$date" }, day: { $dayOfMonth: "$date" } }, totalPrice: { // $sum: 求和操作,$multiply:乘法操作 $sum: { $multiply: [ "$price", "$quantity" ] } }, averageQuantity: { $avg: "$quantity" }, //$avg:求平均值 count: { $sum: 1 } } } ] );
使用$push 聚合分组指定到列结果到数组中
db.books.insert([ { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }, { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }, { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }, { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }, { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 } ]); db.books.aggregate( [ { $group : { _id : "$author", books: { $push: "$title" } } } ] );
聚合分组使用 $$ROOT 系统变量
db.books.aggregate( [ { $group : { _id : "$author", books: { $push: "$$ROOT" } } } ] );//根据author分组,然后将对应的文档聚合到(books)数组中
参考:https://docs.mongodb.com/manual/reference/operator/aggregation/group/#pipe._S_group