MongoDB Count() versus aggregatie

.count() gaat veel sneller. U kunt de implementatie zien door te bellen met

// Note the missing parentheses at the end
db.collection.count

die de lengte van de cursor retourneert. van de standaardquery (if count() wordt aangeroepen zonder querydocument), wat op zijn beurt wordt geïmplementeerd als het retourneren van de lengte van de _id_ index, iirc.

Een aggregatie leest echter elk document en verwerkt het. Dit kan slechts halverwege dezelfde orde van grootte zijn met .count() wanneer je het doet over slechts zo'n 100k aan documenten (geven en nemen volgens je RAM).

Onderstaande functie is toegepast op een collectie met zo'n 12 miljoen inzendingen:

function checkSpeed(col,iterations){

  // Get the collection
  var collectionUnderTest = db[col];

  // The collection we are writing our stats to
  var stats = db[col+'STATS']

  // remove old stats
  stats.remove({})

  // Prevent allocation in loop
  var start = new Date().getTime()
  var duration = new Date().getTime()

  print("Counting with count()")
  for (var i = 1; i <= iterations; i++){
    start = new Date().getTime();
    var result = collectionUnderTest.count()
    duration = new Date().getTime() - start
    stats.insert({"type":"count","pass":i,"duration":duration,"count":result})
  }

  print("Counting with aggregation")
  for(var j = 1; j <= iterations; j++){
    start = new Date().getTime()
    var doc = collectionUnderTest.aggregate([{ $group:{_id: null, count:{ $sum: 1 } } }])
    duration = new Date().getTime() - start
    stats.insert({"type":"aggregation", "pass":j, "duration": duration,"count":doc.count})
  }

  var averages = stats.aggregate([
   {$group:{_id:"$type","average":{"$avg":"$duration"}}} 
  ])

  return averages
}

En keerde terug:

{ "_id" : "aggregation", "average" : 43828.8 }
{ "_id" : "count", "average" : 0.6 }

De eenheid is milliseconden.

hde