On: some gotchas working with mongoDB


UPDATE: Version 3.x of the Node.js driver introduces a number of breaking changes compared to the 2.x versions. For one, you will need to use projections instead of field selectors. Also, to hand a query result back from a function you will need to accept a callback and call callback(result), which passes the value to the caller.

function GetTenantIDFromName(exp, db){
  let rxPtt = new RegExp(exp, 'gi');
  let tID = db.collection('Tenant')
    .find({ "Name": { $regex: rxPtt }},{"_id":1})
    .toArray();   
  return tID;
}

// possible way to call this
let tID = GetTenantIDFromName(tenant, db);
tID.then(function(polluterTenantId){
  console.log(polluterTenantId);
});

function GetTenantIDFromName2(exp, db, callback) {
  const rxPtt = new RegExp(exp, 'gi');
  const coll = db.collection('Tenant');
  coll.find({ "Name": { $regex: rxPtt }}).project({"_id":1}).toArray(
    function(err, res){
      if(err) throw err;
      callback(res[0]['_id']);
    });
}

// calling this
GetTenantIDFromName2(tenant, db, function(res){
  console.log(res);
})
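
The two styles above can also be combined. A minimal sketch, assuming driver 3.x behaviour where find() accepts a projection in its options argument and toArray() returns a promise when called without a callback. The names escapeRegExp and getTenantIdFromName are hypothetical helpers, not part of the driver; escaping the input is an addition so that a tenant name containing regex metacharacters is matched literally.

```javascript
// Escape regex metacharacters so a tenant name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Resolve the _id of the first tenant whose Name contains `name`
// (case-insensitive). Resolves to null when nothing matches.
// Assumes a driver 3.x `db` handle: find() takes { projection } in its
// options, and toArray() returns a promise when no callback is passed.
async function getTenantIdFromName(db, name) {
  const pattern = new RegExp(escapeRegExp(name), 'i');
  const docs = await db.collection('Tenant')
    .find({ Name: { $regex: pattern } }, { projection: { _id: 1 } })
    .limit(1)
    .toArray();
  return docs.length > 0 ? docs[0]._id : null;
}

// Usage, assuming `db` is an open connection:
//   getTenantIdFromName(db, tenant).then(id => console.log(id));
```

Unlike GetTenantIDFromName2, this version reports "no match" as null instead of throwing on res[0].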

… and now, let’s return to the regularly scheduled program…


Working with MongoDB can be frustrating or it can be fun. Most of the time when it’s fun – it is for the wrong reasons, like observing others not understanding something 😀 In this article I intend to share some solutions to problems I encountered while working with this database.

Connection string patterns

There are three general ways to connect to mongo:

  1. Using a driver for a particular language: Java, Ruby, JavaScript. This uses the well-documented URI format.
  2. CLI client.
  3. Off-the-shelf mongo GUI client – like Robomongo (Robo 3T) or Studio 3T (self-described mongo IDE).

For the first two of these the connection string pattern is as follows, respectively:

mongodb://admin:{password}@host1:27017,host2:27017,host3:27017/db_name?replicaSet=rs0&authMechanism=MONGODB-CR

mongo --authenticationDatabase admin --authenticationMechanism MONGODB-CR -u admin -p {password} --host rs0/host1:27017,host2:27017,host3:27017 db_name

These examples connect to a replica set (a.k.a. a cluster), which is the most common way to deploy MongoDB in live systems. If you are connecting to a single host, use these:

mongodb://admin:{password}@host:27017/db_name?authMechanism=MONGODB-CR

mongo --authenticationDatabase admin --authenticationMechanism MONGODB-CR -u admin -p {password} --host host:27017 db_name
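
When the URI is assembled in code, two details are easy to miss. Special characters in the username or password must be percent-encoded. Also, the URI counterpart of --authenticationDatabase admin is the authSource=admin option. The sketch below shows both. buildMongoUri and its parameter names are hypothetical, made up for illustration, and the password is a dummy value.

```javascript
// Build a MongoDB connection URI from its parts.
// `buildMongoUri` is a hypothetical helper, not a driver function.
function buildMongoUri({ user, password, hosts, database, options = {} }) {
  // Credentials must be percent-encoded, otherwise characters such as
  // '@', ':' or '/' in a password break the URI.
  const credentials =
    `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const query = Object.entries(options)
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join('&');
  const base = `mongodb://${credentials}@${hosts.join(',')}/${database}`;
  return query ? `${base}?${query}` : base;
}

const uri = buildMongoUri({
  user: 'admin',
  password: 'p@ss:word/1', // dummy value with characters that need encoding
  hosts: ['host1:27017', 'host2:27017', 'host3:27017'],
  database: 'db_name',
  options: { replicaSet: 'rs0', authSource: 'admin' }
});
console.log(uri);
// mongodb://admin:p%40ss%3Aword%2F1@host1:27017,host2:27017,host3:27017/db_name?replicaSet=rs0&authSource=admin
```

An authMechanism value can be passed through options in the same way.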

Changing the hostnames of the replica set members

The official (and unofficial) documentation describes two ways of changing the hostnames of replica set members:

  • High-availability (HA) approach – change the hostnames of the non-primary nodes, then step down the primary and change its hostname;
  • Availability-disrupting approach.

I advise using (and remembering) the HA approach for the simple reason that it requires fewer assumptions and can be used in production environments (so it’s more general). For example, system.replset is a system collection in the local database, and your user needs the appropriate access rights in MongoDB to reach it and run cfg = db.system.replset.findOne({"_id":"rs"}), which is required for the second approach. Needless to say, you might not have access to it. On the other hand, most users will be able to run cfg = rs.conf(); cfg.members[0].host = "blah:27017", which is what the HA approach requires.
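
The config edit in the HA approach can be sketched as a small pure function. renameMembers is a hypothetical helper, not part of the mongo shell, and the hostnames in the usage comment are placeholders. rs.conf(), rs.reconfig() and rs.stepDown() are the shell helpers the procedure relies on.

```javascript
// Return a copy of a replica set config with member hostnames replaced
// according to `renames` (old "host:port" -> new "host:port").
// `renameMembers` is a hypothetical helper for illustration only.
function renameMembers(cfg, renames) {
  const members = cfg.members.map(member =>
    Object.prototype.hasOwnProperty.call(renames, member.host)
      ? Object.assign({}, member, { host: renames[member.host] })
      : member
  );
  return Object.assign({}, cfg, { members });
}

// In the mongo shell, connected to the primary, usage would look like:
//   var cfg = renameMembers(rs.conf(), { "old2:27017": "new2:27017" });
//   rs.reconfig(cfg);
//   // once every secondary has been renamed:
//   rs.stepDown();   // then repeat the rename for the former primary
```

Working on a copy keeps the object returned by rs.conf() unchanged, so the old and new configs can be compared before calling rs.reconfig().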

Making the execution of queries sequential and deterministic

Flipping the values of a field in a MongoDB collection is not as easy as it might initially seem. Let’s say you have a collection named “Car” (it would be more appropriate to call it “Cars”, I know) whose documents have a state, “on” or “off”, stored in a field called “State”. Let’s create such an arrangement:

db.createCollection("Car");
db.Car.insert({"_id" : "1", "ManufacturerID" : "e46dba7", "State" : "On"});
db.Car.insert({"_id" : "2", "ManufacturerID" : "e46dba7", "State" : "On"});
db.Car.insert({"_id" : "3", "ManufacturerID" : "e46dba7", "State" : "Off"});
db.Car.insert({"_id" : "4", "ManufacturerID" : "e46dba7", "State" : "Off"});
db.Car.insert({"_id" : "5", "ManufacturerID" : "e46dba7", "State" : "Off"});

So you want to deterministically swap the values so that every car that was “on” switches to “off” and vice versa.
Let’s start with an obviously wrong solution using the {cursor}.forEach() pattern. Why discuss a wrong solution first? (i) Because it uses a common pattern, it will probably be the first idea that pops into people’s heads; (ii) it is a good baseline to compare the other solutions to.
Approach number 0:
The wrong example illustrates how a familiar way of working with cursors can guide thinking. Because this way of working with cursors is not general, it is not as useful to learn by heart for support staff (I always benefited from committing things to memory when I was supporting live systems, as speed is of the essence there). Thus I would propose learning and using the example that works in the most general cases, and only using this one if ever needed… but let’s continue:
db.Car.find({ManufacturerID: 'e46dba7', State: "Off"})
  .forEach(function(doc){
    db.Car.update(
      {_id: doc._id},
      {$set:{"State": "On"}},
      {writeConcern: { wtimeoutMS: 50000 }}
    );
  });

db.Car.find({ManufacturerID: 'e46dba7', State: "On"})
  .forEach(function(doc){
    db.Car.update(
      {_id: doc._id},
      {$set:{"State": "Off"}},
      {writeConcern: { wtimeoutMS: 50000 }}
    );
  });

This gives us the following result:
mongo_tshoot_1

Approach number 1:
This approach produces a table with all the Cars in the “On” state… which is quite surprising.

var CarsOn = db.Car.find({ManufacturerID: 'e46dba7', State: "On"});
var CarsOff = db.Car.find({ManufacturerID: 'e46dba7', State: "Off"});

while (CarsOn.hasNext()) {
  doc = CarsOn.next();
  db.Car.update(
    {_id: doc._id},
    {$set:{"State": "Off"}},
    {writeConcern: { wtimeoutMS: 50000 }}
  );
}

while (CarsOff.hasNext()) {
  doc = CarsOff.next();
  db.Car.update(
    {_id: doc._id},
    {$set:{"State": "On"}},
    {writeConcern: { wtimeoutMS: 50000 }}
  );
}

mongo_tshoot_2

What is the order of execution in this sequence of statements? We can inspect it by inserting some print() probes:

print("Cursor initialization - before");
var CarsOn = db.Car.find({ManufacturerID: 'e46dba7', State: "On"});
var CarsOff = db.Car.find({ManufacturerID: 'e46dba7', State: "Off"});
print("Cursor initialization - after");

print("CarsOn while - before");
while (CarsOn.hasNext()) {
  print("CarsOn while loop - start");
  doc = CarsOn.next();
  db.Car.update(
    {_id: doc._id},
    {$set:{"State": "Off"}},
    {writeConcern: { wtimeoutMS: 50000 }}
  );
  print("CarsOn while loop - end");
}
print("CarsOn while - after");

print("CarsOff while - before");
while (CarsOff.hasNext()) {
  print("CarsOff while loop start");
  doc = CarsOff.next();
  db.Car.update(
    {_id: doc._id},
    {$set:{"State": "On"}},
    {writeConcern: { wtimeoutMS: 50000 }}
  );
  print("CarsOff while loop end");
}
print("CarsOff while - after");

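The result is as weird as it gets, even though the statements run in the expected order! As in Approach number 0, every car ends up in one and the same state instead of being flipped. WooT! The reason is that find() only builds a cursor: the query is sent to the server on the first hasNext() or next() call, so CarsOff is evaluated only after the first loop has already switched every car to “Off”.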
mongo_tshoot_2
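
To see how deferred cursor evaluation leads to this outcome, here is a toy in-memory model. It is not the mongo shell's implementation: makeCollection and setState are made-up names. The only assumption it encodes is that a query runs on the first hasNext() or next() call rather than when find() is called.

```javascript
// Toy in-memory model of a lazy cursor. This is NOT the mongo shell
// implementation, only an illustration of deferred query execution.
function makeCollection(docs) {
  return {
    docs,
    // Lazy: the filter runs on the first hasNext()/next(), not in find().
    find(filter) {
      let results = null;
      const run = () => {
        if (results === null) {
          results = docs
            .filter(d => d.State === filter.State)
            .map(d => d._id);
        }
      };
      return {
        hasNext() { run(); return results.length > 0; },
        next() { run(); return results.shift(); }
      };
    },
    setState(id, state) {
      docs.find(d => d._id === id).State = state;
    }
  };
}

const car = makeCollection([
  { _id: '1', State: 'On' }, { _id: '2', State: 'On' },
  { _id: '3', State: 'Off' }, { _id: '4', State: 'Off' },
  { _id: '5', State: 'Off' }
]);

const carsOn = car.find({ State: 'On' });   // nothing is evaluated yet
const carsOff = car.find({ State: 'Off' }); // nothing is evaluated yet

while (carsOn.hasNext()) car.setState(carsOn.next(), 'Off');
// carsOff is evaluated only now, after every car was switched to "Off"
while (carsOff.hasNext()) car.setState(carsOff.next(), 'On');

console.log(car.docs.map(d => d.State).join(',')); // On,On,On,On,On
```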
Approach number 2:
Let’s try exhausting the cursors before running the updates:
var CarsOnArr = db.Car.find({ManufacturerID: 'e46dba7', State: "On"}).toArray();
var CarsOffArr = db.Car.find({ManufacturerID: 'e46dba7', State: "Off"}).toArray();

for (i = 0, len = CarsOnArr.length; i < len; i++) {
  db.Car.update(
    {_id: CarsOnArr[i]['_id']},
    {$set:{"State": "Off"}},
    {writeConcern: { wtimeoutMS: 50000 }}
  );
}

for (i = 0, len = CarsOffArr.length; i < len; i++) {
  db.Car.update(
    {_id: CarsOffArr[i]['_id']},
    {$set:{"State": "On"}},
    {writeConcern: { wtimeoutMS: 50000 }}
  );
}

Now the results are correct. However, an important point should be raised: updating documents one by one in a loop takes time, so a read performed while the loop is still running can return inconsistent data. Here is an example, a read I ran while the update was in progress:
mongo_tshoot_4

After the update was finished the expected result was there:
mongo_tshoot_5
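
One way to shorten that window is to send all the updates in a single bulkWrite() call built from one snapshot, instead of issuing one update per loop iteration. This is a sketch, not something tested in the article: buildFlipOps is a hypothetical helper that only builds plain objects, and bulkWrite() is not atomic across documents, so a concurrent reader may still see a mix of old and new values.

```javascript
// Build bulkWrite operations that flip "State" for a snapshot of documents.
// `buildFlipOps` is a hypothetical helper; it only builds plain objects.
// Note: any State other than "On" is written as "On", so the snapshot
// should contain only documents whose State is "On" or "Off".
function buildFlipOps(docs) {
  return docs.map(doc => ({
    updateOne: {
      filter: { _id: doc._id },
      update: { $set: { State: doc.State === 'On' ? 'Off' : 'On' } }
    }
  }));
}

// In the mongo shell, usage would look roughly like:
//   var snapshot = db.Car.find(
//     {ManufacturerID: 'e46dba7', State: {$in: ['On', 'Off']}},
//     {State: 1}
//   ).toArray();
//   db.Car.bulkWrite(buildFlipOps(snapshot));
```

Because the new value of each document is computed from the snapshot, the order in which the operations are applied does not matter.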
