MongoDB Cheatsheet

2017-11-01

Quick Start

Overview

I was using MongoDB for a side project and it was an interesting experience.
Below are some notes in the form of a cheat sheet (writing them down helps me memorize everything).

See the conclusion at the bottom, but the TL;DR is:
I really enjoyed working with Mongo and would use it again, but there are risks from the loosely structured data and from user-defined functions.

Basic info

Mongo to SQL Mapping:
Mongo <-> SQL
collection <-> table
field <-> column
$lookup <-> table joins
$match (before $group) <-> WHERE
$group <-> GROUP BY
$match (after $group) <-> HAVING
$sort <-> ORDER BY
$limit <-> LIMIT
$sum <-> SUM()
$sum: 1 <-> COUNT()
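
To make the mapping concrete, here is a minimal sketch of how one SQL query might translate into an aggregation pipeline. The orders collection and its status and amount fields are made up for illustration; they are not from the side project.

```javascript
// SQL being translated (hypothetical table and columns):
//   SELECT status, COUNT(*) AS n, SUM(amount) AS total
//   FROM orders
//   WHERE amount > 10
//   GROUP BY status
//   ORDER BY total DESC
//   LIMIT 5;
var pipeline = [
  { $match: { amount: { $gt: 10 } } },   // WHERE amount > 10
  { $group: {
      _id: "$status",                    // GROUP BY status
      n: { $sum: 1 },                    // COUNT(*)
      total: { $sum: "$amount" }         // SUM(amount)
  } },
  { $sort: { total: -1 } },              // ORDER BY total DESC
  { $limit: 5 }                          // LIMIT 5
];

// In the mongo shell this would be run as: db.orders.aggregate(pipeline)
```

The order of the stages matters: each stage only sees the output of the one before it.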

Sorting and limiting

// asc sort
db.acollection.find().sort({_id: 1});
// desc sort
db.acollection.find().sort({_id: -1});
// limit results (prevent flood)
db.acollection.find().sort({_id: 1}).limit(50);
// last record added
db.acollection.find().sort({_id: -1}).limit(1);
// first record added
db.acollection.find().sort({_id: 1}).limit(1);
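
Sorting on _id works as an "order added" sort because a default ObjectId starts with a 4-byte creation timestamp (in seconds). Here is a rough sketch of reading that timestamp out of the hex string, assuming the _id values are default ObjectIds. The function name and the sample id are only illustrations.

```javascript
// Returns the creation time encoded in the first 8 hex characters
// (4 bytes, seconds since the Unix epoch) of an ObjectId hex string.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new Error("expected a 24-character hex string");
  }
  var seconds = parseInt(hexId.substring(0, 8), 16);
  return new Date(seconds * 1000);
}

var created = objectIdToDate("507f1f77bcf86cd799439011");
```

In the shell, ObjectId(...).getTimestamp() gives the same information. The resolution is one second and ids can be generated on different clients, so treat the order as approximate.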

Aggregating

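// cutoff date for the $match stage (as an example: 3 days back)
var d = new Date();
d.setDate(d.getDate() - 3);
db.generic_table.aggregate([
    {
        // keep only entries newer than the cutoff
        $match: {datetime_entry: {$gt: d}}
    },
    {
        // one result row per year
        $group: {
            _id: {
                year: {$year: "$datetime_entry"}
            },
            avg_on_param1: {
                $avg: "$param1"
            },
            number_samples: {
                $sum: 1
            }
        }
    }
])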

$gt and $lt are the greater-than and less-than filters. As listed in the table at the top, $match acts as both the “where” and the “having” filter, depending on whether it comes before or after $group.
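
As a sketch of the two roles of $match, the pipeline below filters a date window with $gt and $lt before grouping (the "where") and then filters the grouped rows on number_samples (the "having"). The function name, the window arguments and the minimum-samples threshold are assumptions added for illustration.

```javascript
// Builds a pipeline for: yearly average of param1 inside a date window,
// keeping only years with more than minSamples entries.
function yearlyAveragePipeline(from, to, minSamples) {
  return [
    // "WHERE": runs before $group, on the raw documents
    { $match: { datetime_entry: { $gt: from, $lt: to } } },
    { $group: {
        _id: { year: { $year: "$datetime_entry" } },
        avg_on_param1: { $avg: "$param1" },
        number_samples: { $sum: 1 }
    } },
    // "HAVING": runs after $group, on the grouped rows
    { $match: { number_samples: { $gt: minSamples } } }
  ];
}

var windowPipeline = yearlyAveragePipeline(
  new Date("2017-01-01T00:00:00Z"),
  new Date("2017-11-01T00:00:00Z"),
  10
);
// In the mongo shell: db.generic_table.aggregate(windowPipeline)
```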

Creating a new database and a new collection

// Select the new database for use; it is created on the first write
use new_db
// new collection for tracking when the db was created
db.db_details.insert({"creationdate": "1 Nov 2017"})

User Grants

// grant database admin level access
db.grantRolesToUser("generic_admin", [
    {
        "role": "dbAdmin",
        "db": "new_database"
    }
]);
// grant read write level access
db.grantRolesToUser("generic_user", [
    {
        "role": "readWrite",
        "db": "new_database"
    }
]);
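
grantRolesToUser only works for users that already exist. Below is a hedged sketch of creating the user with its role in one step, wrapped in a function so the database handle is passed in explicitly. The function name and the password argument are placeholders, while db.createUser is the shell helper for creating users.

```javascript
// Creates a user that has readWrite access on a single database.
// `db` is the shell's database handle (or any object with createUser).
function createReadWriteUser(db, username, password, dbName) {
  var userDoc = {
    user: username,
    pwd: password,
    roles: [{ role: "readWrite", db: dbName }]
  };
  db.createUser(userDoc);
  return userDoc;
}

// In the mongo shell:
//   createReadWriteUser(db, "generic_user", "<password>", "new_database")
```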

Creating functions and granting access to a user

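// Create function (template)
function () {
    // add your logic and variables here
    var d = new Date();
    d.setDate(d.getDate() - 3); // specify number of days back
    // add the MongoDB query here, on the same line as `return`;
    // it can use input values passed in as function arguments
    return;
}
// gives the user rights to execute functions
// (executeFunctions is not a built-in role, so it has to be created first, e.g. with db.createRole)
db.grantRolesToUser("generic_user", [ { role: "executeFunctions", db: "admin" } ])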

A really important note is that function names need to be unique across all databases, which you can imagine can cause issues.
Also, the eval method used by Python and Mongo to execute functions looks like it will be removed in the future. The workaround is to save complex queries in the program instead of in a function, which isn't going to be fun.
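
One way to keep such a query in the program instead of in a stored function is a plain helper that builds the filter. This is only a sketch: recentEntriesFilter is a made-up name, and the datetime_entry field is borrowed from the aggregation example earlier in these notes.

```javascript
// Builds the filter for "entries from the last N days".
// `now` is optional and only there to make the helper easy to test.
function recentEntriesFilter(daysBack, now) {
  var reference = now || new Date();
  var msPerDay = 24 * 60 * 60 * 1000;
  var cutoff = new Date(reference.getTime() - daysBack * msPerDay);
  return { datetime_entry: { $gt: cutoff } };
}

var filter = recentEntriesFilter(3, new Date("2017-11-04T00:00:00Z"));
// In the mongo shell: db.generic_table.find(filter)
```

The same idea carries over to pymongo, where the filter would be an ordinary Python dict.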

MongoDB server config file

The config file for mongo is in
/etc/mongodb.conf

The important one to note is the smallfiles setting, which keeps the data and journal files small. Without it the journal will fill up a small machine really quickly, although journaling is useful in production.

# where to log
logpath=/var/log/mongodb/mongodb.log
logappend=true
rest = true
smallfiles=true
bind_ip = 0.0.0.0
port = 27017
# Turn on/off security. Off is currently the default
#noauth = true
auth = true
# Enable the HTTP interface (served on the main port + 1000)
httpinterface=true
#nohttpinterface = true

The other important setting is noauth: if it is set to true, no passwords are required. It's useful for resetting admin passwords, but be sure to keep auth turned on by default.

A bind_ip of 0.0.0.0 makes the server accept requests from the network, not just from localhost.

I like to turn on the httpinterface (and rest) settings, as the HTTP interface shows useful db information. It is served on the main port plus 1000 (28017 with the config above), so be sure to block it from outside traffic.

Conclusion

The good

Mongo is really quick to set up, and performance is smooth and quick (some advanced scaling features are available as well).
It shines in rapid development, as you can easily add an object to a collection without having to redo data models and update schemas.

Its Python library, pymongo, works really well and fits nicely with Python's dictionary datatype. This makes data recall and storage quick and pleasant.

The bad

The structure of the database isn’t rigidly defined like in SQL, so you need to carefully manage which object data is loaded into which collection.
So try not to mix them. When adding an extra data field, I would recommend going back to the old objects and updating them to the same form if possible.
I also recommend adding a “type” field just for safety and sanity. In production you may want to set up some integrity checks.
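
A minimal sketch of such an integrity check, written as plain JavaScript over already-fetched documents. The function name, the sample documents and the field names are hypothetical; only the idea of a "type" field comes from the notes above.

```javascript
// Returns the documents that do not match the expected shape:
// a wrong or missing `type`, or any required field absent.
function findInconsistentDocs(docs, expectedType, requiredFields) {
  return docs.filter(function (doc) {
    if (doc.type !== expectedType) {
      return true;
    }
    return requiredFields.some(function (field) {
      return !(field in doc);
    });
  });
}

var sampleDocs = [
  { _id: 1, type: "reading", datetime_entry: new Date(), param1: 4.2 },
  { _id: 2, type: "reading", param1: 3.9 },              // missing datetime_entry
  { _id: 3, type: "note", datetime_entry: new Date() }   // wrong type
];
var bad = findInconsistentDocs(sampleDocs, "reading", ["datetime_entry", "param1"]);
// bad contains the documents with _id 2 and 3
```

In the shell the input could come from something like db.acollection.find().toArray(), run on a schedule or before a release.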

The second concern is the support for user-defined functions, with the eval method losing support in the future.