Friday, April 7, 2017

Getting started with MongoDB - Basics


Introduction


DML = Data manipulation language
DML keywords = Insert, update, delete, select

No hard and fast Rules
  • MongoDB is a No SQL database. 
  • It's written using C++.
  • It stores JSON data in the BSON format. 
  • Once we query the data, mongoldb will automatically convert the bson -> json when reading and json -> bson when storing. This process is handled by the mongodb driver.
  • MongoDB stores data in BSON but why?
    • Traversable
    • Efficient 
    • Lightweight
  • Advantages 
    • Performance - performance effecting queries such as joins are not needed. 
      • Example :- In relational databases, we use triggers(insert triggers(inserted table), update triggers) which makes a single CRUD operation run twice or more depending on the trigger. This will drastically reduce the database performance when the database gets bigger. suppose that your relational database contains about 5GB of data. Then it will take a long time to process a simple insert query with a trigger. That functionality is not needed in mongo db.
    • Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that's nearly as powerful as SQL.
    • A clear path to horizontal scalability. 
    • No schema migrations. Since MongoDB is schema-free, your code defines your schema.
  • Disadvantages
    • MongoDB is not ACID compliant.( MongoDB is eventually persistence)
      • A - Atomicity
      • C - Consistency
      • I - Integrity - Since mongoDB doesn't assure data integrity,  it's not a good practice to use mongo db for transactions.
      • D - Durability  
A document-based data model. The basic unit of storage is analogous to JSON, Python dictionaries, Ruby hashes, etc. This is a rich data structure capable of holding arrays and other documents. This means you can often represent in a single entity a construct that would require several tables to properly represent in a relational db. This is especially useful if your data is immutable. [Quoted from stack overflow]


Terms

  • Collection - Similar to a table in RDBMS
  • Document - Similar to a tuple in RDBMS
  • Field - Similar to a Column in RDBMS


Setup

  • Install MongoDB - Complete guide is available on mongodb.com
  • Create a data folder in
    • Windows - C:\data\db
    • Linux - /data/db
  • Switch to the terminal and type the following commands
    • sudo mongod
  • open a new tab in the terminal and type
    • mongo #This will start the mongo console. You can execute any command here.

Basic Commands and Queries


Show the available databases 

show dbs

Show the current db

db

Switch to a particular database

use database_name #This will switch to the database only if it's available. Otherwise, mongodb will create a new database.

Drop the database

Switch to the database you want to drop and type the following command
db.dropDatabase();

Let's look into CRUD Operations


Inorder to test the crud operations, We need to have some data in the database. We'll use a json generator for this purpose. Go to http://www.json-generator.com/ and click on Generate button to generate a  random json data.Same as the databases, the collections will be automatically created upon the data insertion.

Insert a single document

db.collection_name_here.insert({});

Example : - 

db.collection_name_here.insert({
"name" : "your name",
"age" : 22
})

Insert multiple documents at the same time

db.collection_name_here.insert([{},{},{},{}])

Example : -

db.collection_name_here.insert([{"name" : "your name", "age" : 22},{"name" : "your name", "age" : 22},{"name" : "your name", "age" : 22},{"name" : "your name", "age" : 22}])

Note the syntax difference between inserting a single document and multiple documents

Finding all Document/Documents


db.collection_name_here.find().pretty()

pretty() command will format the json.

Finding a single Document

db.collection_name_here.findOne()

Delete

db.collection_name_here.remove(Deletion Criteria)

Example : -

db.collection_name_here.remove({"_id":"23343453"})
db.collection_name_here.remove({"gender":"make"})

Update

db.collection_name_here.update({unique document identifier},{updating json object})

Example : -

db.collection_name_here.update({"_id":"3435435e454355"}, {
"name" : "my name",
"age" : 23
})

More on Finding


Let's find a single document by id


db.collection_name_here.find({"_id":"39043890353434343"})


Find based on multiple criteria


db.collection_name_here.find({"age": 20 },{"gender":"male"},{"marks":"27"})

Using $or, $and


db.collection_name_here.find({$or: {"gender":"male"},{"age",24}})

db.collection_name_here.find({"University":"SLIIT"},{$or: {"gender":"male"},{"age", 20}})

db.collection_name_here.find({$and: {"gender":"male"},{"age",24}})

db.collection_name_here.find({"University":"SLIIT"},{$and: {"gender":"male"},{"age","20"}})

Using $ne, $gt , $gte , $lt , $lte


db.collection_name_here.find({$or: {"age": {$gte: 23}},{"isEnrolled": {$ne : true}}})

db.collection_name_here.find({"University":"SLIIT"},{$and: {"gender":"male"},{"age",{$gt : 20 }}})

Displaying only the required fields


Syntax :


db.collection_name_here.find({"Your query"},{"Specify the fields you want to display"})

Example : 


{"field_name": 1} <- Displays particular filed
{"filed_name" : 0} <- Doesn't display the specified field

db.collection_name_here.find({"University":"SLIIT"},{$and: {"gender":"male"},{"age",{$gt : 20 }}}, {"name":1,"_id": 0})
#the above code will only display the name fields


Indexing


A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.
An index is a copy of selected columns of data from a table that can be searched very efficiently that also includes a low-level disk block address or direct link to the complete row of data it was copied from.[wiki : search : Indexing databases]


  • Pros
    • Indexing speeds up the efficiency and performance of finding (Select / Find statements).
  • Cons
    • Indexing decreases the performance of inserts, updates and deletes.
      • What happens - When you index, an index table will be created on the index field and the primary key field. This index table is saved in the hard disk which takes up space. But when you are using a index field, the index table will be taken to the RAM.
      • Why performance ? When ever you are writing data(update, delete and inserting), the database has to automatically update the index table. So one insert operation does 2 inserts. So the performance decreases.

Adding an index


db.collection_name_here.ensureIndex({"age": 1}) 
db.collection_name_here.ensureIndex({"name": 1})


Getting all the indexed fields


db.collection_name_here.getIndexes()

Removing an Index


db.collection_name_here.dropIndex({"age": 1 })


Aggregation and Grouping


In the relational databases we use count, max, min and average keyword. But how do we do it using mongoldb  ?

db.collection_name_here.aggregate(
$group : {
"_id" : "$age",
"total" : {$sum : 1}
}
)

db.collection_name_here.aggregate(
$group : {
"_id" : "$marks",
"Average" : {$avg : 1}
}
)


db.collection_name_here.aggregate(
$group : {
"_id" : "$marks",
"Average" : {$max : 1}
}
)


Where to go from here

  • There are lot of  courses available at mongoDB university. you can follow one of these and get certified from the mongodb itself.
  • If you own a website which has sql, best way to learn mongo is to convert that website into mongo.



4 comments: