Auto incrementing IDs for MongoDB

If you’re familiar with relational databases like MySQL or PostgreSQL, you’re probably also familiar with auto incrementing IDs. You select a primary key for a table and make it auto incrementing. Every row you insert afterwards gets a new ID, automatically incremented from the last one. We don’t have to keep track of what number comes next or ensure the atomic nature of this operation (what happens if two different clients want to insert a new row at the very same time? Do they both get the same ID?). This can be very useful where sequential, numeric IDs are essential. For example, let’s say we’re building a URL shortener. We can base62 encode the ID of a URL to quickly generate a short slug for that long URL.
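To make the URL shortener idea concrete, here is a minimal sketch of base62 encoding a numeric ID. The function names and the alphabet ordering (digits, then lowercase, then uppercase) are assumptions made for illustration; any fixed ordering of 62 symbols would work just as well.

```python
import string

# Assumed alphabet order: 0-9, a-z, A-Z (62 symbols in total).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(number):
    """Turn a non-negative integer ID into a short base62 slug."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base62_decode(slug):
    """Recover the integer ID from a slug produced by base62_encode."""
    number = 0
    for char in slug:
        number = number * 62 + ALPHABET.index(char)
    return number


print(base62_encode(125))   # "21" with this alphabet (2 * 62 + 1)
print(base62_decode("21"))  # 125
```

Because the slug is derived from the ID alone, the shortener only has to store the long URL against its numeric ID.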

Fast forward to MongoDB: the popular NoSQL database doesn’t have any built-in equivalent of auto incrementing IDs. It’s true that you can insert anything unique as the required _id field of a MongoDB document, so you can take matters into your own hands and insert unique IDs yourself. But then you have to ensure the uniqueness and atomicity of the operation.

A very popular workaround is to create a separate MongoDB collection and maintain documents in it, each holding a numeric value, to keep track of our auto incrementing IDs. Now, every time we want to insert a new document that needs a unique ID, we come back to this collection, use the $inc operator to atomically increment this number and then use the incremented number as the unique ID for our new document.

Let me give an example. Say we have a messages collection. Each new message needs a new, sequential ID. We create a new collection named sequences. Each document in this sequences collection will hold the last used ID for one collection. So, for tracking the unique ID in the messages collection, we create a new document in the sequences collection like this:

{
    "_id" : "messages",
    "value" : 0
}

Next, we will write a function that can give us the next sequential ID for a collection by its name. The code is in Python, using the PyMongo library.

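from pymongo import MongoClient, ReturnDocument

db = MongoClient()["mydb"]  # assumes a local MongoDB; "mydb" is a placeholder name

def get_sequence(name):
    # Atomically increment the counter and return its new value
    document = db.sequences.find_one_and_update(
        {"_id": name}, {"$inc": {"value": 1}}, return_document=ReturnDocument.AFTER)
    return document["value"]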

If we need the next auto incrementing ID for the messages collection, we can call it like this:

{"_id": get_sequence("messages")}

Find and Modify – Deprecated

If you have searched on Google, you might have come across many StackOverflow answers as well as individual blog posts which refer to the findAndModify() call (find_and_modify in PyMongo). This was the way to do things. But find_and_modify is deprecated in PyMongo (and removed in PyMongo 4), so please use the newer find_one_and_update function instead.

(How) Does this scale?

We would only call the get_sequence function before inserting a new Mongo document. The function uses the $inc operator, which is atomic on a single document. Mongo guarantees this. So even if hundreds of different clients are trying to increment the value of the same document, the increments will be applied one after another. So each value they get back will be a new, unique ID.

I personally haven’t been able to test this strategy at a larger scale but according to people on StackOverflow and other forums, people have scaled this to thousands and millions of users. So I guess it’s pretty safe.

Understanding JWT (JSON Web Tokens)

At the end of our last post (which was about securing REST APIs), we mentioned JWT. I promised that in the next post we would discuss JWT in more detail and see how we can secure our REST APIs using it. However, when I started drafting the post and writing the code, I realized the underlying concepts of JWT themselves deserve a dedicated blog post. So in this blog post, we will focus solely on JWT and how it works.

What is JWT?

We will ignore the textbook definitions and try to explain the concepts in our own words. Don’t be afraid of the serious looking acronym; the concepts are rather simple to understand. First let’s break down the term – “JSON Web Tokens”. So it has something to do with JSON, the web and of course tokens. Right? Let’s see.

Yes, a JWT is a token that carries a JSON payload in an encoded and signed form. The payload is encoded, then signed using a hashing algorithm along with a secret, and the pieces are joined into a single (slightly long) string that works as a token. So a JWT is basically a string / token generated by processing a JSON payload in a certain way.

So how does JWT help? If you followed our last article, you now know why HTTP basic auth is bad. You have to pass your username and password with every request. That is kind of bad, right? The more often you send your username and password over the internet, the more likely they are to get compromised, no? Instead, on the first login, we can accept the username and password and return a token to the client. The client passes that token with every request. We verify that token to see if it belongs to a logged in user or not. This is the idea behind token based authentication.

Random Tokens vs JWT

How would you generate such a token? You could generate a nice random string and store it in the database against that user. Right? This is how cookie based sessions work too, btw. Now what if your application is scaled across multiple servers and all requests are load balanced? One server will not recognize a token / session generated by another server. Unless of course you also have one central database active all the time, serving all the incoming requests from all the servers. That setup is tricky and difficult, no?

There is another workaround using sticky sessions, where the requests from one particular user are always directed to the same server by the load balancer. This workaround is also not as simple as JWT. Even if all of this works nicely, we still have to make database queries to validate the token / session. What if we want to provide single sign on (users from one service want to access resources on a different service altogether)? How does that work? We will need a central auth server and all services will have to talk to it to verify the user token.

The benefit of JWT is that it’s lightweight but at the same time it’s a self contained JSON payload. You can store the user identity in the JSON, sign it and send the token to the clients. Since it’s signed, we can verify and validate it with just our secret key. No database overhead. No need for sticky sessions. Just share the secret key privately and all your services can verify and trust the data stored inside the JWT. Others can’t tamper with a token or forge a new, valid token for a user without that secret key. Keep in mind that the payload is only encoded, not encrypted, so anyone holding the token can read it; don’t put secrets in there. Single sign on becomes a breeze and much less complicated. Sounds good? Let’s see how JWTs are constructed.

Anatomy of JWT

A JSON Web Token consists of three major parts:

  • Header
  • Payload
  • Signature

These 3 parts are separated by dots (.). So a JWT looks like the following

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ

If you look closely, there are 3 parts here:

  • Header: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
  • Payload: eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9
  • Signature: TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ

Okay, so we get the three parts, but what do they really do? Also those strings look like meaningless characters to me. What are they? How are they generated? Well, they are encoded in certain ways as we will see in the following sections.

Header

The header is a simple key value pair (dictionary / hashmap) data structure. It usually has two keys, typ and alg, short for type and algorithm. The convention is to keep the keys at most 3 characters long, so the generated token does not get too large.

Example:

{
  "alg": "HS256",
  "typ": "JWT"
}

The typ value is JWT since this is JWT we’re using. HS256 (HMAC with SHA-256) is one of the most common signing algorithms used with JWT.

Now we just need to base64 encode this part and we get the header string. (To be precise, JWT uses the URL safe variant of base64, with the trailing = padding removed.) You were wondering why the strings didn’t make sense. That’s because the data is base64 encoded.
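To see that there is no magic involved, here is a small sketch that decodes the header segment from the example token back into JSON and then encodes it again. JWTs use the URL safe base64 alphabet and drop the = padding, so the helpers (b64url_decode and b64url_encode, names chosen for this example) add and strip it by hand.

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url string, restoring the padding JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw):
    """Encode bytes as base64url without the trailing '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


header_segment = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"

header = json.loads(b64url_decode(header_segment))
print(header)  # {'alg': 'HS256', 'typ': 'JWT'}

# Encoding the compact JSON again gives back the very same segment.
compact_json = json.dumps(header, separators=(",", ":"))
print(b64url_encode(compact_json.encode("utf-8")) == header_segment)  # True
```

The compact separators matter: extra whitespace in the JSON would produce a different (but still valid) encoded string.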

Payload

Here comes our favorite part – the JSON payload. In this part, we put the data we want to store in the JWT. As usual, we should keep the keys and the overall structure as small as possible.

{
  "sub": "1234567890",
  "name": "John Doe",
  "admin": true
}

We can add any data we see fit. These fields / keys are called “claims”. There are some reserved claims – keys which can be interpreted in a certain way by the libraries which decode the JWT. For example, if we pass the exp (expiry) claim with a timestamp, the decoding library will check this value and throw an exception if the time has passed (the token has expired). These can be helpful in many cases. You can find the common standard fields on Wikipedia.

As usual, we base64 encode the payload to get the payload string.

Signature

The signature part itself is a hashed string. We concatenate the header and the payload strings (the base64 encoded header and payload) with a dot (.) between them. Then we use the hashing algorithm to hash this string with our secret key, and base64 encode the result just like the other two parts.

In pseudocode:

concatenated_string = base64encode(header) + '.' + base64encode(payload)
signature = hmac_sha256(concatenated_string, 'MY_SUPER_SECRET_KEY')

That would give us the last part of the JWT, the signature.

Glue it all together

As we discussed before, the JWT is the dot separated form of the three components. So the final JWT would be:

jwt = header + "." + payload + "." + signature
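The pseudocode above can be turned into a runnable sketch using only the Python standard library. This is for understanding only, under the assumption that we sign with HS256; the helper names b64url and make_jwt are made up for this example, and in real projects a maintained library should do this job.

```python
import base64
import hashlib
import hmac
import json


def b64url(raw):
    """base64url encode bytes and strip the '=' padding, as JWT does."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_jwt(payload, secret):
    """Build an HS256 signed token: header.payload.signature"""
    header = {"alg": "HS256", "typ": "JWT"}
    header_part = b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    payload_part = b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))

    # The signature covers the two encoded parts joined by a dot.
    signing_input = header_part + "." + payload_part
    digest = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()

    return signing_input + "." + b64url(digest)


token = make_jwt(
    {"sub": "1234567890", "name": "John Doe", "admin": True},
    "MY_SUPER_SECRET_KEY",
)
print(token)
```

The first two parts of the printed token match the header and payload strings shown earlier; the signature differs for every secret key, which is the whole point.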

Using a library

Hey! JSON Web Tokens sounded great but it looks like there’s a lot of work involved! Well, it would seem that way since we tried to understand how a JSON Web Token is actually constructed. In our day to day use cases, we would just use a suitable library for the language / platform of our choice and be done with it.

If you are wondering what library you can use with your language / platform, here’s a comprehensive list of libraries – JSON Web Token Libraries.

Real Life Example with PyJWT

Enough talk, time to see some code. Excited? Let’s go!

We will be using Python with the excellent PyJWT package to encode and decode our JSON Web Tokens in this example. Before we can use the library, we have to install it first. Let’s do that using pip.

pip install pyjwt

Now we can start generating our tokens. Here’s an example code snippet:

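import jwt
import datetime

payload = {
    "uid": 23,
    "name": "masnun",
    "exp": datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(minutes=2)
}

SECRET_KEY = "N0TV3RY53CR3T"

# Name the algorithm explicitly; PyJWT 2.x returns the token as a str
token = jwt.encode(payload=payload, key=SECRET_KEY, algorithm="HS256")

print("Generated Token: {}".format(token))

# PyJWT 2.x requires the list of accepted algorithms when decoding
decoded_payload = jwt.decode(jwt=token, key=SECRET_KEY, algorithms=["HS256"])

print(decoded_payload)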

If we run the code, we will see:

python jwt_test.py
Generated Token: eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1aWQiOjIzLCJuYW1lIjoibWFzbnVuIiwiZXhwIjoxNDk0NDQ5OTQ0fQ.49okXifPSqc7n_n7wZRc9XVVqekTTeBIBBZdiH0nGJQ
{'uid': 23, 'name': 'masnun', 'exp': 1494449944}

So it worked – we encoded a payload and then decoded it back. All we needed to do was call jwt.encode and jwt.decode with our secret key and the payload / token. So simple, no?

Bonus Example – Expiry

In the following example, we will set the expiry to only 2 seconds. Then we will wait 10 seconds (so the token expires by then) and try to decode the token.

import jwt
import datetime
import time

payload = {
    "uid": 23,
    "name": "masnun",
    "exp": datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(seconds=2)
}

SECRET_KEY = "N0TV3RY53CR3T"

# Name the algorithm explicitly; PyJWT 2.x returns the token as a str
token = jwt.encode(payload=payload, key=SECRET_KEY, algorithm="HS256")

print("Generated Token: {}".format(token))

time.sleep(10)  # wait 10 secs so the token expires

# Raises jwt.exceptions.ExpiredSignatureError because exp is in the past
decoded_payload = jwt.decode(jwt=token, key=SECRET_KEY, algorithms=["HS256"])

print(decoded_payload)

What happens after we run it? This happens:

jwt.exceptions.ExpiredSignatureError: Signature has expired

Cool, so we get an error mentioning that the signature has expired by now. This is because we used the standard exp claim and our library knew how to process it. This is how we use the standard claims to ease our job!

Using JWT for REST API Authentication

Now that we’re all convinced of the good sides of JSON Web Tokens, the question comes into mind – how can we use it in our REST APIs?

The idea is simple and straightforward. When the user logs in the first time, we verify their credentials and generate a JSON Web Token with the necessary details. Then we return this token to the user/client. The client will now send the token with every request, as part of the Authorization header.

The server will decode this token and read the user data. It won’t have to access the database or contact another auth server to verify the user details; it’s all inside the decoded payload. And since the token is signed and the secret key is “secret”, we can trust the payload.

But please make sure the secret key is not compromised. And of course use SSL (https) so that a man in the middle can not hijack the token.

What’s next?

JSON Web Token is not only about authentication. You can use it to securely transmit data from one party to another. However, it’s mostly used for authenticating REST APIs. In our next blog post, we shall go through that use case. We will see how we can authenticate our API using JWT.

In the mean time, you can subscribe to the mailing list so you can stay up to date with this blog. If you liked the article and/or learned something new, please don’t forget to share it with your friends.

REST APIs: The concepts and applications

If you are into web programming you might have come across the terms “REST API” or “RESTful” and you might have felt curious as to what they really are. In this post, we shall take our time to go through some of the concepts / ideas behind REST and its applications.

The REST-less Developer

You might have noticed that the technology world is changing fast. Back in the day, we used to love using desktop applications. For example, we would probably check our emails in Outlook Express or Thunderbird then. Soon the web became popular and web applications became extremely popular for the conveniences they offered. We now love Gmail, don’t we? But then we realized webmail is great and all but that’s not good enough. I want my emails on my phone and tablet. I, for example, like to check my emails on my phone – I do that a lot because I can not be in front of a laptop all day. If you think carefully, in a few more years, we would want those emails on our wrist watches.

Now if you were the developer of such a popular email service, how would you serve these different types of devices? For example, the webmail can use HTML fine, mobile phones can browse HTML too, but what about desktop and smart watch apps? Also on phones, people would like native apps more than mobile web apps. So how do we feed them data? Don’t be RESTless, some REST always helps! 😉

The RESTful Web

The good old web was working very well for us, for the time being. But hypertext / HTML is not the native tongue of many devices that we have to accommodate in our services. Also the web is no longer about just “documents”, it is so much more. The REST architecture can shape the modern web in a way that provides uniform access to all our clients (devices). The core idea behind REST is that the client does not need to know anything beforehand; it will connect to a server and the server will provide the client with the available options via an agreed upon medium. For the web the medium is “HTML”.

For example, when your browser connected to this website, it didn’t know anything beforehand. The server served it an HTML page that has links to various posts and a search form to search content. By reading the source code (HTML), the client (browser) now knows what actions are available. Now, if you click a link or enter a search keyword and perform a search, the browser would perform that action. But it has no idea what would happen next. When it performs the action, the server supplies new HTML telling it what it can do next. So the server is supplying information that the client can use to further interact with the server.

Hypertext or Hypermedia

But hey, not every device can understand HTML, no? Yes, you are absolutely right. That is why the server is in no way confined to just HTML. It can provide other responses too; for example, XML and JSON are also valid (and two popular) media of communication. This is why when we describe REST, we usually say “hypermedia” instead of “hypertext”.

The principle that the client does not need to know anything beforehand and the server dynamically generates hypermedia responses through which the client interacts with the server – this principle is aptly named “Hypermedia as the engine of application state” aka “HATEOAS”. That is one big name but if you read and think about it, it makes perfect sense. In this principle, the hypermedia generated by the server works as the “engine” of the application’s state. Cool, eh? HATEOAS is a key driving principle of the RESTful web but there’s more. Are you ready to dive in?

Fitting REST into HTTP and APIs

We now understand that in a REST like architecture, there will be a client and a server. The server will provide dynamically generated hypermedia which the client will act upon. It all makes sense but how do we make our web APIs RESTful?

The idea of communicating over HTTP very often involves verbs and resources. Did you notice how very often the same URL can output different responses depending on which HTTP method (GET or POST) we used? The URL can be considered a resource and the HTTP methods are the verbs. There’s more than just GET and POST. There are PUT, PATCH, DELETE etc.

The purpose/intent of the common http verbs are:

  • GET: The purpose is to literally get the data.
  • POST: This method translates to “create”.
  • PUT / PATCH: We use these methods to update data.
  • DELETE: Come on, do I even need to explain what this one does? 😀

Now while building our APIs, we can map these verbs to our resources. For example, we have User resources. You can access it on http://api.example.com/user. Now when someone makes a GET request, we can send them a list of available users. But when they send new user data via POST, we create a new user. What if they want to view / update / delete a single user instance?

Resources: Collections vs Elements

We can broadly classify the resources into two categories – “collections” and “elements” – and apply the HTTP verbs to them. Now we have two different kinds of resources – the “user collection” as a whole and “individual users”. How do we map the different HTTP verbs to them? Wikipedia has a nice chart for us.

For Collections: (/user)

  • GET – List all users
  • POST – Create a new user
  • PUT – Replace all users with these new users
  • DELETE – Delete all users

For Elements: (/user/123)

  • GET – Retrieve data about the user with ID 123
  • POST – Generally not used and throws errors, but can be used if the resource itself is a nested collection. In that case it creates a new element within that collection.
  • PUT – Replace the user data
  • DELETE – Delete the user
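The two tables above can be sketched as a tiny dispatcher. This is only an illustration with an in-memory dictionary instead of a real web framework or database; the handle function, the status codes it returns and the decision to leave out the "replace all users" PUT on the collection are assumptions made to keep the example short.

```python
import itertools

users = {}                    # in-memory stand-in for a real data store
_ids = itertools.count(1)     # hands out 1, 2, 3, ...


def handle(method, path, body=None):
    """Map an HTTP verb plus a /user or /user/<id> path to an action.

    Returns a (status_code, data) tuple.
    """
    parts = [part for part in path.split("/") if part]
    if not parts or parts[0] != "user" or len(parts) > 2:
        return 404, None

    if len(parts) == 1:  # the collection: /user
        if method == "GET":
            return 200, list(users.values())
        if method == "POST":
            user = dict(body or {}, id=next(_ids))
            users[user["id"]] = user
            return 201, user
        if method == "DELETE":
            users.clear()
            return 204, None
        return 405, None  # verb not supported on the collection

    # a single element: /user/<id>
    if not parts[1].isdigit() or int(parts[1]) not in users:
        return 404, None
    user_id = int(parts[1])
    if method == "GET":
        return 200, users[user_id]
    if method == "PUT":
        users[user_id] = dict(body or {}, id=user_id)
        return 200, users[user_id]
    if method == "DELETE":
        del users[user_id]
        return 204, None
    return 405, None  # e.g. POST on a plain element


print(handle("POST", "/user", {"name": "masnun"}))  # (201, {'name': 'masnun', 'id': 1})
print(handle("GET", "/user/1"))                     # (200, {'name': 'masnun', 'id': 1})
```

The same URL behaves differently depending on the verb, which is the resource and verb idea in a nutshell.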

Is this Official?

Everything makes sense and sounds good. So I guess everyone on the web follows this standard? Well, no. Technically, REST is an architecture or architectural style/pattern. It is not a protocol or a standard itself (although it does use other standards like XML or JSON). The sad fact is that nobody has to follow the principles. This is why we often would come across APIs which would not adhere to these principles and design things their own way. And that is kind of alright given REST is not engraved in a holy stone.

But should we follow the principle? Of course we should, we want to be the good citizens of the RESTful web, don’t we?

How can I consume / create REST APIs?

You do have a curious mind, don’t you? What good is our knowledge of REST if we are not using it ourselves? Of course we shall. This blog post just scratches the surface of it. There is so much more to learn about REST and we shall learn those things in time. I have plans to write detailed posts on consuming and creating REST APIs in the future. You just have to stay in touch! If you haven’t yet, it would be a good idea to subscribe to the mailing list and I will let you know when I write the next piece, deal?

If you have any feedback on this post, please feel free to let me know in the comments. I hope you liked reading it. Also don’t forget to tell your friends about the wonderful world of REST, will you? 🙂

Hello Polyglot Ninja!

If you’re like me, a programming / coding enthusiast, you would probably also start an introductory blog post with a “Hello World!” just like this one:

print("Hello World!")

So who am I? I am nobody significant, at least not yet. But I have big dreams. I learned programming out of passion. I started professionally with PHP, then learned Python and did a good amount of front end and backend Javascript on and off. These days, I mostly introduce myself as a full time Python developer. But deep inside, I am a Polyglot developer.

Who is a Polyglot Developer?

The word “Polyglot” refers to a person who knows and can use multiple languages. From that, we can safely assume that a person who is well versed in multiple programming languages is a Polyglot Developer. We often see people around us say they are a “PHP Developer”, “Python Developer”, “JavaScript Developer” etc. But hey, anyone who has worked in the web industry for long enough probably knows JavaScript anyway. And if they’re using another language (e.g. Ruby) in the backend, they know 2 languages, right? So most full stack developers are polyglots anyway.

There are people who love exploring new languages out of curiosity and passion. Learning a new language often teaches you new ways of thinking. You’re challenged to think in a different way. And as you gradually learn new techniques and arts to overcome these challenges, you become a better developer from inside. Clojure made me use map, reduce, filter and use recursion to iterate over a sequence – these were an enlightenment to me. Soon I realized I had started applying the very same concepts in my Python code. The new ideas / concepts we come across in our newly found languages, we tend to bring them back to the languages we use day to day. And that very often results in better code.

Why should I become a Polyglot developer?

In the early days of my programming career, I had the very same question. Why should I “waste” my time learning Python if I can build all sorts of websites in PHP?

It is with experience and exploration that I learned not every tool is suitable for every task. And programming is not just about building websites – there is so much more. Every programming language, every framework, every tool, every platform has its own use case. Remember – if X had no purpose, it would not have been created in the first place. And if Y has a decent user base / popularity, that means it at least does something better than the other available options.

You can not build a house using just a hammer, or even if you can, you will have to go through a lot of agony, do extra work and the end result might not be good enough. Think. Every tool has its strengths and a wise person uses the right tool for the job. They would use a saw to cut through wood, not the hammer.

If you think about the use cases, you will notice Python is very popular in Data Science and Machine Learning. There’s a very popular ecosystem of libraries, frameworks and lots of resources around Python for machine learning and data science. While Ruby or, say, PHP can be used to implement some of the algorithms, it will be a pain to do so. You will have a much harder time finding suitable, mature libraries. Ruby or PHP will often be slow compared to NumPy or other libraries which are implemented internally in C. So if you’re smart, you would probably choose Python for such tasks. Python is also very popular in Web, System Administration, Desktop GUIs etc.

On the other hand, if you want to do front end of websites, you can not escape Javascript. Of course you can use TypeScript, ClojureScript – but to use them, you still need a certain level of basic knowledge of JS. And not to forget, those languages actually transpile to JS, that is they parse your code and produce JS from them.

If you’re into Big Data, Java/Scala is very prominent in that sector. Go and Rust are getting popular for performance and concurrency. Elixir is making web development fun again with its Phoenix framework. There are a lot of programming languages – and none of them are truly useless. They all bring something to the table at the end of the day. And each one teaches us something new, improves our way of thinking.

But I can’t learn them all!

That is true. You can not learn them all, right now. But over time, with experience, you will be able to learn and get used to a significant number of popular programming languages. Also you probably don’t need to learn all of them either. If you come across a problem that can be best solved by a certain language, go ahead and learn it. Don’t just learn the syntax and the standard library, learn to write idiomatic code in the language. It will take time, give it time, don’t rush. Keep practising, you will get there.

A Word of Caution

You want to become a Polyglot developer – we all do. That is a good thing. But don’t switch to a language because of the hype. Don’t start learning a language just because it’s the hip thing. Take your time, evaluate the language, check out the syntax, see what problems it can solve better, check out the community, the maturity of the language and the ecosystem. Overall, think about whether learning the language would benefit you in any way. You should only learn a language that you can use to solve a problem better. If you know PHP and you need to create a simple dynamic webpage, NodeJS probably won’t help you much.

Another very important thing is to learn one language very well before you start moving to the next one. When we learn our first language, we are not only learning a programming language but we’re also getting to know the concepts of programming for the first time. So take your time, learn all the concepts in depth. Gain some significant experience before exploring other languages.

Which language should I start with?

You will get different suggestions on this one – some will say C, some will recommend Python, some people will advise Java and so on.

If you’re in your early days of programming, learning C has its benefits. You can better understand how things work under the hood. So do try C first and if you think you can pursue it, go ahead and get a good grounding in C and C++.

If you didn’t like C – it looked very difficult and kind of scared you with all the memory management and pointer stuff, do try Python. It is a much nicer language, easier to get started.

When you have learned either of them well, go ahead and learn some new languages. Maybe Java? Golang? Rust? Well, everyone has their own preference. So make your own choice, try those languages and their use cases, pick the one you like for the job you want to use it for. Always remember – “right tool for the right job”.

Where should I learn?

I will try to provide in depth guidelines / resources for different programming languages in the coming days. For now, Google for a good book and learn the syntax and standard library. There are plenty of free resources online. Once you have learned the basics, start solving problems on Hacker Rank or CodeWars. If you’re stuck or need help, ask on StackOverflow or Google for more. A good search engine like Google is a lifetime friend of a developer. So better get friendly with it. Learn some good techniques to find results fast. It will help you a lot in your coming days!

Where to go next?

Keep practising. Keep reading. Follow fellow programmers on Twitter, see what they are up to. Maybe subscribe to some programming related subreddits too?

And of course, don’t forget to subscribe to my mailing list. I don’t spam, you can unsubscribe any time. I shall be sending new post updates, essential guidelines and cool tips and tricks.

Happy learning!