Everything I learned about authentication and security
Updated: Jun 30, 2020
What exactly is a session?
What the heck is Stateful and Stateless authentication?
What is JWT?
Access control and authorization?
I have been learning the auth flow in web apps for quite some time now.
I started with GraphQL. Now I work heavily with REST APIs as part of Metamug.
Metamug specializes in the development and management of RESTful APIs over SQL databases.
Below, I sum up everything I learned along the way.
Confusion 1: Authentication and Authorization!
Authentication and Authorization are often used interchangeably, but they represent fundamentally different functions.
Authentication is identifying who the user is and Authorization is determining what this user has access to.
For example, imagine school management software. Everyone can see basic information about the school, such as where it is located and what its key features are, even if they are not logged in, i.e. without authentication.
Now, students can log in and view their test results, exam schedules, pending work submissions, etc. Here the user is authenticated as a student and authorized to view results and schedules. But this does not authorize them to set exam schedules or publish test results. Only users logged in as teachers are authorized to do these tasks.
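The school example can be sketched in a few lines of Python. This is only an illustration: the role names, action names, and the check_access function are made up for this post, and the status codes follow the usual HTTP convention of 401 for a failed authentication and 403 for a failed authorization.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical permission table for the school example: role -> allowed actions.
PERMISSIONS = {
    "student": {"view_results", "view_schedule"},
    "teacher": {"view_results", "view_schedule", "set_schedule", "publish_results"},
}

# Actions open to everyone, logged in or not.
PUBLIC_ACTIONS = {"view_school_info"}


@dataclass
class User:
    name: str
    role: str


def check_access(user: Optional[User], action: str) -> int:
    """Return the HTTP status a server might send for this request."""
    if action in PUBLIC_ACTIONS:
        return 200  # no authentication needed
    if user is None:
        return 401  # not authenticated: we do not know who this is
    if action not in PERMISSIONS.get(user.role, set()):
        return 403  # authenticated, but not authorized for this action
    return 200


student = User("Asha", "student")
print(check_access(None, "view_school_info"))    # 200
print(check_access(None, "view_results"))        # 401
print(check_access(student, "view_results"))     # 200
print(check_access(student, "publish_results"))  # 403
```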
Confusion 2: What exactly is a session?
In the context of an authentication system, a session is a period of time during which the server and the client can interact and exchange information.
A session is established at a certain point in time and torn down at some later point. In a client-server system that transfers data over HTTP, the session is called an HTTP session. So, within a valid session, the client and server can exchange information through HTTP requests and responses.
The next question that haunts people new to the auth flow is how to implement it. There are 2 types of session implementation: stateful and stateless.
Confusion 3: Stateful and stateless Authentication!
In Stateful sessions, the server maintains the session data in memory.
On the client-side, cookies are used to implement this.
When login is successful, the server stores the session info in memory. The server then sets a cookie on the client. This cookie holds a reference id for the corresponding session in memory.
The next time the client makes a request to the server, this cookie is sent along. The server checks the cookie and gets the user info from the corresponding session stored in memory.
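Here is a minimal sketch of that flow, assuming a single server process and a plain dictionary as the session store. The names SESSIONS, USERS, login, current_user, and logout are made up for illustration, and a real server would send the returned id to the browser in a Set-Cookie header and store password hashes instead of plain passwords.

```python
import hmac
import secrets
from typing import Dict, Optional

# In-memory session store: session id -> session data. Lives in this process's RAM.
SESSIONS: Dict[str, dict] = {}

# Hypothetical user table, plain text only to keep the sketch short.
USERS = {"alice": "correct-horse"}


def login(username: str, password: str) -> Optional[str]:
    """Check credentials; on success create a session and return its id."""
    expected = USERS.get(username)
    if expected is None or not hmac.compare_digest(
        expected.encode("utf-8"), password.encode("utf-8")
    ):
        return None
    session_id = secrets.token_urlsafe(32)  # unguessable reference id for the cookie
    SESSIONS[session_id] = {"user": username}
    return session_id


def current_user(cookie_session_id: Optional[str]) -> Optional[str]:
    """Look up the session referenced by the cookie sent with a request."""
    session = SESSIONS.get(cookie_session_id or "")
    return session["user"] if session else None


def logout(session_id: str) -> None:
    """Deleting the entry ends the session immediately."""
    SESSIONS.pop(session_id, None)


sid = login("alice", "correct-horse")
print(current_user(sid))  # alice
logout(sid)
print(current_user(sid))  # None
```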
The session is kept in memory rather than in a database because a database lookup takes extra time, and that extra time would be added to every request made to a protected resource.
As the server holds all the info of the user session, a session can be revoked at any time.
If you have just a single server and no plan to scale further, this is easy to implement. However, the approach has drawbacks:
Server overhead: The server needs to maintain all the session info. This uses more and more server resources as the number of logged-in users increases.
Cannot scale: As your users increase, you might need to scale your server architecture horizontally. But as the user sessions live in memory, i.e. the RAM, all of a user's requests must be handled by the same server where the session was created. Otherwise, you need complex state synchronization logic between servers.
Volatility: As the session is stored in RAM, every server restart can erase the session data.
Furthermore, HTTP itself is supposed to be a stateless protocol. Stateful APIs effectively layer state on top of it.
In stateless sessions, by contrast, the server keeps no session data. Instead, an access token is used. JWT is a standard for creating access tokens. We will see how this method works in the next section.
Confusion 4: What is JWT?
JSON Web Tokens (JWT) are an industry standard (RFC 7519) for creating access tokens used in stateless authentication. A JWT is a long string consisting of 3 parts separated by dots (.): the Header, the Payload, and the Signature, in that order.
The header contains the algorithm and the token type. The payload contains the user's information, such as the user id from the database and the expiration time. The header and the payload are each base64url-encoded, and the signature is computed over the two encoded parts using a secret key that the server holds. Finally, all three parts are joined with dots to form the JWT. Note that the payload is only encoded, not encrypted, so it should not carry sensitive data. The secret should be kept in a safe place, accessible only to the server, and only to those parts of the server that absolutely need it.
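To make the three parts concrete, here is a small sketch that builds a JWT by hand with Python's standard library. It assumes the HS256 algorithm (HMAC with SHA-256); the helper names b64url and create_jwt, the secret value, and the claims in the payload are placeholders for illustration. In a real project, a maintained JWT library is the safer choice.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, the encoding JWT uses."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def create_jwt(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


SECRET = b"demo-secret-change-me"  # placeholder; load a real secret from safe storage
token = create_jwt({"sub": "42", "exp": int(time.time()) + 900}, SECRET)
header_part, payload_part, signature_part = token.split(".")
print(token)

# The payload is only encoded, not encrypted: anyone holding the token can read it.
padded = payload_part + "=" * (-len(payload_part) % 4)
print(json.loads(base64.urlsafe_b64decode(padded)))
```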
Now let's see how JWT is used in stateless authentication.
When a user tries to log in, the server verifies their credentials. If the credentials are valid, the server creates a new JWT and sends it back to the client in the response.
The client stores this token someplace safe locally. In every request after logging in, the client sends this token in the Authorization header. The server reads this header and verifies the JWT.
Authorization: Bearer <token>
This approach is called bearer authentication, and the token involved is called a bearer token. It is standard practice to use JWTs as bearer tokens, because the server can verify them from the signature alone without looking up any session state.
Learn more about stateful and stateless authentication here.
Confusion 5: Access Control and Authorization?
Let's briefly discuss Authorization first.
Authorization is usually done after the authentication process. Once the API has authenticated the user, authorization checks the user's permissions against the policies and rules maintained by the API. It determines what the user can and cannot access.
Access Control is the set of policies and rules used by the API. In other words, access control is the mechanism that governs authorization.
Types of access control:
Role-Based Access Control (RBAC)
This is the most common access control system in application security.
Here, named roles are created and access permissions are decided for each role. Ideally, each protected resource requires only a single role. A user can have one or more roles. This method is granular and works in most scenarios, unless you have some crazy complex combinations of resources and roles.
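A minimal RBAC sketch, reusing the school example from earlier in this post. The role names, permission strings, user names, and the is_allowed function are all made up for illustration; the point is only that permissions hang off roles, and users get permissions through the roles they hold.

```python
from typing import Dict, Set

# Permissions are attached to roles, never directly to users.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "student": {"results:read", "schedule:read"},
    "teacher": {"results:read", "results:publish", "schedule:read", "schedule:write"},
}

# A user can hold one or more roles.
USER_ROLES: Dict[str, Set[str]] = {
    "asha": {"student"},
    "ravi": {"teacher"},
    "meera": {"student", "teacher"},
}


def is_allowed(user: str, permission: str) -> bool:
    """A user is allowed if any of their roles carries the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("asha", "results:read"))      # True
print(is_allowed("asha", "results:publish"))   # False
print(is_allowed("meera", "results:publish"))  # True
```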
Mandatory Access Control (MAC)
Mandatory access control is a centrally controlled system of access control in which access to some object (a file or other resource) by a subject is constrained. Only administrators can grant or revoke access to a particular resource.
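As a rough sketch of the idea described here, the class below keeps one central policy that only administrators may change. The CentralPolicy class and all the names in it are made up for illustration, and real MAC systems are usually much richer than this.

```python
from typing import Iterable, Set, Tuple


class CentralPolicy:
    """One central table of (subject, resource) grants, editable only by admins."""

    def __init__(self, admins: Iterable[str]) -> None:
        self._admins: Set[str] = set(admins)
        self._grants: Set[Tuple[str, str]] = set()

    def _require_admin(self, actor: str) -> None:
        if actor not in self._admins:
            raise PermissionError(f"{actor} is not an administrator")

    def grant(self, actor: str, subject: str, resource: str) -> None:
        self._require_admin(actor)
        self._grants.add((subject, resource))

    def revoke(self, actor: str, subject: str, resource: str) -> None:
        self._require_admin(actor)
        self._grants.discard((subject, resource))

    def can_access(self, subject: str, resource: str) -> bool:
        return (subject, resource) in self._grants


policy = CentralPolicy(admins={"root"})
policy.grant("root", "asha", "exam-papers")
print(policy.can_access("asha", "exam-papers"))  # True

try:
    # Even a subject who already has access cannot pass it on to someone else.
    policy.grant("asha", "ravi", "exam-papers")
except PermissionError as err:
    print(err)  # asha is not an administrator
```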
Discretionary Access Control (DAC)
With discretionary access control, access to resources or functions is constrained based upon users or named groups of users. Owners of the resource decide who can access it. This is highly granular but can easily become too complex to manage.
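A small sketch of the owner-decides idea, with a made-up Document class: each resource carries its own list of allowed users, and only the owner may change that list.

```python
from typing import Set


class Document:
    """A resource whose owner decides who else may read it."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.allowed: Set[str] = {owner}  # per-resource access list

    def share(self, actor: str, user: str) -> None:
        if actor != self.owner:
            raise PermissionError("only the owner can share this document")
        self.allowed.add(user)

    def unshare(self, actor: str, user: str) -> None:
        if actor != self.owner:
            raise PermissionError("only the owner can unshare this document")
        if user != self.owner:  # the owner always keeps access
            self.allowed.discard(user)

    def can_read(self, user: str) -> bool:
        return user in self.allowed


notes = Document(owner="ravi")
notes.share("ravi", "asha")
print(notes.can_read("asha"))   # True
print(notes.can_read("meera"))  # False
```

With one such list per resource, the number of individual grants grows quickly, which is where the management complexity comes from.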
There are still many confusions and misconceptions regarding authentication in software systems. Security is a very broad topic, and all of these methods and philosophies should be adapted to the actual use case.
I skipped the code implementation part here. Do you want me to show an implementation in Node.js in a new article? Let me know!