
Two Contexts/Webapps Cache Problem

Today, we're going to discuss a simple bug that marked the beginning of my learning journey. It may seem basic now, but for someone from a humble background who had never even touched a computer, it was a significant challenge, and it holds a special place in my heart as the starting point of my growth.

In the early days, we loaded every web application (context) through Tomcat's Common ClassLoader (tomcat/lib). Say we had two contexts named app1 and app2: the JAR files of both applications lived in tomcat/lib, so each application could access the other's classes. From a security perspective, this was not ideal. To address it, we moved each application's JARs under its own WebApp ClassLoader (tomcat/webapps/app1/WEB-INF/lib/ for app1, and likewise for app2), which isolates the applications from each other.

We maintain a HashMap that stores user details against a unique ID retrieved from cookies. If no entry exists for the given ID, we check the database to see if there are any records associated with it. If found, we repopulate the HashMap. When a user logs out, we remove their entry from both the HashMap and the database.
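The lookup and logout flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: `SessionRepository`, `InMemorySessionRepository`, and `SessionCache` are hypothetical names, the in-memory repository only stands in for the real database, and `ConcurrentHashMap` is used in place of a plain `HashMap` on the assumption that several request threads touch the cache at once.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over the database table that holds active sessions.
interface SessionRepository {
    Optional<String> findUser(String sessionId);
    void delete(String sessionId);
}

// In-memory stand-in for the database, used only to make the sketch runnable.
class InMemorySessionRepository implements SessionRepository {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    void save(String sessionId, String user) {
        rows.put(sessionId, user);
    }

    @Override
    public Optional<String> findUser(String sessionId) {
        return Optional.ofNullable(rows.get(sessionId));
    }

    @Override
    public void delete(String sessionId) {
        rows.remove(sessionId);
    }
}

// Cache keyed by the unique ID read from the cookie, with a database fallback.
class SessionCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final SessionRepository repository;

    SessionCache(SessionRepository repository) {
        this.repository = repository;
    }

    // Return the user for this ID: cache first, then database (repopulating the cache).
    Optional<String> lookup(String sessionId) {
        String cached = cache.get(sessionId);
        if (cached != null) {
            return Optional.of(cached);
        }
        Optional<String> fromDb = repository.findUser(sessionId);
        fromDb.ifPresent(user -> cache.put(sessionId, user));
        return fromDb;
    }

    // Logout removes the entry from both the cache and the database.
    void logout(String sessionId) {
        cache.remove(sessionId);
        repository.delete(sessionId);
    }
}
```

Note that `logout` only clears the map owned by this one `SessionCache` instance; that detail matters once more than one instance exists.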

In the common class loader setup, the class holding the HashMap is loaded only once, so app1 and app2 see the same map instance. When a user logs out from app1, their entry is removed from both this common HashMap and the database, which effectively logs them out from app2 as well.

However, after moving to the WebApp ClassLoader model, when a user logs out from app1, we clear their cache entry from app1's HashMap and remove the corresponding database entry. But since app2 maintains its own separate HashMap, the cache still exists there. As a result, the user can continue accessing app2 using the unique ID from the cookie, even though the database entry has been invalidated.
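The stale-cache behaviour can be reproduced in a few lines without Tomcat. In this simplified simulation, two cache objects play the role of the two copies that each WebApp ClassLoader ends up with, and a plain map stands in for the shared database. `PerAppSessionCache` and `StaleCacheDemo` are hypothetical names invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// One instance per web application, mimicking the per-class-loader HashMap.
class PerAppSessionCache {
    private final Map<String, String> localCache = new HashMap<>();
    private final Map<String, String> sharedDb; // stand-in for the shared database

    PerAppSessionCache(Map<String, String> sharedDb) {
        this.sharedDb = sharedDb;
    }

    boolean isLoggedIn(String sessionId) {
        if (localCache.containsKey(sessionId)) {
            return true; // cache hit: the database is never consulted
        }
        String user = sharedDb.get(sessionId);
        if (user == null) {
            return false;
        }
        localCache.put(sessionId, user); // repopulate from the database
        return true;
    }

    void logout(String sessionId) {
        localCache.remove(sessionId); // clears only this application's map
        sharedDb.remove(sessionId);
    }
}

class StaleCacheDemo {
    // Returns {app1 still logged in?, app2 still logged in?} after a logout from app1.
    static boolean[] run() {
        Map<String, String> db = new HashMap<>();
        db.put("cookie-123", "alice");

        PerAppSessionCache app1 = new PerAppSessionCache(db);
        PerAppSessionCache app2 = new PerAppSessionCache(db);

        app1.isLoggedIn("cookie-123"); // warms app1's cache
        app2.isLoggedIn("cookie-123"); // warms app2's cache

        app1.logout("cookie-123");

        return new boolean[] {
            app1.isLoggedIn("cookie-123"), // false: cache and database are both empty
            app2.isLoggedIn("cookie-123")  // true: app2's stale cache entry survives
        };
    }
}
```

The second result is the bug: app2 answers from its own map and never learns that the database row is gone.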

Instead of mitigating a security threat, we had created a severe security vulnerability. Isolating the applications with the WebApp ClassLoader left app2's session cache intact after a logout from app1, so the user could keep using app2 even though the session had already been invalidated in the database.

As a solution, we moved the session data out of the in-JVM HashMap and into Redis. With one centralized session store shared by both applications, a logout from either application removes the session for both, which keeps session management consistent and prevents access after logout.
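The shape of the fix can be sketched as follows. To keep the example self-contained, it does not use a real Redis client: `SharedSessionStore` is a hypothetical interface, and `InMemorySharedStore` is only a stand-in for the Redis-backed implementation. The point is that both applications talk to the same external store instead of keeping private maps.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over the centralized session store (Redis in our case).
interface SharedSessionStore {
    void put(String sessionId, String user);
    String get(String sessionId); // returns null when no session exists
    void remove(String sessionId);
}

// In-memory stand-in so the sketch runs without a Redis server.
class InMemorySharedStore implements SharedSessionStore {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    @Override
    public void put(String sessionId, String user) {
        data.put(sessionId, user);
    }

    @Override
    public String get(String sessionId) {
        return data.get(sessionId);
    }

    @Override
    public void remove(String sessionId) {
        data.remove(sessionId);
    }
}

// Each web application keeps no local session state; it only delegates to the store.
class AppSessionService {
    private final SharedSessionStore store;

    AppSessionService(SharedSessionStore store) {
        this.store = store;
    }

    void login(String sessionId, String user) {
        store.put(sessionId, user);
    }

    boolean isLoggedIn(String sessionId) {
        return store.get(sessionId) != null;
    }

    void logout(String sessionId) {
        store.remove(sessionId); // visible to every application at once
    }
}
```

Because neither application holds a private copy, there is nothing left to go stale: a logout through one `AppSessionService` is immediately visible through the other.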


