Tuesday, September 10, 2024

Debunking Common CQRS Myths

CQRS (Command Query Responsibility Segregation) is an architectural pattern that focuses on separating the handling of commands (write operations) from queries (read operations). While simple in concept, numerous myths and misunderstandings surrounding its use can lead to unnecessary complexity, especially in terms of system architecture and user interface design. Let’s address these common myths, layer by layer, and clarify what CQRS really involves.



Myth 1: CQRS and Event Sourcing Are the Same Thing

One of the biggest misunderstandings is that CQRS always involves event sourcing, or vice versa. While these two concepts are often used together, they are not dependent on each other.

Event sourcing focuses on storing every state change as an event and rebuilding the state by replaying these events. This can simplify building read models over time in event-driven systems. However, you can implement CQRS without using event sourcing at all. Likewise, you can use event sourcing without applying CQRS. The two are distinct architectural patterns that, although complementary, serve different purposes.

Event sourcing is generally more useful when working within bounded contexts and dealing with systems where historical state tracking is important. However, the overhead of managing event sourcing can be significant, and it shouldn’t be applied unless there is a clear need for it.
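
To make the distinction concrete, here is a minimal sketch of event sourcing on its own, with no CQRS split. The event types (Deposited, Withdrawn) and the replay function are hypothetical names chosen for illustration; the only idea taken from the text above is that state is rebuilt by replaying stored events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def replay(events):
    """Rebuild the current balance by applying every stored event in order."""
    balance = 0
    for event in events:
        if isinstance(event, Deposited):
            balance += event.amount
        elif isinstance(event, Withdrawn):
            balance -= event.amount
    return balance


# The event log is the source of truth; no separate read model is involved.
event_log = [Deposited(100), Withdrawn(30), Deposited(5)]
current_balance = replay(event_log)
```

Nothing here separates commands from queries, which is the point: event sourcing is a storage decision, while CQRS is about how reads and writes are handled.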

Myth 2: CQRS Requires an Eventually Consistent Read Store

Another common misconception is that CQRS mandates the use of an eventually consistent read store, where the results of a command (write operation) take some time to reflect in the query (read side). This is not a requirement.

Immediate consistency is entirely possible in a CQRS setup, where the read model is updated as soon as the command succeeds, all within the same transaction. In fact, in many existing systems, transitioning from an immediate to an eventual consistency model can add unnecessary complexity and confuse users who expect instant updates. It’s often easier and more effective to start with immediate consistency and gradually shift to eventual consistency where it is genuinely needed, rather than forcing it upfront.

Transitioning to eventual consistency should be a gradual process, especially when user experience and expectations are at stake. For example, if users expect instant updates after they submit a request, suddenly shifting to eventual consistency could frustrate them unless the underlying business processes also change to accommodate this.
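
As a rough illustration of immediate consistency, the sketch below updates a write table and a read-model table inside one SQLite transaction. The table names (orders, customer_totals) and function names are assumptions made for this example, not part of any particular framework.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.execute(
    "CREATE TABLE customer_totals ("
    "customer TEXT PRIMARY KEY, order_count INTEGER, total_spent INTEGER)"
)


def place_order(customer, total):
    """Command: write the order and update the read model in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO orders (customer, total) VALUES (?, ?)", (customer, total)
        )
        cur = conn.execute(
            "UPDATE customer_totals "
            "SET order_count = order_count + 1, total_spent = total_spent + ? "
            "WHERE customer = ?",
            (total, customer),
        )
        if cur.rowcount == 0:  # first order for this customer
            conn.execute(
                "INSERT INTO customer_totals VALUES (?, 1, ?)", (customer, total)
            )


def get_customer_summary(customer):
    """Query: read only from the read-model table."""
    return conn.execute(
        "SELECT order_count, total_spent FROM customer_totals WHERE customer = ?",
        (customer,),
    ).fetchone()


place_order("alice", 40)
place_order("alice", 60)
summary = get_customer_summary("alice")
```

Because both tables change in the same transaction, a query issued right after the command returns already sees the new totals.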

Myth 3: CQRS Requires a Message Bus, Queues, or Asynchronous Messaging

A lot of people mistakenly believe that implementing CQRS means you need to use message buses or asynchronous messaging systems like NServiceBus. This isn’t the case.

While asynchronous messaging systems can be useful for handling eventual consistency in more complex scenarios, there’s nothing in CQRS that explicitly requires this. You can very well implement CQRS without any form of messaging infrastructure. Whether to use queues or a message bus depends entirely on the consistency requirements and scalability needs of your system.

The takeaway here is to avoid unnecessary complexity at the start. Don’t introduce queues or a bus until you know you need eventual consistency, or have proven that your system benefits from asynchronous processing. Immediate consistency with simpler infrastructure might be sufficient for many use cases.
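
A minimal sketch of CQRS with no messaging infrastructure at all might look like the following. The command and query classes, the handler functions, and the in-memory store are all hypothetical; the handlers are plain function calls in the same process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameProduct:  # a command: expresses intent to change state
    product_id: int
    new_name: str


@dataclass(frozen=True)
class GetProductName:  # a query: asks for data, changes nothing
    product_id: int


# Hypothetical in-memory store standing in for a real database.
_products = {1: "Widget"}


def handle_rename_product(command):
    """Command handler: validates the request and changes state."""
    if not command.new_name.strip():
        raise ValueError("name must not be empty")
    if command.product_id not in _products:
        raise KeyError(command.product_id)
    _products[command.product_id] = command.new_name


def handle_get_product_name(query):
    """Query handler: reads state and never modifies it."""
    return _products[query.product_id]


handle_rename_product(RenameProduct(1, "Gadget"))
name = handle_get_product_name(GetProductName(1))
```

If asynchronous processing later proves necessary, a queue can be placed in front of the command handler without changing the command or query types.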

Myth 4: Commands Are Always Fire and Forget

Another common myth is that commands in CQRS are inherently fire-and-forget, meaning that after a command is issued, there’s no need for feedback to the user. In practice, this is rarely the case.

Most business operations require at least a basic level of confirmation. Users need to know if their request was successfully received and accepted. While the actual fulfillment of the command can happen asynchronously, the acceptance of the request should typically be handled synchronously. This can be as simple as providing an acknowledgment message that the system has registered the request.

In scenarios where fulfillment takes time (e.g., processing payments or large data operations), you’ll likely need to introduce processes like sagas or workflows to handle long-running tasks and provide updates to the user over time. Fire-and-forget is generally too simplistic for real-world business needs, where feedback and request correlation are critical.
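
One way to picture synchronous acceptance with asynchronous fulfillment is sketched below. The names (Acknowledgment, submit_payment, fulfill_next) are illustrative assumptions, and the fulfill_next function simply stands in for whatever background worker, saga, or workflow step a real system would use.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Acknowledgment:
    accepted: bool
    correlation_id: str
    reason: str = ""


pending = {}  # correlation_id -> amount awaiting fulfillment
status = {}   # correlation_id -> "accepted" or "completed"


def submit_payment(amount):
    """Validate and register the request synchronously; fulfill it later."""
    if amount <= 0:
        return Acknowledgment(False, "", "amount must be positive")
    correlation_id = str(uuid.uuid4())
    pending[correlation_id] = amount
    status[correlation_id] = "accepted"
    return Acknowledgment(True, correlation_id)


def fulfill_next():
    """Stand-in for a background worker or saga step."""
    if pending:
        correlation_id, _amount = pending.popitem()
        status[correlation_id] = "completed"
```

The caller gets an immediate answer about acceptance, plus a correlation ID it can use to look up progress while fulfillment continues in the background.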

 

Myth 5: Read Models Must Be Eventually Consistent

Many assume that the read models in CQRS must always be eventually consistent, where the results of write operations don’t immediately reflect in the read view. This assumption is misguided.

Read models only need to be eventually consistent when the business requirements demand it. For many systems, immediate consistency is a perfectly valid approach, especially when users expect real-time feedback. Before deciding on eventual consistency, you should carefully assess whether delayed updates will affect the user experience and how your system can handle failures and delays.

Switching to eventual consistency means introducing a whole new set of challenges, like handling failed updates to the read model, or figuring out how to manage the user experience when data isn’t immediately available. You need to ensure that your system can gracefully handle these scenarios, or else you’ll likely encounter more support issues than before.
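
To give a feel for the extra work eventual consistency brings, here is a small sketch of a read-model updater that retries failed updates and reports the events it could not apply. The outbox queue, the function names, and the retry limit are assumptions for illustration only.

```python
from collections import deque

read_model = {}   # order_id -> status shown to users
outbox = deque()  # events waiting to be applied to the read model


def record_event(order_id, new_status):
    """Called by the write side after a command succeeds."""
    outbox.append((order_id, new_status))


def project_pending(apply, max_attempts=3):
    """Apply queued events in order; return those that failed every attempt."""
    failed = []
    while outbox:
        event = outbox.popleft()
        for _attempt in range(max_attempts):
            try:
                apply(event)
                break  # applied successfully, move on to the next event
            except Exception:
                continue  # try the same event again
        else:
            failed.append(event)  # retries exhausted; needs follow-up
    return failed
```

Even this toy version raises the questions mentioned above: what users see while an event is still queued, and who deals with the events returned in the failed list.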

Myth 6: CQRS Solves Consistency and Concurrency Issues

There’s a false belief that CQRS automatically fixes issues related to data consistency and concurrency. This couldn’t be further from the truth.

CQRS doesn’t eliminate concurrency problems; it simply shifts them. On the command side, if you try to handle all commands in a strictly serialized manner to avoid concurrency issues, you might end up with performance bottlenecks. On the query side, a read model that is fed by events has to deal with potential out-of-order events, duplicate events, or event failures. Denormalizing read models to handle such situations is possible, but it still requires careful design.

CQRS won’t let you escape these challenges, and it doesn’t automatically lead to scalable systems. You still need to address concurrency and consistency in both the command and query sides of the architecture.
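
As an example of the kind of design work that remains on the query side, the sketch below shows a projection that ignores duplicate events and buffers out-of-order ones. It assumes each event carries a gap-free sequence number starting at 1, which is an assumption of this example rather than something CQRS provides.

```python
class BalanceProjection:
    """Read-model updater that tolerates duplicate and out-of-order events."""

    def __init__(self):
        self.balance = 0
        self.next_seq = 1   # sequence number of the next event to apply
        self._buffer = {}   # seq -> delta, for events that arrived early

    def handle(self, seq, delta):
        if seq < self.next_seq or seq in self._buffer:
            return  # duplicate: already applied or already waiting
        self._buffer[seq] = delta
        # Apply every event that is now contiguous with what was applied.
        while self.next_seq in self._buffer:
            self.balance += self._buffer.pop(self.next_seq)
            self.next_seq += 1


projection = BalanceProjection()
for seq, delta in [(2, -30), (1, 100), (2, -30), (3, 5)]:
    projection.handle(seq, delta)
```

Event 2 arrives first and waits in the buffer, its duplicate is dropped, and the final balance matches what in-order delivery would have produced.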

Myth 7: CQRS Is Easy to Implement

Despite its conceptual simplicity, CQRS is far from easy to implement in practice. The separation of concerns between commands and queries may seem straightforward, but many implementations fail because of a lack of understanding of the business domain.

CQRS doesn’t replace the need for a deep understanding of business requirements. It might help in organizing and fulfilling those needs more effectively, but it doesn’t guarantee success. You can still build the wrong system with CQRS if you don’t fully grasp what the business truly needs.

Replacing legacy systems with a CQRS architecture also comes with significant risk. A complete rewrite is always dangerous, and the mere presence of CQRS doesn’t mitigate those risks. You’ll need to think through these transitions carefully, keeping business priorities in focus.

 

Myth 8: CQRS Requires Separate Databases

One myth that needs to be dispelled is the idea that CQRS requires separate databases for handling commands and queries. This is not true.

CQRS does not require the use of separate databases. What it separates is the handling of commands from the handling of queries, usually through separate object models, and these models can reside within the same database. You can split the models based on their responsibilities without having to create two separate databases.

That said, using separate databases can be beneficial for performance or scalability reasons, but it’s entirely optional. The core of CQRS is about separating the responsibilities, not necessarily the physical data storage.
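
A small sketch of separate models in a single database: the write side works against a table, while the read side queries a view shaped for display. The schema (products, product_listing) and function names are invented for this example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Write model: normalized table with the fields commands need.
db.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "price_cents INTEGER NOT NULL, active INTEGER NOT NULL DEFAULT 1)"
)
# Read model: a view in the same database, shaped for listing pages.
db.execute(
    "CREATE VIEW product_listing AS "
    "SELECT id, name, price_cents / 100.0 AS price FROM products WHERE active = 1"
)


def add_product(name, price_cents):
    """Command: validate and insert through the write model."""
    if price_cents < 0:
        raise ValueError("price must not be negative")
    with db:
        cur = db.execute(
            "INSERT INTO products (name, price_cents) VALUES (?, ?)",
            (name, price_cents),
        )
    return cur.lastrowid


def deactivate_product(product_id):
    """Command: soft-delete a product."""
    with db:
        db.execute("UPDATE products SET active = 0 WHERE id = ?", (product_id,))


def list_products():
    """Query: read only from the view."""
    return db.execute(
        "SELECT name, price FROM product_listing ORDER BY name"
    ).fetchall()
```

If the read side later needs its own storage, the view can be replaced by a separate table or database without touching the command functions.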

Myth 9: CQRS Always Requires Separate Models for Reads and Writes

While CQRS enables you to create separate models for reading and writing, it does not always require this approach. In simpler systems or early-stage implementations, you might still use a shared model for both reads and writes, gradually transitioning to separate models if the business demands it.

The power of CQRS lies in its flexibility. It allows you to optimize each side (command and query) independently, but it does not impose rigid rules about how that optimization must happen.

The Takeaway

CQRS is a flexible and powerful architecture for separating the concerns of commands and queries, but it doesn’t come with the rigid requirements that many assume. It doesn’t mandate eventual consistency, separate databases, or event sourcing, and it doesn’t solve concurrency issues by itself. Above all, CQRS should be applied with a clear understanding of the business’s needs, and its complexity should only be introduced as required. Keep your implementation simple and build complexity only where necessary to truly meet the goals of your system.

 

Monday, September 2, 2024

The Basics of RESTful API Security: A Beginner's Guide

Many of the applications in today's networked world rely on RESTful APIs to pass information between software systems, which makes API security a core concern for software developers. This beginner's guide walks through the key concepts of RESTful API security: authentication, authorization, encryption, and data validation.



Understanding RESTful API Security

RESTful APIs are designed to be stateless, so every request a client sends must contain all the information the server needs to carry out the operation. Statelessness is convenient, but because each request carries its own credentials and data, it also brings security risks: unauthorized access, data breaches, and manipulation of sensitive information.

1. Authentication: Verifying User Identity

Authentication involves verifying the identity of a user or system trying to gain access to an API. It is the first line of defence for an API and keeps ill-intentioned users from making requests against it.

    Common Authentication Methods:

  • API Keys: A very simple solution in which the client is issued a key and includes it in a header of each request. On its own this is weak protection: if the request is sent unencrypted, the key can easily be intercepted and reused.

  • OAuth 2.0: Likely the most widely implemented protocol. It lets a third-party application obtain API access on behalf of an end user without the user handing their credentials to that application. OAuth 2.0 is token-based, and tokens can be limited in scope and lifetime, which makes it more secure and flexible than sending credentials with every request.

  • Basic Auth: The username and password are Base64-encoded and sent with every request to the API. Base64 is an encoding, not encryption, and is trivially decoded if intercepted, so Basic Auth should only be used over HTTPS.
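
The weakness of Basic Auth without HTTPS is easy to demonstrate. The sketch below builds an Authorization header value and then decodes it again using nothing but the standard library; the function names and the sample credentials are made up for illustration.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Authorization header for Basic Auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    """Recover the credentials, as anyone who intercepts the header could."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("alice", "s3cret")
recovered = decode_basic_auth(header)
```

No key or secret is needed to reverse the encoding, which is why the transport itself has to be encrypted.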

2. Authorization: Access Control

Authorization happens after authentication and decides what the authenticated user may do. In other words, once authentication confirms who the user is, authorization checks whether that user is permitted to carry out a certain action or access particular data.

    Implementing Authorization:

  • Role-Based Access Control: Users are assigned roles, and each role carries a specific set of permissions. A very common example is an admin who can access all the API endpoints, while a regular user is limited to a smaller set of actions.

  • OAuth 2.0 Scopes: The actions allowed with an access token are defined by its scopes. A scope restricts the set of actions that can be performed using the token. Example: a token that grants read-only access to user data.
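
The two approaches above can be combined. The following sketch allows an action only when the user's role grants the permission and the token's scopes include it. The role names, permission strings, and the is_allowed function are hypothetical and not taken from any specific framework.

```python
# Hypothetical mapping of roles to the permissions they grant.
ROLE_PERMISSIONS = {
    "admin": {"users:read", "users:write", "reports:read"},
    "user": {"users:read"},
}


def is_allowed(role, token_scopes, required_permission):
    """Allow only if the role grants the permission AND the token carries it."""
    role_grants = ROLE_PERMISSIONS.get(role, set())  # unknown roles get nothing
    return required_permission in role_grants and required_permission in token_scopes
```

With this check, an admin holding a read-only token still cannot write, and a regular user cannot gain write access by presenting a token with extra scopes.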

3. Encryption: Data Protection

Encryption is an essential tool for protecting data, both at rest and in transit. It makes data that falls into the wrong hands unreadable without the key.

    Encryption Methods:

  • TLS: Data travelling between the client and server is encrypted, which prevents an attacker who captures it in transit from reading it. Always use HTTPS (HTTP over TLS) when communicating with APIs.

  • End-to-end encryption: Data is encrypted on the client side and decrypted only by its intended recipient, so anything intercepted in transit, including at intermediaries such as proxies, stays unreadable.
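
On the client side, using TLS properly mostly means not switching off the defaults. A minimal sketch with Python's standard ssl module follows; the make_client_context name is an assumption, and requiring TLS 1.2 or newer is a policy choice made for this example.

```python
import ssl


def make_client_context():
    """Create a TLS context that verifies certificates and hostnames."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
```

The resulting context can be passed to standard-library clients, for example as the context argument of urllib.request.urlopen when calling an HTTPS endpoint.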

4. Data Validation: Ensuring Data Integrity

Data validation means checking, on the server, that data received from the client is complete, well-formed, and free of malicious content. It is an important defence against SQL injection, cross-site scripting, and other manipulations of user input.

    Best Practices for Data Validation:

  • Input validation: Ensure data that is received at the server is validated properly. Validate the type, format, and length of input data. 

  • Output Encoding: Encode output data to prevent injection attacks. This is especially important when the data is written back into a web page. When user data goes into a database query, use parameterized queries rather than building the SQL from strings.

  • Schema Validation: Validate the structure of incoming data against a schema, for example with JSON Schema, so that requests conform to the expected shape.
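
The practices above can be sketched with the standard library alone. The example below checks the type, format, and length of a hypothetical signup payload and HTML-escapes a value before it is placed in a page. The field names, limits, and function names are assumptions for illustration; a real API might use a JSON Schema validator instead of hand-written checks.

```python
import html
import re

# Assumed rule for this sketch: 3 to 20 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_signup(payload):
    """Return a list of validation errors; an empty list means acceptable."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    errors = []
    username = payload.get("username")
    if not isinstance(username, str) or not USERNAME_PATTERN.fullmatch(username):
        errors.append("username must be 3-20 letters, digits, or underscores")
    age = payload.get("age")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 < age < 150:
        errors.append("age must be an integer between 1 and 149")
    unexpected = set(payload) - {"username", "age"}
    if unexpected:
        errors.append(f"unexpected fields: {sorted(unexpected)}")
    return errors


def render_greeting(username):
    """Output encoding: escape user data before placing it in HTML."""
    return f"<p>Hello, {html.escape(username)}!</p>"
```

Rejecting unexpected fields is a deliberate choice here: it keeps the accepted shape of a request explicit, much as a schema would.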


Security for RESTful APIs spans several layers: it starts with authentication and authorization, continues with encryption, and ends with data validation. These basic practices let you build strong APIs that protect sensitive data and restrict access to the people meant to use them. As you go deeper into developing and scaling your APIs, consider adding more advanced security features, such as rate limiting, whitelisting IPs, and regular security audits. A proactive approach to security sets a high bar for users' trust in your applications.


Thursday, December 14, 2023

Part 4: Navigating the Microservices Maze: Strategies for Greenfield and Brownfield Projects

The journey from monolithic architectures to microservices is fraught with complexity. However, with a strategic roadmap, organizations can navigate this maze, whether they're embarking on a new project or transforming an existing system. This blog offers an in-depth look at the strategies for transitioning to microservices in greenfield and brownfield scenarios, complete with real-world examples.


 

Before diving into strategies, it's essential to understand the two terrains we're dealing with:

  • Greenfield Projects: These are new projects with no legacy codebase, offering the freedom to build from scratch.

  • Brownfield Projects: These involve existing systems where the goal is to incrementally replace or update the architecture.

 

Greenfield Strategies: Limited Resources vs. Resourced Teams


Limited Resources

For teams with limited resources, starting with a modular monolith can be a wise choice. Each module within this monolith acts as a future microservice. For instance, Amazon started as a monolithic application but over time, it refactored its architecture into microservices to scale effectively.

 

  • Developing bounded contexts: Each module, or bounded context, is designed to handle a specific business capability. Uber, for example, initially developed a monolithic codebase that was later decomposed into hundreds of microservices as the company expanded globally.

  • Applying separation patterns: These are essential for decoupling modules. An example is the Facade pattern, which simplifies the interface presented to other modules or services, much like a simplified, unified front-end for a set of interfaces in a subsystem.

  • Future-proofing: As the project scales, these modules can be extracted into microservices without a complete overhaul.
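
To illustrate how a facade can keep modules decoupled inside a modular monolith, here is a minimal sketch. The inventory module, its internal classes, and the reserve method are all hypothetical; the idea is only that other modules call the facade and never touch the internals.

```python
class _InventoryRepository:
    """Internal detail of the inventory module (hypothetical in-memory stock)."""

    def __init__(self):
        self._stock = {"sku-1": 3}

    def available(self, sku):
        return self._stock.get(sku, 0)

    def remove(self, sku, quantity):
        self._stock[sku] = self.available(sku) - quantity


class _ReservationPolicy:
    """Internal business rule: cap on items per reservation."""

    MAX_PER_ORDER = 5

    def permits(self, quantity):
        return 0 < quantity <= self.MAX_PER_ORDER


class InventoryFacade:
    """The only entry point other modules are meant to use."""

    def __init__(self):
        self._repo = _InventoryRepository()
        self._policy = _ReservationPolicy()

    def reserve(self, sku, quantity):
        """Reserve stock if the policy allows it and enough is available."""
        if not self._policy.permits(quantity):
            return False
        if self._repo.available(sku) < quantity:
            return False
        self._repo.remove(sku, quantity)
        return True
```

If the inventory module is later extracted into its own service, callers can keep using the same reserve method while its body changes to a remote call.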

 

Resourced Teams

Teams with more resources should:

  • Avoid the big-bang approach: Instead of a complete overhaul, start small. Netflix, for example, began its journey by focusing on a single microservice for its movie encoding system before expanding.

  • Grow architecture using event storming: Engage in collaborative workshops to understand domain logic and create a robust microservices ecosystem.

 

Brownfield Strategies: Embracing Incremental Change

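In brownfield scenarios, the Strangler application pattern offers a systematic approach: new services are built around the edges of the existing system and take over its functionality piece by piece until the old system can be retired. The pattern is named after the Strangler Fig, which gradually envelops and replaces its host tree in nature.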

  • Refactor in phases: Identify less complex modules to transition first, such as separating the user authentication service.

  • Resolve dependencies: Ensure new microservices can communicate with the old monolith, similar to how eBay handled its transition.
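
The phased approach can be pictured as a routing layer in front of the monolith. In the sketch below, requests for paths that have already been migrated go to the new service and everything else still reaches the legacy system. The path prefixes and the two call functions are placeholders invented for this example; in practice this logic often lives in an API gateway or reverse proxy.

```python
# Hypothetical list of path prefixes already migrated to microservices.
MIGRATED_PREFIXES = ["/auth", "/profile"]


def call_legacy_monolith(path):
    """Placeholder for forwarding the request to the existing monolith."""
    return f"legacy:{path}"


def call_new_service(path):
    """Placeholder for forwarding the request to a new microservice."""
    return f"new:{path}"


def route(path):
    """Send migrated paths to the new service, everything else to the monolith."""
    for prefix in MIGRATED_PREFIXES:
        # Match whole path segments so "/authors" is not mistaken for "/auth".
        if path == prefix or path.startswith(prefix + "/"):
            return call_new_service(path)
    return call_legacy_monolith(path)
```

Migrating another module then amounts to adding its prefix to the list, and rolling back means removing it again.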

 

Common Microservice Challenges

Regardless of the project type, several challenges must be addressed:

  • Initial expenses: Transitioning to microservices requires investment in new tools and training. Spotify faced significant costs in its early adoption phase but saw long-term benefits in scalability and team autonomy.

  • Cultural shift: Distributed systems require a different approach to collaboration and problem-solving. The team must embrace a DevOps culture, as seen in the transformation of companies like Target.

  • Architecture team dynamics: The architecture team must establish consistent standards across the new distributed landscape, as demonstrated by the Guardian’s move to microservices.

  • Learning curve: There's a significant learning curve, and organizations must invest in training. Zalando is an excellent example of a company that fostered continuous learning during its microservices adoption.

 

Conclusion: The Path Forward

Adopting microservices is not just a technical challenge; it's a strategic one that requires a cultural shift within the organization. It's about building an ecosystem that can adapt, scale, and improve over time. The transition strategies for greenfield and brownfield projects outlined here provide a structured pathway towards such an evolution, fostering agility and resilience in today's competitive landscape.