Thursday, October 3, 2024

The Unwritten Laws of Code Reviews: Lessons Learned the Hard Way

When I first got into the habit of reviewing code, I thought I had it figured out. After all, code reviews are just about making sure things work, right?

Little did I know how wrong I was.

My approach to code reviews wasn't just inefficient, it was damaging: it irritated colleagues, bred resentment, and delayed feature launches, all because I failed to follow a few simple yet powerful unwritten rules of code reviews.



Here are the 4 rules I learned the hard way:

1. Block a PR Only If It Requires Your Approval

We've all been there: you click "Request Changes" because something seems off, or you don't like some implementation detail. It sounds harmless. After all, it's just a request, right?


Wrong.


When you "request changes," that action often halts the entire process. And let's be real: there's a reason it's written in red.


So unless the change you're requesting fixes a bug, a security vulnerability, or something equally imperative, don't block the pull request.


Suggestions are well and good, but over-zealously blocking PRs frustrates your peers and slows them down.

2. Show, Don't Tell

I have no idea how much time I wasted typing away explaining changes with long paragraphs when a code snippet would do.


Instead of writing "You should refactor this section to be more DRY," you can simply write the refactored code and paste it in the comments. It's quicker and easier, and helps the developer understand your point straight away. Code is the common language we all understand, so use it to your advantage.
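
For instance, a review comment could carry a snippet like the one below. This is only a sketch: the pricing functions and discount rates are made up for illustration and don't come from any real codebase.

```python
# Before: the same discount arithmetic is written out twice.
def member_price(price: float) -> float:
    return round(price - price * 0.10, 2)


def student_price(price: float) -> float:
    return round(price - price * 0.20, 2)


# Suggested in the review comment: one helper, two thin callers.
def discounted(price: float, rate: float) -> float:
    """Apply a fractional discount and round to cents."""
    return round(price * (1 - rate), 2)


def member_price_dry(price: float) -> float:
    return discounted(price, 0.10)


def student_price_dry(price: float) -> float:
    return discounted(price, 0.20)
```

A snippet like this lets the author see the proposed shape of the code at a glance and copy it if they agree.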

3. Give Critical Feedback In Person (or Via a Slack Huddle)

There is nothing worse than getting back a five-paragraph essay tearing your code apart. Code review platforms are great for comments, but can often strip out much of the nuance that a conversation provides.


If you need to provide critical or serious feedback, it's best to talk about it in person or through a quick Slack huddle. That way, it's much easier to convey your tone and avoid misunderstanding. Moreover, with a direct yet respectful conversation, you are less likely to bruise an ego. 

4. Review the Big Picture First

It's tempting to dive into the details. After all, nitpicking code is what we love doing! But before you get into the small stuff, take a step back. Look at the high-level changes:

  • Are there any API changes?
  • Is the database schema being modified?
  • Do the changes impact overall architecture?

Focusing on the big-picture stuff first helps identify potential blockers early. As for the small stuff (minor refactors, naming conventions, or unit tests), it can often be cleaned up later.

As a matter of fact, those little details could well change based on larger discussions about the big picture anyway.

The Most Important Rule: Everyone Has a "Nit Limit"

Everyone has a limit to how much nitpicking they are willing to tolerate. We all want to improve our code, but we all reach a point where too much feedback turns into resentment. People won't always be open about their frustration, and they might not even be aware of it themselves, but the more you nitpick, the less they'll care about your feedback.


So, pick your battles. Don't die on a hill for that one extra line of spacing or whether to use a const instead of a let.


When you approach a code review, bring curiosity and empathy. Try to understand why the author chose one approach over another. This matters most of all if you will be the one maintaining the code later. Instead of saying "this is wrong," start by asking "why?" Understanding trade-offs is key in software development.

Conclusion: Give Helpful, Respectful Feedback and Everyone Wins

At the end of the day, code reviews are about so much more than finding mistakes. They're about collaboration, improvement, and making sure we're all building something maintainable. Done right, code reviews can build trust, strengthen team cohesion, and ultimately drive better results for us all.

And don’t forget - always squash your commits.


Tuesday, September 10, 2024

Debunking Common CQRS Myths

CQRS (Command Query Responsibility Segregation) is an architectural pattern that focuses on separating the handling of commands (write operations) from queries (read operations). While simple in concept, numerous myths and misunderstandings surrounding its use can lead to unnecessary complexity, especially in terms of system architecture and user interface design. Let’s address these common myths, layer by layer, and clarify what CQRS really involves.



Myth 1: CQRS and Event Sourcing Are the Same Thing

One of the biggest misunderstandings is that CQRS always involves event sourcing, or vice versa. While these two concepts are often used together, they are not dependent on each other.

Event sourcing focuses on storing every state change as an event and rebuilding the state by replaying these events. This can simplify building read models over time in event-driven systems. However, you can implement CQRS without using event sourcing at all. Likewise, you can use event sourcing without applying CQRS. The two are distinct architectural patterns that, although complementary, serve different purposes.
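
To make "rebuilding the state by replaying events" concrete, here is a minimal sketch. The bank-account domain and the `Deposited` and `Withdrawn` event types are assumptions chosen for illustration, and note that nothing in it involves CQRS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance: int, event) -> int:
    """Fold a single event into the current state."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise ValueError(f"unknown event: {event!r}")


def replay(events) -> int:
    """Rebuild the state from scratch by replaying every stored event in order."""
    balance = 0
    for event in events:
        balance = apply(balance, event)
    return balance


# The event log is the source of truth; the balance is derived from it.
event_log = [Deposited(100), Withdrawn(30), Deposited(5)]
current_balance = replay(event_log)  # 100 - 30 + 5 = 75
```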

Event sourcing is generally more useful when working within bounded contexts and dealing with systems where historical state tracking is important. However, the overhead of managing event sourcing can be significant, and it shouldn’t be applied unless there is a clear need for it.

Myth 2: CQRS Requires an Eventually Consistent Read Store

Another common misconception is that CQRS mandates the use of an eventually consistent read store, where the results of a command (write operation) take some time to reflect in the query (read side). This is not a requirement.

Immediate consistency is entirely possible in a CQRS setup, where the read model is updated as soon as the command succeeds, all within the same transaction. In fact, in many existing systems, transitioning from an immediate to an eventual consistency model can add unnecessary complexity and confuse users who expect instant updates. It’s often easier and more effective to start with immediate consistency and gradually shift to eventual consistency where it is genuinely needed, rather than forcing it upfront.
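
A minimal sketch of that idea, assuming a single SQLite database and hypothetical `orders` (write side) and `customer_totals` (read model) tables: the command updates both inside one transaction, so a query issued right after the command sees the new state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER);
    CREATE TABLE customer_totals (customer TEXT PRIMARY KEY,
                                  order_count INTEGER, spent INTEGER);
""")


def place_order(conn, customer: str, total: int) -> None:
    """Command: write the order and update the read model atomically."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "INSERT INTO orders (customer, total) VALUES (?, ?)",
            (customer, total),
        )
        cur = conn.execute(
            "UPDATE customer_totals SET order_count = order_count + 1, "
            "spent = spent + ? WHERE customer = ?",
            (total, customer),
        )
        if cur.rowcount == 0:  # first order for this customer
            conn.execute(
                "INSERT INTO customer_totals (customer, order_count, spent) "
                "VALUES (?, 1, ?)",
                (customer, total),
            )


def customer_summary(conn, customer: str):
    """Query: read only from the denormalized read model."""
    return conn.execute(
        "SELECT order_count, spent FROM customer_totals WHERE customer = ?",
        (customer,),
    ).fetchone()
```

If the read-model update fails, the order insert is rolled back with it, so the two sides never drift apart.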

Transitioning to eventual consistency should be a gradual process, especially when user experience and expectations are at stake. For example, if users expect instant updates after they submit a request, suddenly shifting to eventual consistency could frustrate them unless the underlying business processes also change to accommodate this.

Myth 3: CQRS Requires a Message Bus, Queues, or Asynchronous Messaging

A lot of people mistakenly believe that implementing CQRS means you need to use message buses or asynchronous messaging systems like NServiceBus. This isn’t the case.

While asynchronous messaging systems can be useful for handling eventual consistency in more complex scenarios, there’s nothing in CQRS that explicitly requires this. You can very well implement CQRS without any form of messaging infrastructure. Whether to use queues or a message bus depends entirely on the consistency requirements and scalability needs of your system.

The takeaway here is to avoid unnecessary complexity at the start. Don’t introduce queues or a bus until you know you need eventual consistency, or have proven that your system benefits from asynchronous processing. Immediate consistency with simpler infrastructure might be sufficient for many use cases.

Myth 4: Commands Are Always Fire and Forget

Another common myth is that commands in CQRS are inherently fire-and-forget, meaning that after a command is issued, there’s no need for feedback to the user. In practice, this is rarely the case.

Most business operations require at least a basic level of confirmation. Users need to know if their request was successfully received and accepted. While the actual fulfillment of the command can happen asynchronously, the acceptance of the request should typically be handled synchronously. This can be as simple as providing an acknowledgment message that the system has registered the request.
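
The split between synchronous acceptance and asynchronous fulfillment might look roughly like this. It is a sketch under assumptions: the `Ack` type, the in-memory `pending` queue, and the `status` dictionary are hypothetical stand-ins for whatever infrastructure a real system would use.

```python
import uuid
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ack:
    accepted: bool
    correlation_id: Optional[str] = None
    reason: str = ""


pending = deque()  # stand-in for whatever performs fulfillment later
status = {}        # correlation_id -> "accepted" or "fulfilled"


def submit_payment(amount: int) -> Ack:
    """Validate and accept synchronously; fulfillment happens later."""
    if amount <= 0:
        return Ack(accepted=False, reason="amount must be positive")
    correlation_id = str(uuid.uuid4())
    pending.append((correlation_id, amount))
    status[correlation_id] = "accepted"
    return Ack(accepted=True, correlation_id=correlation_id)


def fulfill_next() -> None:
    """Process one accepted request (a background worker in practice)."""
    correlation_id, _amount = pending.popleft()
    status[correlation_id] = "fulfilled"
```

The caller gets an immediate answer plus a correlation ID it can use to ask about progress later, which is the opposite of fire-and-forget.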

In scenarios where fulfillment takes time (e.g., processing payments or large data operations), you’ll likely need to introduce processes like sagas or workflows to handle long-running tasks and provide updates to the user over time. Fire-and-forget is generally too simplistic for real-world business needs, where feedback and request correlation are critical.

 

Myth 5: Read Models Must Be Eventually Consistent

Many assume that the read models in CQRS must always be eventually consistent, where the results of write operations don’t immediately reflect in the read view. This assumption is misguided.

Read models only need to be eventually consistent when the business requirements demand it. For many systems, immediate consistency is a perfectly valid approach, especially when users expect real-time feedback. Before deciding on eventual consistency, you should carefully assess whether delayed updates will affect the user experience and how your system can handle failures and delays.

Switching to eventual consistency means introducing a whole new set of challenges, like handling failed updates to the read model, or figuring out how to manage the user experience when data isn’t immediately available. You need to ensure that your system can gracefully handle these scenarios, or else you’ll likely encounter more support issues than before.

Myth 6: CQRS Solves Consistency and Concurrency Issues

There’s a false belief that CQRS automatically fixes issues related to data consistency and concurrency. This couldn’t be further from the truth.

In fact, if you try to handle all commands in a strictly serialized manner to avoid concurrency issues, you might end up with performance bottlenecks. CQRS doesn’t eliminate concurrency problems; it simply shifts them. On the query side, you also have to deal with potential out-of-order events, duplicate events, or event failures. Denormalizing read models to handle such situations is possible, but it still requires careful design.
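
One common way to cope with duplicate and out-of-order events on the query side is to make the projection idempotent and order-aware. The sketch below assumes each event carries a per-aggregate version number starting at 1; the `Projection` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Projection:
    """Read model for one aggregate, tracking the last applied event version."""
    balance: int = 0
    version: int = 0
    buffered: dict = field(default_factory=dict)  # version -> delta, held until its turn


def project(view: Projection, version: int, delta: int) -> None:
    """Apply each event exactly once and in version order."""
    if version <= view.version or version in view.buffered:
        return  # duplicate delivery: already applied or already waiting
    view.buffered[version] = delta
    # Drain every buffered event that is now next in sequence.
    while view.version + 1 in view.buffered:
        view.balance += view.buffered.pop(view.version + 1)
        view.version += 1
```

Even this small sketch leaves open questions, such as how long to wait for a missing event, which is exactly the kind of design work CQRS does not do for you.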

CQRS won’t let you escape these challenges, and it doesn’t automatically lead to scalable systems. You still need to address concurrency and consistency in both the command and query sides of the architecture.

Myth 7: CQRS Is Easy to Implement

Despite its conceptual simplicity, CQRS is far from easy to implement in practice. The separation of concerns between commands and queries may seem straightforward, but many implementations fail because of a lack of understanding of the business domain.

CQRS doesn’t replace the need for a deep understanding of business requirements. It might help in organizing and fulfilling those needs more effectively, but it doesn’t guarantee success. You can still build the wrong system with CQRS if you don’t fully grasp what the business truly needs.

Replacing legacy systems with a CQRS architecture also comes with significant risk. A complete rewrite is always dangerous, and the mere presence of CQRS doesn’t mitigate those risks. You’ll need to think through these transitions carefully, keeping business priorities in focus.

 

Myth 8: CQRS Requires Separate Databases

One myth that needs to be dispelled is the idea that CQRS requires separate databases for handling commands and queries. This is not true.

CQRS does not require the use of separate databases. What it separates is the models used for handling commands and queries, and these models can reside within the same database. You can split the models based on their responsibilities without having to create two separate databases.
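
As a rough sketch of two models over one database, the example below uses a single SQLite database with a `products` table for the command side and a `product_listing` view for the query side. All table, view, and class names are assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           stock INTEGER NOT NULL);
    CREATE VIEW product_listing AS
        SELECT id, name, stock > 0 AS in_stock FROM products;
""")


class InventoryCommands:
    """Write model: enforces business rules and changes state."""

    def __init__(self, conn):
        self.conn = conn

    def add_product(self, name: str, stock: int) -> int:
        if stock < 0:
            raise ValueError("stock cannot be negative")
        with self.conn:
            cur = self.conn.execute(
                "INSERT INTO products (name, stock) VALUES (?, ?)", (name, stock)
            )
        return cur.lastrowid

    def remove_stock(self, product_id: int, quantity: int) -> None:
        with self.conn:
            row = self.conn.execute(
                "SELECT stock FROM products WHERE id = ?", (product_id,)
            ).fetchone()
            if row is None or row[0] < quantity:
                raise ValueError("insufficient stock")
            self.conn.execute(
                "UPDATE products SET stock = stock - ? WHERE id = ?",
                (quantity, product_id),
            )


@dataclass(frozen=True)
class ProductListing:
    id: int
    name: str
    in_stock: bool


class InventoryQueries:
    """Read model: returns display-shaped data and never changes state."""

    def __init__(self, conn):
        self.conn = conn

    def list_products(self):
        rows = self.conn.execute(
            "SELECT id, name, in_stock FROM product_listing ORDER BY id"
        )
        return [ProductListing(r[0], r[1], bool(r[2])) for r in rows]
```

Both classes talk to the same database; only the responsibilities are split.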

That said, using separate databases can be beneficial for performance or scalability reasons, but it’s entirely optional. The core of CQRS is about separating the responsibilities, not necessarily the physical data storage.

Myth 9: CQRS Always Requires Separate Models for Reads and Writes

While CQRS enables you to create separate models for reading and writing, it does not always require this approach. In simpler systems or early-stage implementations, you might still use a shared model for both reads and writes, gradually transitioning to separate models if the business demands it.

The power of CQRS lies in its flexibility. It allows you to optimize each side (command and query) independently, but it does not impose rigid rules about how that optimization must happen.

The Takeaway

CQRS is a flexible and powerful architecture for separating the concerns of commands and queries, but it doesn’t come with the rigid requirements that many assume. It doesn’t mandate eventual consistency, separate databases, or event sourcing, and it doesn’t solve concurrency issues by itself. Above all, CQRS should be applied with a clear understanding of the business’s needs, and its complexity should only be introduced as required. Keep your implementation simple and build complexity only where necessary to truly meet the goals of your system.

 

Monday, September 2, 2024

The Basics of RESTful API Security: A Beginner's Guide

Many applications in today's networked world rely on RESTful APIs to pass information between software systems. As a software developer, you need to treat the security of those APIs as a core concern. This beginner's guide walks through the key concepts of RESTful API security: authentication, authorization, encryption, and data validation.



Understanding RESTful API Security

RESTful APIs are designed to be stateless, so every request a client sends to the server must contain all the information needed to carry out the operation. This design is convenient, but it also carries security risks: unauthorized access, data breaches, and manipulation of sensitive information.

1. Authentication: Verifying Identity

Authentication means verifying the identity of a user or system trying to access an API. It is the first line of defence for an API, keeping ill-intentioned users from making requests against it.

    Common Authentication Methods:

  • API Keys: A very simple solution in which the client is issued a key and includes it in a header of each request. On its own this is weak protection: if the request is sent unencrypted, the key can easily be intercepted and reused.

  • OAuth 2.0: Likely the most widely implemented protocol. It lets a third-party application obtain API access on behalf of an end user without the user handing their credentials to that application. OAuth 2.0 is token-based, which makes it both more secure and more flexible than sending credentials with every request.

  • Basic Auth: The username and password are Base64-encoded and sent with every request to the API. Base64 is an encoding, not encryption, and is trivially decoded if intercepted, so Basic Auth should only be used over HTTPS.
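
To see why Basic Auth offers no secrecy by itself, here is a small sketch using only the standard library. The username and password are made-up sample values.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for Basic Auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header: str):
    """Recover the credentials, as anyone who intercepts the header could."""
    scheme, _, token = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic auth header")
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("alice", "s3cret")
recovered = decode_basic_auth(header)  # no key or secret was needed to reverse it
```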

2. Authorization: Access Control

Authorization happens after authentication and decides what the authenticated user may do. Put differently, once authentication confirms who the user is, authorization checks whether they are allowed to carry out a certain action or access particular data.

    Implementing Authorization:

  • Role-Based Access Control: Users are assigned roles, and each role carries specific permissions. A common example is an admin who can access all API endpoints, while a regular user is limited to a smaller set of actions.

  • OAuth 2.0 Scopes: The actions an access token allows are defined by its scopes. A scope restricts the set of actions that can be performed with the token. Example: a token that grants read-only access to user data.
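
The two checks above can be sketched in a few lines. The role names, the permission strings such as `users:read`, and the rule that both checks must pass are illustrative assumptions, not a standard.

```python
# Hypothetical mapping of roles to the permissions they carry.
ROLE_PERMISSIONS = {
    "admin": {"users:read", "users:write", "reports:read"},
    "user": {"users:read"},
}


def role_allows(role: str, permission: str) -> bool:
    """Role-based check: does this role include the permission?"""
    return permission in ROLE_PERMISSIONS.get(role, set())


def token_allows(token_scopes: set, required_scope: str) -> bool:
    """Scope check: was the access token granted this scope?"""
    return required_scope in token_scopes


def authorize(role: str, token_scopes: set, permission: str) -> bool:
    """Allow the action only if both the role and the token permit it."""
    return role_allows(role, permission) and token_allows(token_scopes, permission)
```

With this sketch, an admin holding a read-only token still cannot write: `authorize("admin", {"users:read"}, "users:write")` is false because the token's scopes constrain what the role would otherwise allow.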

3. Encryption: Data Protection

Encryption is an essential tool for protecting data, both at rest and in transit. It makes data that falls into the wrong hands unreadable.

    Encryption Methods:

  • TLS: Data travelling between the client and server is encrypted, which prevents an attacker who intercepts it in transit from reading it. Always use HTTPS (HTTP over TLS) when communicating with APIs.

  • End-to-end encryption: Data is encrypted on the client side and decrypted only on the server side, so it stays unreadable to anyone who intercepts it in transit.
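
On the client side, "always use HTTPS" can be enforced in code. The sketch below uses Python's standard `ssl` module; the helper names and the choice of TLS 1.2 as the minimum version are assumptions for illustration.

```python
import ssl
from urllib.parse import urlsplit


def make_tls_context() -> ssl.SSLContext:
    """Create a client context that verifies certificates and hostnames."""
    ctx = ssl.create_default_context()  # verification is on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return ctx


def require_https(url: str) -> str:
    """Reject any API URL that would send data in plain text."""
    if urlsplit(url).scheme != "https":
        raise ValueError(f"refusing non-HTTPS URL: {url}")
    return url
```

A context like this can be passed to `urllib.request.urlopen(url, context=ctx)` after the URL has gone through `require_https`.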

4. Data Validation: Ensuring Data Integrity

Data validation means checking, on the server, that the data received from the client is complete, correctly formed, and free of malicious content. It is an important defence against SQL injection, cross-site scripting, and other manipulation through user input.

    Best Practices for Data Validation:

  • Input Validation: Validate all data received at the server. Check the type, format, and length of input data.

  • Output Encoding: Encode output data to prevent injection attacks. This is especially important when the output is rendered back into a web page or placed in a database query.

  • Schema Validation: Validate the structure of incoming data against a schema, for example with JSON Schema, to ensure it conforms to what the API expects.
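
Here is a minimal sketch of input validation and output encoding using only the standard library. The signup payload, its `username` and `age` fields, and the specific limits are hypothetical; a real API would more likely describe them in a schema.

```python
import html
import re

# Assumed rule: 3 to 20 letters, digits, or underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_signup(payload) -> list:
    """Return a list of validation errors; an empty list means the input is valid."""
    if not isinstance(payload, dict):
        return ["payload must be an object"]
    errors = []
    username = payload.get("username")
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        errors.append("username must be 3-20 letters, digits, or underscores")
    age = payload.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 < age < 150:
        errors.append("age must be an integer between 1 and 149")
    return errors


def render_greeting(username: str) -> str:
    """Encode user-supplied text before placing it into HTML output."""
    return f"<p>Hello, {html.escape(username)}!</p>"
```

Checking type, format, and length up front means a value such as `a'; DROP TABLE users;--` is rejected before it reaches any query, and `html.escape` keeps markup in user-supplied text from being interpreted by the browser.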


Securing a RESTful API touches many layers: it starts with authentication and authorization, continues with encryption, and ends with data validation. These basic practices let you build robust APIs that protect sensitive data and restrict access to the people meant to use them. As you develop and scale your APIs further, consider adding more advanced measures such as rate limiting, IP whitelisting, and regular security audits. A proactive approach to security helps you earn and keep users' trust in your applications.