The API designer's toolbox of today contains just about anything you'd need to efficiently write correct and useful specifications, including tools like Swagger UI, Postman Collections and AsyncAPI Studio. But there is one curious hole in this toolbox. Current static analyzers, the tools that automatically check your API specification while or after you write it, can only help you detect the most basic issues, like broken links and missing fields. Why? Because current API tools do not capture enough context to help you more.
What context are we talking about? How do we give it to our static analyzers? Exactly how can it help you become more productive? Let's dive down the rabbit hole and learn why, in static analysis, context is king.
We Want Tight Feedback Loops
But let's hold our horses for a minute. What are we really trying to improve by making our static analyzers detect more errors? The answer is the design feedback loop. The loop can be illustrated like this:
In an ideal case, the feedback loop is not a loop, but just a line of three phases:
- You make your design.
- You implement your design.
- You integrate your design into a larger system.
Reality is almost never this tidy, however. The arrows looping back represent the detection of issues that can only be fixed by going back to an earlier phase. You detect these issues either while (A) exploring the environment your design will be part of, (B) producing your design, or (C) testing that your design works and verifying it against its requirements.
Any designer who cares about productivity will try to loop back as early as possible. Why? Because the later an issue is detected, the more it costs to correct. This is quite intuitive. It's a bit annoying to move a pillar when it's a couple of lines on paper, but a herculean feat of engineering to move it once it has been built and thirty floors are resting on top of it. The same phenomenon can, of course, be observed in API design. It requires less effort to change an API specification than it does to change an implementation of that specification. Changing it becomes even more of a problem if there are hundreds of systems depending on it.
How do you loop back earlier? By leveraging appropriate tools and processes. We are, of course, mostly interested in tools for static analysis at the moment, but a more holistic take on productivity would include more kinds of tools, as well as the human organization producing the designs. Version management, file formats, editors, reviewing, testing, and more should be considered.
But, let's stick to static analysis. How do you know that a static analysis tool is helping you loop back faster? It helps you detect issues faster than you, or your design process, otherwise would have been able to!
The More Context the Better
Alright, so I've convinced you that static analysis could be a good idea. But what kind of issues can it really help you detect? Let's look at the use case static analysis is traditionally associated with: detecting issues in computer code. In the example below, we define the variables x and y, add y to w, and pass the result to the function send:
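```javascript
// Illustrative snippet; the values are arbitrary.
const x = 3;
const y = Math.pow(x, 2); // define y as x squared
send(w + y);              // add y to w and pass the result to send
```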
Now, let's pretend we are a static analyzer. What issues can we detect? It seems like the variable w is never defined. It could be implicitly defined elsewhere, however. We can't see any definitions for the functions Math.pow or send, so we can't tell if they exist at all, or if they are called with the right number and kinds of arguments. Are we able to detect any issues? The answer is that without more knowledge about what the code means, we can't! If we assume that the above is a complete JavaScript source file, however, we can know that neither w nor send is defined, which are two issues we can detect, and that Math.pow exists and is called with appropriate arguments.
Static analysis is only possible when the appropriate context, which we define as related and relevant facts, is known. The more context is available to the analyzer, the more kinds of issues it can be made to detect.
What Contexts Are We Missing?
Alright! Let's get to our API specification! Perhaps we are writing it in some modern Interface Definition Language (IDL), such as OpenAPI, Protocol Buffers or AsyncAPI. As we type the specification into our favorite editor, which is performing static analysis, some of our mistakes are highlighted with red squiggly lines.
What kinds of issues can the analyzer detect for us? Today, the answer will typically be: (1) violations of the specification language (IDL) itself and (2) internal specification inconsistencies. Did you, for example, forget a parenthesis, fail to specify an operation name, or refer to a message definition that doesn't exist? The red squiggly lines will be there to tell you! Great!
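To make (2) concrete, here is a small, made-up OpenAPI fragment with exactly that kind of internal inconsistency: the response references a schema that is never defined. The path and schema names are invented for illustration; any decent specification linter will flag the dangling reference.

```yaml
# Illustrative OpenAPI fragment; the path and schema names are made up.
paths:
  /orders:
    get:
      responses:
        '200':
          description: A list of orders.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/OrderList'  # refers to a schema...
components:
  schemas:
    Order:                                              # ...that is never defined
      type: object
```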
But what if an API operation is never used? Or if you failed to follow the message naming convention of your project? Or if you made a change that breaks compatibility with an API version you published earlier, but didn't update the version number appropriately? Unless you are reading this article a few years from now, I bet no red squiggly lines will appear!
As you might have guessed already, the reason these issues go undetected is that each example requires your static analyzer to know about another context. To know which API operations are used, your analyzer needs to know which systems could be calling them. To know what the naming convention is for messages, the convention must be available in a machine-readable format. To detect breaking changes, the analyzer must know which versions have been published in the past. Context is king!
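To illustrate the second example, a project's message naming convention could be captured in a small machine-readable file, perhaps along these lines. The format and field names below are purely hypothetical; the point is that once the convention exists as data rather than prose, an analyzer can check your specification against it.

```yaml
# Hypothetical convention file; the format and field names are invented.
naming-conventions:
  messages:
    case: PascalCase
    required-suffixes: [Request, Response, Event]
  operations:
    case: camelCase
```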
API Specifications Are Not Enough!
The root of the problem of poor static analysis is simply this: you need to specify more than just your APIs in a machine-readable format, and your static analyzer must be able to interpret those additional specifications. How about a format for specifying what systems will provide and consume what APIs? Such a specification could look like this would-be YAML file:
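```yaml
# Would-be system specification; the format and field names are invented
# purely to illustrate the idea.
system: order-service
provides:
  - api: order-management
    version: 2.1.0
consumes:
  - api: customer-registry
    version: ">=1.4.0"
  - api: payment-gateway
    version: 1.0.3
```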
While the above example is too simplistic for real-world use, it illustrates the point. If you have one of these for each of your systems, you can suddenly detect if there are APIs no system will ever consume. You can also use it in production to determine if all required systems are present for your application to operate.
What else could be specified in this manner? Every structure of relevance, from the interface level up to the requirements level. Sound familiar? A lot of companies, of course, already do this! Some by drawing simple diagrams in tools like Microsoft Visio or Mermaid JS, others by constructing graphs that can be verified automatically in programs like Enterprise Architect or Archi. The problem is that the output of that work is rarely useful as input for a static analysis tool at the next phase of the design feedback loop.
And Now What?
So, static analysis could be great! Fantastic! But it still isn't! What do we do about it? How about implementing a new and better tool yourself? Or, if you don't want to get your own hands dirty, keep an eye out until somebody else makes the effort! At the Parus Integration Excellence Center, we take a hard look at the tooling and practices surrounding architecting and integrating component-based computer systems. Our goal is to chart the path to a future of better system designs. Walking the path is up to you.
Did you find this post interesting? Do you vehemently disagree? Did we forget to mention your extraordinary tool that the world somehow hasn't discovered just yet? Great! Follow Parus on social media and join the discussion. If you are in the neighborhood, come to one of our events!