API Testing: from Entry Level to PhD
Today we’re going to talk about API testing, and we’re going to do so without making any assumptions about experience level. So we’re really going to start at ground level and work our way up. That means we’re going to take a look at what an API is, specifically in a web context: what they look like, how they’re constructed, and how they work. From there, we’re going to look at some times that we at API Fortress saw them break down, and the consequences of those issues, and then we’re going to follow that up with what we consider some generally solid practices for API testing. And if we have time, we might take a peek at a couple of other things, but that should fill out the time pretty well.
So let’s go ahead and dive right in.
What is an API anyway?
Let’s take a look at what an API is. API stands for Application Programming Interface, and in different contexts this can mean a number of different things. But when we’re talking about APIs in the context of web development specifically, we’re talking about a content delivery method: a mechanism for moving formatted data from one place to another. That can be from a database to a server to a client, or between microservices. There are a lot of different kinds of transactional schemas in the scope of APIs in general, but the most core definition we can come to is that an API is something that exposes business logic to the outside world. By business logic, we mean anything that lives under the hood of an application, for example something that presents us with user data. An API is what allows us to go to that application and get that data without actually having to go under the hood of the application; we can just make a request and receive our formatted data back.
What does an API response look like?
So what does an API response actually look like? In the most frequent case, API responses are JSON objects, which is what we’re looking at here. A slightly less frequent but still very common case is XML, which is what we’re seeing at the bottom. In the first case, that would typically be the response from what we refer to as a RESTful API; in the second case, that’s often the response from what we refer to as a SOAP API. These are just different schemas and protocols, with different approaches to things like statefulness, which we’re going to get into a little down the road.
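As a quick sketch of the two response shapes described here, the snippet below parses a tiny JSON body and an equivalent XML body with the Python standard library. The field names and values are invented for illustration.

```python
import json
import xml.etree.ElementTree as ET

# A typical RESTful response body: a JSON object.
rest_body = '{"id": 42, "name": "Ada", "active": true}'
user = json.loads(rest_body)

# A typical SOAP-style response body: an XML document carrying the same data.
soap_body = """<user>
  <id>42</id>
  <name>Ada</name>
  <active>true</active>
</user>"""
root = ET.fromstring(soap_body)

# Both carry the same information; only the serialization format differs.
assert user["name"] == root.find("name").text == "Ada"
assert user["id"] == 42 and root.find("id").text == "42"
```

Note that the XML version loses the native types: `true` and `42` come back as strings, which is one small example of why JSON is often the friendlier format to work with.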
How are API responses formatted?
Benefits of REST
So we talked about REST a moment ago, so let’s take a look at some of the benefits of REST versus SOAP, because those are the two principal schemas we see in the context of a web API. REST gives us that JSON object we were looking at a moment ago, but it also describes the type of routing and the organizational constructs we can use in our API. The first, and I think biggest, benefit of REST is that browser clients in general are more supportive of REST. It makes up approximately 70% of the current API environment at large, so it is extremely common and well supported by most browsers. REST APIs are also frequently more performant: serializing and deserializing JSON objects is a fairly memory-friendly operation, and it doesn’t typically overwhelm your browser. You can also typically use more types of data formats under the hood of a REST API than are allowed in the scope of a SOAP API, and at a really low level, JSON is typically a friendlier format than XML. That isn’t to say that SOAP is without benefits, though.
Benefits of SOAP
So with SOAP APIs, we typically require less coding in the application layer for things like security and trust, and that’s because a lot of the time SOAP has fixed definitions that are sent as documents along with it. A further consequence of having fixed definitions sent along with the SOAP document is greater transactional reliability. So if we need something like ACID compliance, which is a pretty frequent requirement in data transactions, we would typically want to at least consider using a SOAP schema for our API. And again, because of this data reliability, it’s typically simpler to get SOAP to perform and cooperate through firewalls and proxies without having to make low-level modifications to the protocol itself. If we’re building a REST API, the complication with regard to things like security is that we have to define those things ourselves, whereas with SOAP, the protocol itself often has mechanisms for dealing with security, firewalls, and so on. Finally, SOAP is highly extensible: by passing different namespaces to the XML itself, we are able to pass codified, uniform sets of rules into the API and make our security issues a little less difficult to solve.
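To make the namespace idea concrete, here is a minimal sketch of parsing a SOAP-style envelope, where the `xmlns` declarations are the namespaces being discussed. The envelope structure follows the standard SOAP 1.2 namespace URI, but the `GetUser` operation and its namespace are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A minimal SOAP-style envelope. The xmlns attributes declare namespaces,
# which let uniform, codified rules (security headers, schemas, etc.)
# travel with the document itself.
envelope = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <GetUser xmlns="http://example.com/users">
      <id>42</id>
    </GetUser>
  </soap:Body>
</soap:Envelope>"""

# Namespace prefixes used when navigating the parsed document.
ns = {
    "soap": "http://www.w3.org/2003/05/soap-envelope",
    "u": "http://example.com/users",
}
root = ET.fromstring(envelope)
body = root.find("soap:Body", ns)
user_id = body.find("u:GetUser/u:id", ns).text
assert user_id == "42"
```

The point of the namespaces is that a consumer can verify exactly which vocabulary each element belongs to, rather than guessing from the tag name alone.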
The Technical Bits
So now we’re going to step into the technical bits, get under the hood, and see how an API technically works. Don’t be afraid. It definitely behooves us to take a look at how everything behind the scenes is working.
Q: Hey Jason there’s a question that came in that basically said what is ACID?
Oh, great question. So, ACID compliance. It’s Atomicity, Consistency… oh geez, I’m going to embarrass myself, I won’t remember exactly what all the letters stand for, but basically it’s a mechanism for ensuring data integrity, in databases specifically. We want both data security and transactional security. What that means is, a lot of the time, if I have a transaction that updates multiple fields in my database, I want to make sure that that transaction either completes fully or is rolled back wholesale. That’s part of it. I also want to make sure that each individual transaction is broken down as far as possible, and I want to make sure that my data is consistent. So as a whole, it’s a mechanism for maintaining data integrity in the scope of a database.
And I hope that answers your question. I did a very poor job of it though because I don’t remember what ACID stands for exactly … before the end of this I promise I will look it up and get back to you.
According to Wikipedia it is Atomicity, Consistency, Isolation, Durability.
Yeah, that’s exactly it. But really, at its core, ACID compliance is about data integrity in the scope of… I shouldn’t say a database, really any sort of data storage. ACID compliance is extremely important in that regard, and a SOAP API makes it a little easier to maintain, because a lot of the things happening under the hood aren’t necessarily defined by the person writing the API; the SOAP protocol itself has mechanisms for that.
What does an API look like under the hood?
So let’s take a quick look at what an API looks like under the hood, graphically; we’re going to look at the primary components that constitute this whole transaction. We’ve got our client, which is your computer; we’ve got our server; and we’ve got our database. The first thing that’s going to happen, typically, is the client makes a request of the server. That request is typically formatted in a similar manner to the responses we were talking about: a JSON object or an XML document, something that can be transmitted over HTTP, HTTP being Hypertext Transfer Protocol, which is the line over which most of this communication happens. That request is received by the server and applied to the correct routing, and that routing may, or may not, require data from a database. If it does, the server then issues a query to the database, which is another type of structured request. For example, if my request is to log in, I will be sending my user credentials to the server, and the server is going to query the database for what it knows to be my user credentials. Then, as the database sends data back to the server, the server compares what I passed as my user data against what it knows to be the actual, accurate user data. If it’s correct, it generates a response stating that yes, this person can log in; if it’s incorrect, the server may generate a response saying that this person is passing invalid credentials and cannot log in.
So this is kind of just like the generalized workflow, the back and forth of the API and at a sort of a low level this can be true of almost every API. Now there will certainly be some exceptions out there. But with regards to data flow this is what it’s going to look like.
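The login round trip just described can be sketched as a toy server-side handler. The dict stands in for the database, and all names, credentials, and status shapes are invented; this is only meant to show the compare-against-known-data step, not a real authentication implementation.

```python
# Invented "database" record keyed by email.
FAKE_DB = {"ada@example.com": {"password": "s3cret", "name": "Ada"}}

def login_route(request: dict) -> dict:
    """Compare submitted credentials against the stored record."""
    record = FAKE_DB.get(request.get("email"))
    if record and record["password"] == request.get("password"):
        return {"status": 200, "body": {"message": "logged in"}}
    # Unknown user or wrong password: reject the request.
    return {"status": 401, "body": {"message": "invalid credentials"}}

assert login_route({"email": "ada@example.com", "password": "s3cret"})["status"] == 200
assert login_route({"email": "ada@example.com", "password": "wrong"})["status"] == 401
```

In a real system the password would of course be hashed and the comparison done server-side against the hash, but the data flow is the same: request in, lookup, compare, response out.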
No, no. UNDER the hood.
And now we’re going to go a step deeper under the hood and take a very brief look at a very small piece of code, and talk about what the different pieces of it mean. It’s not super critical to understanding APIs as a concept, but I think it’s a good idea for us to take a look at what’s happening here exactly on a code level.
Jason, we have a question for you. The code example is really good, it’s real simple, and you mentioned a GET method. I don’t know if you’re going to talk about this later, but if not, do expand a little bit on that.
That’s a really terrific question. So when I say GET method: in REST we have a number of methods; some people call them verbs, some people call them methods. The ones currently used most commonly are GET, POST, PUT, PATCH, and DELETE. There are a few others, but they see less frequent support. The most important thing to remember is that while these method names are supposed to map to specific functionality, they are not actually tied to that functionality. The normal behavior of a GET request, like the one we’re looking at here, is to get data, whereas a POST request would be to create a new database entry, a PUT request would be to replace a database entry, a PATCH request would be to change part of a database entry, and a DELETE request would be to remove a database entry. Those are the normal uses of those individual verbs or methods. It should be noted, though, that there’s nothing under the hood of any server architecture that prevents you from, for example, having a GET route that deletes a database entry. So they’re more generally accepted conventions than hard technical definitions, but those are what we refer to as REST methods.
Q: Can you repeat those again?
So GET would be to retrieve data; POST would be to add new data; PUT would be to take an entire database entry that already exists and update the entire thing; PATCH is similar to PUT, but rather than updating the entire entry, you’re just updating a part of it; and DELETE would be removing a database entry.
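The conventional mapping of those five methods can be sketched with a plain dict standing in for the database. This is purely illustrative, one function per verb, to show the difference between PUT (replace the whole entry) and PATCH (update part of it).

```python
db = {}  # invented in-memory stand-in for a database table

def post(rid, body):      # POST: create a new entry
    db[rid] = dict(body)

def get(rid):             # GET: retrieve an entry
    return db.get(rid)

def put(rid, body):       # PUT: replace the whole entry
    db[rid] = dict(body)

def patch(rid, fields):   # PATCH: update only some fields of the entry
    db[rid].update(fields)

def delete(rid):          # DELETE: remove an entry
    db.pop(rid, None)

post(1, {"name": "mug", "price": 5})
patch(1, {"price": 6})                   # only the price changes
assert get(1) == {"name": "mug", "price": 6}
put(1, {"name": "mug"})                  # the whole entry is replaced
assert get(1) == {"name": "mug"}
delete(1)
assert get(1) is None
```

Again, nothing enforces this mapping; it is convention, which is exactly why a test suite has to verify that each route actually does what its verb implies.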
Q: So is PATCH similar to UPDATE? … actually synonymous with updating?
That’s correct. So PATCH and PUT can both be considered an update. One of them updates all of the data in an entry, whereas PATCH would only update a couple of fields.
You sort of mentioned right there that those verbs are really specific for REST APIs. So are there differences in the verbs that you use between like REST and SOAP?
So SOAP APIs are largely constructed differently. The mechanisms are the same, but the RESTful methods are their own thing. The vast majority of my experience is with RESTful APIs; SOAP APIs are something I’m a little more hesitant to comment on with regard to actual definitions of methods. So, sorry, I guess that’s the plainest thing I can say there. I don’t want to give misinformation.
When the Machine Breaks Down
So now we’re going to move into a section of the presentation that we refer to as ‘when the machine breaks down’. This is, as I mentioned before, a few case studies where we saw APIs going sideways for a number of different reasons. These will trace directly into the practices that we’re going to talk about toward the end, but they’re all interesting cases in their own right. The first one is what we call the case of the missing category.
Case 1: The Case of the Missing Category
So: a retail organization with over 50,000 items in inventory. I apologize in advance, but these case studies are all going to be anonymous, because they have to be; these are orgs that our company deals with on a professional level, and while we have permission to talk about these situations, we don’t have permission to use their names. They were very heavily into UI testing. For anyone not familiar, UI testing would be something driven by Selenium, for example, where you’re simulating clicks and comparing screenshots against expected results. They were not seeing any errors during UI testing because their UI was rendering properly, but when they started running API tests, they were seeing over 3,000 errors on every single run of their products endpoint. What was actually happening was fascinating. Under the hood they have a category column, and through what was essentially a data entry error, over 3,000 products were entered with a NULL category. So when their API was queried by category, all of those NULL-category products returned nothing: they had over 3,000 products in the scope of their API that never appeared in category searches, an API error that wound up costing real dollars. The object lesson here is that UI testing is an important practice, and it’s super important to be consistent with it, but UI testing by itself will very infrequently show you what’s actually wrong; it’ll just show you that there is an error. API testing, going a level below that to the data that’s delivered to be rendered on the front end, is more likely to show you precisely what the problem is, rather than just that there’s a problem rendering the UI.
So to reiterate, that’s a really good example of, and I don’t know if you use this example much, but in the testing space we look at the test automation pyramid, and this is sort of the difference. That’s a great example: when you’re testing at the UI you’re looking for the functionality, but what you just described was a data error, right? You’re missing products, and you wouldn’t be able to determine that by checking the functionality from the UI.
Exactly. When they made either an all-products call or an individual-product call, everything was showing. But this specific organization did a ton of business through what they referred to as their categories: an individual selects a category, and that renders a number of products. Because a number of products had no categories, those products literally never rendered to the viewport when those searches were executed. So they were seeing a substantial difference between the revenue generated by products with no categories and the products with accessible categories, and that caused an actual revenue issue, because it was not an insignificant portion of their products.
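A toy reproduction of the missing-category bug, with invented product data: rows whose category is NULL (`None` here) never match any category filter, so they silently vanish from every category search while the UI keeps rendering whatever does come back.

```python
products = [
    {"id": 1, "name": "desk lamp",    "category": "lighting"},
    {"id": 2, "name": "floor lamp",   "category": None},       # data entry error
    {"id": 3, "name": "ceiling lamp", "category": "lighting"},
]

def by_category(category):
    """Simulate the category endpoint: filter products by category."""
    return [p for p in products if p["category"] == category]

# The UI renders whatever comes back, so category pages look fine...
assert len(by_category("lighting")) == 2

# ...but an API-level test that reconciles category results against the
# full inventory exposes the orphaned product immediately.
categories = {p["category"] for p in products if p["category"] is not None}
reachable = sum(len(by_category(c)) for c in categories)
assert reachable < len(products)   # product 2 is unreachable via any category
```

This is the kind of reconciliation check a UI test can never make: it compares what the API serves against what the data says should exist.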
Case 2: The API that Always Mounted
The next case we’re going to look at, and I always enjoy this one, we refer to as ‘the API that always mounted’. This was a particularly unusual situation. It was another large regional organization with an extremely diverse set of products, and every single product in their inventory was returning with mounting instructions. So we had shelves that could be mounted and televisions that could be mounted, but we also had drinking cups that could be mounted and ski pockets that could be mounted. Their current API testing tool was incapable of accounting for different schemas being returned by the same API, that is to say, one API endpoint that, depending on product category, either had or lacked specific key-value pairs: in this case mounting instructions, which should not have been present for most of these products. But upon updating their testing scheme and introducing conditional testing logic that checked for a category and then checked for the presence of a mountable status, we were able to get the API to stop incorrectly reporting mounting instructions.
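The conditional test logic described here can be sketched as follows: instead of demanding one fixed schema from the endpoint, the check first looks at the product’s category and only then decides whether `mounting_instructions` should be present. The category names and field names are invented for illustration.

```python
# Invented set of categories whose products legitimately ship with
# mounting instructions.
MOUNTABLE = {"shelf", "television"}

def validate(product: dict) -> bool:
    """Conditional schema check: the expected fields depend on category."""
    if product["category"] in MOUNTABLE:
        return "mounting_instructions" in product
    # Non-mountable products must NOT carry mounting instructions.
    return "mounting_instructions" not in product

assert validate({"category": "shelf", "mounting_instructions": "use anchors"})
assert validate({"category": "cup"})
# The bug from the case study: a drinking cup with mounting instructions.
assert not validate({"category": "cup", "mounting_instructions": "use anchors"})
```

The design point is that one endpoint can legitimately return several schemas, so the test has to branch on the same field the application branches on, rather than asserting a single shape for every response.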
Case 3: François and the CharSet
One of the very important things we define when we are creating an API is the charset that it’s going to accept, and this is literally the collection of machine- and human-readable characters that constitute an acceptable response from that API: the set of letters and numbers that the API considers valid. An API typically has a defined charset, and it very rarely includes every available character under the sun, and that’s going to come into play in just a moment. The organization in question was using a charset that did not account for what’s known as a cedilla, which is this little guy at the bottom of the ç in François. So any time a user signed up with a name that had a cedilla in it, that cedilla would be replaced with a question mark: any time it appeared and was submitted to the API for processing, it was replaced with a question mark on a write operation. The reason for this is that the character set the API was equipped to understand did not include this character, so upon receiving a character it didn’t recognize, it replaced it with a placeholder, which in this case was a question mark. So in this specific case, François became Fran?ois. Now, and this is tongue in cheek here at the bottom, anything that referenced this user’s name against the database would blow up. But this is actually a legitimate concern and a serious problem, because any time we’re executing an operation, be it an API call for authentication or something like that, what’s typically happening is that the server receives some data from the client or the user, that would be us, and compares it against known data: it queries the database for that individual and does a comparison.
But the problem here is that any time somebody entered the name François, or anything with this character, or any unrecognizable character, and it was compared against what was machine-writable, which in this case was the version with the question mark, those two would never match. So that individual would unfortunately never be able to log into the system.
This is a perfect example of a loss of data integrity and consistency. This would actually very specifically not be ACID-compliant, because we’re not even able to generate an entry in the database that is functionally correct.
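The cedilla bug is easy to reproduce: an ASCII-only charset cannot represent “ç”, so a lossy encode substitutes “?” on the write path, and the stored name never matches what the user types again. This sketch uses Python’s `errors="replace"` encoding mode to stand in for whatever lossy conversion the real system performed.

```python
submitted = "François"

# Lossy write path: the ASCII charset has no "ç", so it becomes "?".
stored = submitted.encode("ascii", errors="replace").decode("ascii")

assert stored == "Fran?ois"
assert stored != submitted       # the login comparison can never succeed

# The fix is a charset that actually covers the data you receive:
assert submitted.encode("utf-8").decode("utf-8") == submitted
```

An API-level test that round-trips representative real-world names (including accented characters) through a write-then-read cycle catches this class of bug immediately.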
Case 4: No you broke it
So what we were looking at here were two relatively siloed teams in the same organization. This is an organization that takes images and does things like putting them on coffee mugs, a pretty well-known org, but they have one team that is specifically responsible for writing the API that processes image data, and another team responsible for the API that processes print orders. What happened here was definitively human error: one of these teams, both being very siloed from each other, changed the schema of the output of their API. As a consequence, the print order processing team was no longer able to consume the API delivered by image processing. This is a fundamental breakdown in communication, and the thing we’re driving at with this specific case is that APIs can be delicate, and we as stewards of these pieces of technology have to be extremely cognizant of the fact that, while the machine is way better than we are at doing math quickly, it’s very bad at adjusting in most cases. If we make a human input change that results in the API output changing dramatically, we have to make sure that the team on the other side of that API transaction is made aware that we made that decision. Otherwise things will break down, and the server or the API itself is not at fault in this specific case. This is 100 percent human error, but it did turn into a pretty substantial problem for this organization.
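One common defense against exactly this failure is a consumer-side contract check: the consuming team pins the response fields it depends on, so a silent schema change by the producing team fails a test instead of breaking production. The field names below are invented for illustration.

```python
# Fields the (hypothetical) print-order team depends on from the
# image-processing API.
EXPECTED_KEYS = {"image_id", "width", "height", "format"}

def check_contract(response: dict) -> bool:
    """Pass if every expected field is present in the response."""
    return EXPECTED_KEYS <= response.keys()

old_response = {"image_id": 7, "width": 800, "height": 600, "format": "png"}
# The producing team renames "format" to "file_type" without telling anyone:
new_response = {"image_id": 7, "width": 800, "height": 600, "file_type": "png"}

assert check_contract(old_response)
assert not check_contract(new_response)   # the break is caught by the test
```

Real contract-testing tools go further (type checks, required vs. optional fields, versioned contracts shared between teams), but even this minimal key check would have turned the silent breakage into a loud test failure.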
And next we’re going to take a look at some practices that we on our team find to work well for testing our actual APIs.
Rule 1: Keep it DRY
So the first rule that we really kind of hold fast to, is keeping it DRY.
DRY is a pretty popular programming concept, and it stands for Don’t Repeat Yourself. The key here is that we parameterize data and code wherever we can. That means that instead of hard-coding things into our tests, like usernames and passwords, we pass in definable parameters and leverage things like classes to make our tests more flexible and usable in different spaces. In doing so, we increase the modularity of the work we’re doing, which allows our work to be leveraged by other people on our team. And we want to create reusable tests so that we’re not repeating the same work over and over again.
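A minimal sketch of the DRY idea in a test suite: the login scenarios live in one parameterized data table instead of being copy-pasted into separate hard-coded tests. The endpoint is simulated by a plain function here, and all credentials are invented.

```python
import unittest

def fake_login(email, password):
    """Stand-in for the real login endpoint; returns an HTTP status code."""
    return 200 if (email, password) == ("ada@example.com", "s3cret") else 401

# One data table drives every scenario: add a case, not a new test.
CASES = [
    ("ada@example.com",    "s3cret", 200),  # valid credentials
    ("ada@example.com",    "wrong",  401),  # bad password
    ("nobody@example.com", "s3cret", 401),  # unknown user
]

class LoginTests(unittest.TestCase):
    def test_login_cases(self):
        for email, password, expected in CASES:
            with self.subTest(email=email, password=password):
                self.assertEqual(fake_login(email, password), expected)

suite = unittest.TestLoader().loadTestsFromTestCase(LoginTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert result.wasSuccessful()
```

Swapping `fake_login` for a real HTTP call (or pointing the table at a different environment’s credentials) changes one definition, not every test.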
Rule 2: Make Your Intentions Clear
The next rule we try to stick to is a lot more ephemeral, a lot more human, than the previous one; the previous one is very much a programmatic hard rule, ‘Keep it DRY, Don’t Repeat Yourself’. This one is ‘make your intentions clear’. Typically when we’re testing, we’re working in the context of a team. And while it’s sad, it’s also true that we don’t always stay with an organization forever, and sometimes we don’t even stay with an organization through the entirety of the software development lifecycle. The first part of making your intentions clear is really very human and on the ground: how am I going to remember what I wrote this test for? A lot of the time when we’re writing code, or writing tests, we’ll walk away from it, come back a few months later, and not remember what we actually did. So the key here is documentation: make lots of notes, leave lots of comments in your code or your test code, and keep good notes for yourself, because again, it’s not only how am I going to remember what this test is for, but how is the next developer or tester going to know what the specific test is for. We need to make ourselves as clear as possible at all times. If our tests can’t be reused, one result is that we have to do increasing amounts of work: if we forget the intention of a specific piece of test code we wrote, we’re going to have to rewrite the same test and do that work again. And finally, collaboration is incredibly difficult to achieve within a team when the intent behind a test is obscured, either behind code that is not readable or code that has not been properly commented or documented.
So Jason, you said properly documented, or well documented; I mean, that’s always a challenge, right? And you mentioned being able to add comments to your code. Are there other good ways that you’ve found to document the APIs and document the intentions, like a Confluence page or something along those lines?
So Confluence is a really terrific example of that. Also, coming specifically from the API space, there are a number of really great options. One that’s really well known would be Swagger, which recently became the OpenAPI Specification. That’s essentially a document, an object, that defines an API. But at an even more human-readable level than that, you have things like Apiary, for example, which provide solutions for really fantastic interactive API documentation. There’s Swagger UI as well: when you pass a Swagger definition into Swagger UI, you get a really beautiful visualization of your API, which makes it much more readable, and you can socialize it much better among your team. So yeah, that’s a fantastic point.
Q: And then there’s a question that came in :
Can you use TDD (test-driven development) or BDD with Cucumber for documentation? Do those tend to work well, or do you see more inline comments as opposed to the Cucumber/Gherkin style?
Cucumber and Gherkin are tremendously human-readable, so I would definitely advocate for them in that regard. Inline comments are also terrific; I think they both have their place, personal opinion. If I’m writing an API, writing the kind of code we were looking at a little while ago, I’m definitely going to want inline comments and things like that. You know, there’s a joke among developers, among tech teams in general, that any time somebody says they have quote-unquote self-documenting code, the kind of code that’s human-readable, it just means they were too lazy to comment it. There’s almost never an excuse not to leave inline comments in your code. That being said, creating things like Gherkin or Cucumber documentation, which is super human-readable, is also a very good idea. I’m very much an advocate for documentation and commenting; I personally believe there is never enough. So I’m just going to go ahead and recommend that you do both.
Rule 3: Act Like the Consumer Would!
So the next thing that we like to adhere to, and this is again somewhat specific to API testing rather than software testing as a whole, though it does trace back to software testing as a whole, is acting like the consumer would. Now, when we say consumer, I don’t necessarily mean the end user; this could be, for example, two microservices that are communicating with each other. What we really want to do is stop creating test cases in a vacuum. The kind of in-a-vacuum test cases we’re talking about here are more appropriate for unit testing. When we’re testing at the API level, we’ve reached a level where a number of functions are executing in concert, where we are mutating or changing data in a pretty substantial way, so doing that in a vacuum doesn’t really work. We want to use real user data whenever possible; we want to perhaps use a cloned database instance or something like that to really demonstrate how this API is going to perform in the real world, instead of just pinging it and hoping the response comes back machine-readable, with data that doesn’t really qualify as real user data. Like I just said, most APIs exist in the scope of a system. So if we’re trying to test an API that typically relies on data provided by yet another API, perhaps in a layered architecture or a microservice architecture, what we really want to do is take the data that would be produced by that other microservice or API and pass it into this API, to generate as real a result in as real a situation as possible. It gives our tests more meaning. Testing individual parts of the whole is very important, but again, that’s more of a unit level of testing. At the API level, what we really want to do is test full workflows.
And simulating actual user workflows is super important for finding the holes. So rather than just testing, for example, my login workflow and making sure it returns a token, I want to hit my login workflow, grab that token, pass it to another route that requires that token, and make sure the data that comes back from that second route is correct. Otherwise I’m just demonstrating that I have a token; I don’t know for certain that that token is valid, nor do I know that the second route in the chain can actually accept that token and process it.
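The chained workflow just described can be sketched like this: log in, capture the token, then use it on a second, protected route. Both routes are simulated in-process, and all names, credentials, and token values are invented; the point is the shape of the test, not the implementation.

```python
TOKENS = {}  # invented server-side token store

def login(email, password):
    """First route in the chain: issue a token for valid credentials."""
    if (email, password) == ("ada@example.com", "s3cret"):
        token = "tok-123"
        TOKENS[token] = email
        return {"status": 200, "token": token}
    return {"status": 401}

def get_profile(token):
    """Second route in the chain: only accepts a valid token."""
    if token in TOKENS:
        return {"status": 200, "email": TOKENS[token]}
    return {"status": 401}

# The chained test proves the token is both issued AND accepted downstream.
auth = login("ada@example.com", "s3cret")
assert auth["status"] == 200
profile = get_profile(auth["token"])
assert profile["status"] == 200 and profile["email"] == "ada@example.com"
assert get_profile("bogus")["status"] == 401
```

Testing either route alone would pass even if the token formats disagreed; only the chain catches that class of integration bug.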
Rule 4: Eliminate Static Data Sources!
And our fourth rule is eliminating static data sources. Static data, in the scope of API testing specifically, is very rarely your friend. Now, as a caveat to that, some endpoints do often rely on static data and that’s OK.
Can you go into what static data is?
So when we say static data, what we mean is data that exists only for the purpose of the test. This traces back to the previous rule, the previous idea, that we want to use live data whenever possible. Obviously we don’t want to deploy code that’s still being tested to a live server, but what we want to do is create either mock endpoints that simulate a live endpoint this endpoint depends on, or even cloned instances of databases that we can hit for the real type of data that this test, or this API, requires to execute properly. A lot of the time, what we see is bespoke data constructed specifically for the purpose of the test being passed into an API, with everyone operating under the assumption that it’s going to function just like it would in the real world. But one of the unfortunate facts we’ve seen, with things like canonical data formats, is that if a testing team isn’t using the actual data the API will depend on, we get false positives back in tests, whereas when we use the actual database for testing, the user database, the product database, or whatever it may be, we actually get back much more meaningful results.
And then finally, where live data is unavailable, we want to use mock responses that match our expected responses; I mentioned this in the previous explanation. Those mock responses can come from any sort of mocking service, API Fortress has one, and there are a number of others out there, or you can build your own mock server inside your environment: stand up an Express server, a Django server, or something like that. It’s going to provide something in the same format you’re looking for, but actually provide it over HTTP, so you can really test that everything is working right: that my ability to send and receive data is working properly, on top of a lower, unit-level test where I can see that individual functions are receiving the data they need and executing properly on that data.
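As a minimal illustration of standing up your own mock server, the sketch below uses Python’s standard-library HTTP server (a stand-in for the Express or Django servers mentioned above): it answers every GET with one canned JSON body, so a test can exercise real HTTP send/receive without the live dependency. The route and payload are invented.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

CANNED = {"id": 42, "name": "Ada"}  # the mock's expected response body

class MockHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(CANNED).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep test output quiet
        pass

# Port 0 asks the OS for any free port; run the server on a daemon thread.
server = HTTPServer(("127.0.0.1", 0), MockHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The test talks to the mock over real HTTP, just like the live dependency.
url = f"http://127.0.0.1:{server.server_port}/users/42"
with urllib.request.urlopen(url) as resp:
    payload = json.loads(resp.read())
server.shutdown()

assert payload == CANNED
```

A real mock would usually serve several routes and payloads copied from the dependency’s documentation, but the mechanism is the same: the code under test cannot tell it isn’t talking to the real service.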
With that said, Chris, that's where the presentation formally comes to a close. I'd be more than happy to field questions.
What's the best way to describe the difference between a REST and a SOAP API? This is a little more towards the basics, but are there a good set of differences you can use to separate the two?
I guess functional differences rather than structural differences would be: REST is typically a little more developer-friendly. You can play with the structure of things a little more; it's more plastic, whereas SOAP tends to be very rigid. There is a benefit to that rigidity, though, when it comes to security and things like that, so it can often make some things a little easier on the developer. From a testing perspective, the primary difference you're going to see is between navigating an object and navigating an XML document, and how that plays out comes down to which tool you're using. With API Fortress, for example, where I am, you navigate the two of them in exactly the same way, but with some tools it can be more complex, because XML documents are, as a matter of course, more deeply nested than JSON objects typically are. But just from a structural perspective: this is what a REST response typically looks like, and this is what a SOAP response typically looks like (see earlier slides).
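To make that "navigating an object versus navigating an XML document" difference concrete, here is a sketch in Python that pulls the same field out of a REST-style JSON body and a SOAP-style XML body. The order record and element names are invented for illustration, not taken from any real API.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical order, once as a REST-style JSON body
# and once as a SOAP-style XML envelope.
rest_body = '{"order": {"id": 1001, "status": "shipped"}}'
soap_body = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <OrderResponse>
      <Order><Id>1001</Id><Status>shipped</Status></Order>
    </OrderResponse>
  </soap:Body>
</soap:Envelope>"""

def status_from_rest(body: str) -> str:
    # JSON parses straight into nested dicts: two key lookups and done.
    return json.loads(body)["order"]["status"]

def status_from_soap(body: str) -> str:
    # XML requires namespace-aware navigation down through the envelope.
    ns = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}
    envelope = ET.fromstring(body)
    order = envelope.find("soap:Body", ns).find("OrderResponse/Order")
    return order.find("Status").text
```

Both functions return the same value, but the SOAP path has the extra envelope and namespace layers to step through, which is exactly the added depth mentioned above.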
How does a tester determine the expected response for a mock response, or I guess even a live response?
This is a less firm answer. What I'd say is, this traces back to the question of documentation. As long as we make sure that our API is documented, and documented well, then we very frequently don't have to ask that kind of question, because I know that the API I'm working with depends on the response from this other API. I can go straight to the documentation for that API, whether it's Swagger, Postman Collections, I/O Docs, RAML, or one of the graphical ones like Apiary, and I can physically see that object there. I know what I'm expecting, and with a good mocking platform I can literally copy and paste it into a mock route, have that mock route issue that response, and then, if I chain that mock to my actual endpoint, we can use that proper data for generating our test results.
This is a really good point. So if you're starting out new with this, you do have to talk to a developer or somebody; you need to find that documentation. I've worked at a lot of companies that just don't have that documentation, and you're either left to search through the code yourself or ask one of your co-workers for some type of tutorial, or to walk you through it, because unless there is self-documenting code, you're kind of stuck.
Precisely. With our clients at API Fortress I've seen everything that runs the gamut, from almost 100% Swagger coverage with incredible documentation and a repository for all the docs, all the way down to a half-maintained Excel spreadsheet. You really want to be closer to the first end of that spectrum than the second with regard to documentation. But yes, when you're first starting to test these APIs there is a very good chance that you're gonna have to touch base with the team that wrote them, find out where they're keeping all their definitions and things like that, and maybe have them talk you through it once. That's often a great time saver for a tester in general.
How do you test for security when dealing with APIs, and can you broaden that to other types of test techniques, not just security? So maybe address the security question first, but then apply it to load testing or any number of other techniques you can use.
From my perspective, I again work with API Fortress so I can tell you how we do things. Our principal philosophy on API testing in general is to again kind of emulate the live environment and chain together as many requisite requests and responses as possible. So building actual workflows and having them execute.
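A chained workflow of the kind just described can be sketched like this: each step's response feeds the next request, mirroring what a real user would do. The two endpoint functions below are hypothetical stand-ins for real HTTP calls to a create-order and a get-order endpoint.

```python
def create_order(product_id):
    # Stand-in for POST /orders on a hypothetical API;
    # a real test would make an HTTP request here.
    return {"order_id": 7, "product_id": product_id, "status": "created"}

def get_order(order_id):
    # Stand-in for GET /orders/{id} on the same hypothetical API.
    return {"order_id": order_id, "status": "created"}

def run_workflow_test():
    """Integration-style test: the id returned by the first call
    is chained into the second call, like a real workflow."""
    created = create_order(product_id=123)
    assert created["status"] == "created"

    # Chain: reuse the id from the first response in the second request.
    fetched = get_order(created["order_id"])
    assert fetched["order_id"] == created["order_id"]
    return fetched
```

The value of testing this way is that a break anywhere in the chain (a changed field name, a missing id) fails the workflow, which a set of isolated single-endpoint tests can miss.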
When we do load testing as well, we actually do it from an integration-testing perspective. That is to say, when you create an integration test in API Fortress, our load-testing client uses that whole integration test, executing all of the calls inside the test sequentially tens of thousands of times, and then generates a document that shows you where the programmatic bottlenecks exist.
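As a rough sketch of that idea (not API Fortress's actual client), one could time each step across repeated runs of the whole sequence and report the slowest steps first. Here the steps are plain Python callables standing in for real HTTP calls, and the iteration count is tiny rather than tens of thousands.

```python
import time
from statistics import mean

def run_load_test(steps, iterations=100):
    """Run a whole call sequence (an 'integration test') many times,
    timing each step to surface bottlenecks.

    steps: list of (name, callable) pairs; each callable stands in
    for one HTTP request in the workflow.
    Returns (name, mean_seconds) pairs, slowest step first.
    """
    timings = {name: [] for name, _ in steps}
    for _ in range(iterations):
        for name, call in steps:
            start = time.perf_counter()
            call()  # the simulated request
            timings[name].append(time.perf_counter() - start)
    return sorted(((name, mean(ts)) for name, ts in timings.items()),
                  key=lambda pair: pair[1], reverse=True)
```

Running the full sequence each iteration, rather than hammering one endpoint in isolation, is what makes this integration-style: the load profile resembles real traffic through the workflow.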
With regard to security testing: security testing is a fantastic topic which is worth more than its own one-hour talk, but from an API perspective I think the security issues we see exist on two different levels. One level is the transport level, where we're talking about SSL/TLS handshakes and raw encryption. The other level is dealing with what would be considered a more malicious user, a hacker for example. Typically that kind of security testing is handled by very specialized companies who do exactly that, because it's often very difficult to automate; you need somebody who really knows what they're doing to try to accomplish things like cross-site scripting, SQL injection, and so on. From a load-testing and functional-testing perspective, though, there are a ton of great options out there, and API Fortress certainly is one of them, but philosophically I'm definitely an adherent to what we do, which is comprehensive endpoint testing that involves multiple endpoints in sequence. And from a security-testing perspective, if it's at the transport level, then good functional testing can cover that for you. If we're talking about resistance to malicious, black-hat actors, then I would say working with an organization that specializes in white-hat penetration testing would probably be the safest avenue.
So there are several questions around mocking that I'll try to batch into one. Do you have any good information, either books or websites, that you can recommend to people who want to begin mocking APIs? It seems like an often talked about but very rarely explained or demonstrated method for helping to test.
Sure. I'm gonna be a shill for just a brief second and you can hate me for that: API Fortress provides a really fantastic, super simple to use mocking platform, and that's rolled into the regular platform itself; it's not an add-on. Outside of that, there are a number of open-source frameworks that make it super easy. One that springs to mind is called Mockito, and that's built in Java, I believe. And even beyond that, if you wanted to generate mock responses without using a mock framework, you could also just stand up a lookalike server, so to speak. When you're doing mocking you don't have to go full bore: if you have one response that's generated by each sequential API call, over time you can just create one mock that gives you the end product, if that's all you need, because we're not testing the mocking server, we're just testing that the data we're using is being processed by the real server correctly. So Mockito is definitely a good one, and API Fortress is a fantastic option.
With regard to things to read, I would honestly recommend the documentation from Mockito, for example; it's pretty well done documentation for mocking. The API Fortress documentation is very well done too. The only hard and fast rule that I really recommend you stick to is to make sure that the schema you're using in that mock endpoint matches the schema of the endpoint that you're mocking. That may sound reductive, but I've seen it more than once where somebody thinks they're copying something over exactly but they missed a comma or something like that. And oftentimes, with a mock endpoint, you don't really have validation or anything like that, so you don't know that the data you're using isn't entirely correct, so to speak.
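That schema-matching rule can even be checked automatically with a quick structural diff. This is a sketch of the general idea, not any particular tool's feature: it compares the shape of a mock body against the real one, looking at key names and value types while ignoring the concrete values.

```python
def schema_of(value):
    """Reduce a parsed JSON value to its 'shape': sorted key names
    and value type names, with concrete values discarded."""
    if isinstance(value, dict):
        return {k: schema_of(v) for k, v in sorted(value.items())}
    if isinstance(value, list):
        # Use the first element as representative of the list's shape.
        return [schema_of(value[0])] if value else []
    return type(value).__name__

def schemas_match(mock_body, real_body):
    """True when the mock response has the same structure as the real
    response, even though the actual field values differ."""
    return schema_of(mock_body) == schema_of(real_body)
```

A check like this catches the "missed a comma" class of mistake: a mock with a dropped or renamed field fails the comparison even when every value in it looks plausible.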
I feel like that wasn't my most helpful answer. I can aggregate some resources, and you can probably socialize them with the group.
There are a few questions around resources for API testing. Are there example APIs that people can test against that have specific, known errors, like target applications and that kind of thing?
APIs that have specific errors that I know of? I actually don't know of any. There are a ton of public APIs that we use for testing or demonstrating all the time, and at API Fortress we have a couple of servers that we set up specifically for testing purposes that just deliver static data that can be tested against.
But APIs that we’ve used for demonstrations before that are kind of fun:
Uber has a fantastic public API. Google Maps has a tremendous public API with a ton of routing that you can do some really cool stuff with. Another one I'd recommend: google "Marvel API". If you're a superhero fan at all, there's a pretty terrific API that returns superhero data, and you can send it a bunch of different types of queries.
I would also say check out GitHub. GitHub is really great for lists of lists, and I know I've seen a list of lists for public APIs there. But Jason just gave a bunch of them as well.
And so, going the tools route: are there popular libraries for APIs in Ruby and Java that you're aware of?
So, server frameworks themselves. I am neither a Ruby nor a Java developer. That being said, there are a lot of Java people on our platform and our team. Tomcat is a very popular one for Java, and I believe Spring is one as well.
My knowledge of Ruby is very limited, but I want to say Sinatra is one of their server frameworks, though I'd be very hesitant to be quoted on that. I apologize; I'm neither an expert in Java nor Ruby.
Yeah. So I mean my HTTP client of choice, there’s one built in API Fortress, but if I’m not using that, if I just need to knock out a quick request, in my personal opinion, you really can’t go wrong with Postman. It’s a very lightweight tool, it’s very effective, it’s cross-platform.
One thing I would mention, since I'm hearing language come up a lot in these questions: when a client is making a request to an API, that is typically language-agnostic. It's making the request as an HTTP request and receiving the response as an HTTP response. The language that the underlying server is written in isn't typically involved in any part of that transaction; that's all what we refer to as business logic, and it's happening underneath the API layer. So the language the server is written in isn't really coming into the equation, unless you're a specialist in that language and you're just looking for a recommendation on a server framework, in which case I haven't been the most helpful person in the world.
No, that's a very good point. And you mentioned that a good lightweight tool, and a really popular one right now, is Postman. Are there other lightweight tools as well?
If I’m looking for like just kind of open-source and free, Postman would be definitely the one that I recommend. If you start talking about the realm of paid tools, I mean obviously my mind would go straight to API Fortress but there are a litany of tools out there. But if I’m just looking for something lightweight and free I think I would be remiss in not recommending Postman. It’s a fantastic tool.
No, I agree. Fantastic. Like you said, there are a lot of language-based questions, which makes a lot of sense for people dealing with those languages.
So there's a question that's come up a few times about tools for transitioning from manually testing APIs to automating them. API Fortress is a good example of something that automates them, so can you talk a little bit about going from that exploratory, manual, small amount of testing to something larger and automated?
Sure. Specifically with API Fortress, if you were trying to move from more manual, click-based testing into an automated API testing tool, a platform like API Fortress makes it very simple by allowing you to generate schema-validation-level tests, meaning a test that looks at a response and makes sure it matches the expected response. API Fortress does that automatically, then lets you take that test and schedule it with a graphical, calendar-based interface, feeds the data from those runs into a number of dashboards, and on top of that makes all of that data available via API, so you can pipe it into other platforms.

Transitioning from manual testing to an automated testing platform can be difficult, but the benefits a lot of the time really outweigh the downsides, because you can also collaborate really heavily with your team. With Postman, or with a lot of the manual testing tools, it's very difficult to get test data from user to user, or from user to management, without leveraging something like a Git repository. With automated platforms like API Fortress, you have one platform that everyone is using simultaneously, and that makes collaboration a whole lot easier. The team can work together, and the sum of the parts is greater than the individual parts themselves. It's really, really cool, and we've seen a lot of success with it. When transitioning from a manual tool to more of an automated platform, as with all things: make sure you read the docs, ask for help when you need it, and over time it gets easier. And then all of a sudden you find yourself doing things that you didn't think you were capable of before.