Oct 30, 2020 · 17 min read

What the heck is an API? (with AI text bot example)

If you read this post start to finish, you're going to walk away knowing:

  1. What this terribly overused buzzword "API" actually means
  2. How we can use an API to easily do this:

reading image text

This post is written for a complete beginner at programming, but if you're here because your CTO won't shut up about "integrating APIs for enhanced performance and cost savings", I think you might find the next few minutes helpful too.

A vague, unhelpful definition of an API

The three letters "API" stand for "Application Programming Interface". Here's the type of definition you might be used to seeing (and not understanding):

An API is a software intermediary that allows two applications to talk to each other

This makes sense to me now, but when I wrote my first line of code 4 years ago, this type of definition was vague and unhelpful. We're going to need something better than this.

An analogy to help us

An API is like your car's owner's manual.

Let's say that one night, you turn on your car and your front-left headlight is out. Bummer. After a few minutes of huffing and puffing about your misfortune, you realize that you're going to need to fix this soon to avoid getting a ticket. But which light bulb do you need to buy? How do you actually fix it?

You have two options to figure these questions out:

  1. Open up your car manual and read it
  2. Ask the rep at your local auto parts store for help

For the sake of this post, we're going to start acting like true software engineers and read the manual (RTFM).

If I open up my car's manual, I get a nice little picture of all the lights on the car:

car lights

I see that the bulb I'm trying to fix is #2, and if I scroll to the next page of the manual, it tells me the model of light bulb that I need.

bulb list

And as with almost everything in software engineering, the answer spurs additional questions. In this case, do we need HID or halogen lights? Since this is not a vehicle maintenance tutorial, let's not get too deep here. We will assume (and I can confirm, since I changed my front-left headlight a few weeks ago) that we need halogen lights. For halogen lights, our manual instructs us to purchase 12 Volt, 55 Watt bulbs with a product number of H11. A quick search on AutoZone's website and we can find what we need:

bulb product

Okay, we're almost done talking about cars

Again, this isn't a car maintenance tutorial, but our analogy has some important implications for talking about APIs.

Our car manual isn't exciting, but the fact that it exists is very important. To highlight the importance of having a car manual, imagine the following fictional scenario:

  • 10,000 Toyota Corollas are manufactured in 2015
  • 2,500 of these can only use 12V-65W LED headlights
  • 2,500 of these can only use 12V-35W HID headlights
  • 2,500 of these can only use 12V-55W Halogen headlights
  • 2,500 of these can use any of the bulb types

Could you imagine the nightmare this would cause? A car mechanic would have to manually open up each 2015 Toyota Corolla to see what kind of light bulb it required!

Thankfully, car manufacturers understand the concept of "APIs" and have defined specific parts that are compatible with each car make/model so that owners, mechanics, and many others can communicate in a standardized "language".

A Car Manual is an API

While "application programming interface" doesn't fit our analogy perfectly, the concept of it does. A car manual demonstrates the core concept of an API, and with that come some major benefits:

  • By defining an "API", someone who knows little about the internals of a car can figure out how to replace a headlight. This person does not need to take apart the car to figure this out, does not need to "jerry rig" something together, and most importantly, is able to benefit from the complex engineering/design of the car without interacting with the details.
  • By defining an "API", the manufacturers of the car do not need to sell every replacement part for that car. Independent parts manufacturers can read the "API", figure out the specifications of the headlight bulb, manufacture it, and sell it. As the car owner, you usually have a couple different lightbulb brands to choose from that are selling the exact same part.

Furthermore, through this analogy, we can talk about the main properties of an "API":

  1. An API is simple - Nobody wants to own a car where it takes 20 steps to replace a headlight.
  2. An API is predictable - If the manual says that a light bulb will work, it better work!
  3. An API is backwards-compatible - If a 2015 Toyota Corolla requires a 12V-55W Halogen low beam headlight when it is released, it should require the same light forever. If the manufacturer wants to change the light required, they will release a new "version" of the car (i.e. the 2016 Toyota Corolla).
  4. An API (usually) follows standards - Nobody wants to buy a car that uses a different light bulb than every other car on the market. Not only will this drive the price of the bulb up, it will be hard to find a place that sells it.
  5. An API is well documented - While a mechanic is much more likely to read a car manual than the owner of that car, it is really important that the manual describes the required parts and process of replacing a headlight. If this is not documented, the "API" is more likely to break since you will have owners and mechanics guessing at how they should replace the light bulb.

While I've described these properties from the perspective of a car, you'll soon see that they apply to software APIs as well. Software APIs also have properties that a car "API" doesn't, such as security, scalability, atomicity, and a few others that we don't need to address in this post.
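If it helps to see the analogy in code, here is a minimal sketch of the car manual modeled as a tiny software API. Everything in it is made up for illustration: the `BulbSpec` shape, the `lookupBulb` function, the manual keys, and the 2016 entry are assumptions, not real Toyota data. Only the 12V, 55W, H11 halogen bulb comes from the manual described earlier.

```typescript
// Everything here is fictional and only models the car-manual analogy.
interface BulbSpec {
  volts: number;
  watts: number;
  type: "Halogen" | "HID" | "LED";
  partNumber: string;
}

// The "documentation": one entry per manual version and light position.
// A new model year gets a new key, so old entries never change
// (the backwards-compatibility property).
const manual = new Map<string, BulbSpec>([
  ["corolla-2015/low-beam", { volts: 12, watts: 55, type: "Halogen", partNumber: "H11" }],
  ["corolla-2016/low-beam", { volts: 12, watts: 35, type: "HID", partNumber: "EXAMPLE-HID" }],
]);

// The "interface": one simple, predictable question with a documented answer.
function lookupBulb(version: string, light: string): BulbSpec {
  const spec = manual.get(`${version}/${light}`);
  if (spec === undefined) {
    throw new Error(`The ${version} manual has no entry for "${light}"`);
  }
  return spec;
}

const bulb = lookupBulb("corolla-2015", "low-beam");
console.log(`${bulb.volts}V ${bulb.watts}W ${bulb.type} (${bulb.partNumber})`);
// prints "12V 55W Halogen (H11)"
```

Notice that whoever calls `lookupBulb` never has to open the hood. They ask one question and get a documented answer, which is the whole point of the analogy.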

Software APIs can be many things

While our car "API" communicates the concepts, there are many nuances to software APIs. For the remainder of this post, I am going to transition us from the basic car example to a more refined understanding of application programming interfaces. While we do this, keep our car analogy fresh in your head. Here are some of the things we need to cover:

  1. APIs can be discussed at different levels of abstraction
  2. APIs can follow various standards (SOAP, REST, RPC, etc.)
  3. APIs can be classified by many attributes (open, closed, composite, partner, etc.)

The three points above make it extremely difficult (if not impossible) to boil this concept down into a single definition. Furthermore, as someone new to software engineering, many of the details within each point won't and shouldn't matter to you yet.

Below is my general definition of a software API that we can come back to, but take it with a grain of salt. As discussed above, APIs are not created equal!

A software API can be thought of as a "user manual" that allows developers to do something useful with 3rd party software/hardware without knowing the inner-workings of that software/hardware.

The above definition is how I think about APIs. I think every developer has a slightly different understanding of and definition for an API, and you will eventually reach that point too.

APIs and endless levels of abstraction

Abstraction isn't unique to APIs. The more you write software, the more you'll recognize how often your brain "abstracts" concepts.

If we stay on-theme and explain this through a car analogy, we can think of the automatic transmission of most cars these days.

gears in car

Tell me this... How much brainpower is required to put your car in "drive" mode?

Very little. Anyone who has driven a car for some time can have a full conversation with a passenger while shifting gears. As a driver, you don't need to know what is happening under the hood of your car when you shift gears--you just have to shift gears! This is because the act of shifting gears is "abstracted" away from the driver. If you wanted, you could follow the "chain of abstraction" all the way down to the finest of details required to build a car. While the driver of the car does not need to understand automatic transmission, the mechanic does. Furthermore, the mechanic might need to understand how automatic transmission works, but doesn't need to know how the metals used for the car parts are extracted from the earth. And taking it even further, the person extracting metals from the earth probably doesn't need a PhD in geology.

Just like we can talk about a car's automatic transmission at different levels of abstraction, we can talk about APIs at different levels of abstraction. Here are some common levels at which you might hear someone refer to an API (from highest to lowest):

  1. "Our company uses APIs to connect our backend infrastructure" (highest) - At the highest level of abstraction, we can talk about APIs as "technologies" that allow various apps and infrastructure to communicate. In today's world, a corporation might operate 50 different apps, but without APIs, they cannot share data and create a cohesive software ecosystem.
  2. "My productivity app uses the Google Calendar API to automatically schedule events based on user tasks" - In this case, Google runs a calendar application that has an API which other developers can use to interact with it programmatically.
  3. "I need to read through the API of this code library before I can use it effectively" - Let's say that I wanted to use the Express JS framework to build a backend process for my web app. This framework is meant to be used in a very specific way, which is documented through their "API Reference" here.
  4. "This function has a very specific API defined by its arguments and return values" - Throwing a little code at you here:
function computeMortgagePayment(
  interestRate: number,
  price: number,
  term: number
): number {
  // No need for us to know the implementation details because we have an API!
}

If you were to actually write a mortgage calculator function, you would need several lines of code. But if you are just using the function, you shouldn't care how it is implemented. All you should care about are the inputs and the outputs defined by the function's API. This function requires you to define an interest rate, house price, and loan term, which all have a data type of number. This function returns a number that represents your monthly mortgage payment. These input and output definitions represent the function's API.

  5. "The hardware on my computer has a specific API that all programming languages must be compatible with" (lowest) - That code from above needs to be converted to 1s and 0s to actually run on the device you are using to read this. Each computer (hardware device) has an API that defines how low-level operating system functions should be called. We won't go into this as it gets very complex very fast.
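To make the function-level example (item 4) concrete, here is one way the body of `computeMortgagePayment` could be filled in, followed by a caller that only relies on the inputs and output. This is a sketch under assumptions the post doesn't state: I'm treating `interestRate` as an annual percentage (6 means 6%), `term` as a number of years, and using the standard fixed-rate amortization formula.

```typescript
/**
 * Monthly payment on a fixed-rate loan.
 * Assumptions (not stated in the post): interestRate is an annual
 * percentage such as 6 for 6%, and term is the loan length in years.
 */
function computeMortgagePayment(
  interestRate: number,
  price: number,
  term: number
): number {
  const monthlyRate = interestRate / 100 / 12;
  const numberOfPayments = term * 12;

  // With no interest, the loan is just split evenly across the payments.
  if (monthlyRate === 0) {
    return price / numberOfPayments;
  }

  // Standard amortization formula: P * r / (1 - (1 + r)^-n)
  return (price * monthlyRate) / (1 - Math.pow(1 + monthlyRate, -numberOfPayments));
}

// The caller only cares about the function's API: three numbers in, one number out.
const payment = computeMortgagePayment(6, 300000, 30);
console.log(payment.toFixed(2)); // prints "1798.65"
```

If I later swapped the formula for a different implementation, the last two lines would not need to change, as long as the inputs and the output keep the same meaning.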

As you can see, two people saying the word "API" may not be talking about the same thing, and to add to the confusion, APIs can live inside other APIs. When this inevitably frustrates you, just remember our general definition of APIs.

A software API can be thought of as a "user manual" that allows developers to do something useful with 3rd party software/hardware without knowing the inner-workings of that software/hardware.

API Standards

We are not going to visit this topic for long, but as you begin developing software, you will surely hear other developers talking about RESTful APIs, and in some cases, SOAP APIs and RPC APIs.

When you hear someone talk about a REST (REpresentational State Transfer) API, they are simply referring to an API built according to a widely adopted set of design conventions.

Think of these as "industry standards". Going back to our car and light bulb example, if all car manufacturers required completely unique light bulbs, there wouldn't be enough parts suppliers to produce all the different lightbulbs. By having a few standard light bulbs that many different cars use, the entire industry is more efficient. With standardized ways to implement APIs such as REST, the software industry can operate more efficiently because developers like you and me only have to learn a few different standards before we can start working with various APIs such as the Google Calendar API (which is a RESTful API).
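If you're curious what "following REST conventions" looks like, here is a small sketch. It is not the real Google Calendar API. The `/events` path, the `handle` function, and the `CalendarEvent` shape are all hypothetical, and the whole thing runs in memory instead of over the network. What it does show are the two conventions you'll see in most RESTful APIs: a URL names a resource, and the HTTP method (GET, POST, DELETE) names the action.

```typescript
// Hypothetical REST-style calendar API, kept in memory so it runs anywhere.
type Method = "GET" | "POST" | "DELETE";

interface CalendarEvent {
  id: number;
  title: string;
}

interface ApiResponse {
  status: number; // HTTP-style status code
  body: unknown;
}

const events = new Map<number, CalendarEvent>();
let nextId = 1;

function handle(method: Method, path: string, body?: { title: string }): ApiResponse {
  // REST convention: a URL names a resource ("/events" or "/events/42").
  const match = /^\/events(?:\/(\d+))?$/.exec(path);
  if (match === null) {
    return { status: 404, body: { error: "Unknown resource" } };
  }
  const id = match[1] === undefined ? undefined : Number(match[1]);

  // REST convention: the HTTP method names the action on that resource.
  if (id === undefined) {
    if (method === "GET") {
      return { status: 200, body: Array.from(events.values()) };
    }
    if (method === "POST") {
      if (body === undefined) {
        return { status: 400, body: { error: "Missing event data" } };
      }
      const created: CalendarEvent = { id: nextId++, title: body.title };
      events.set(created.id, created);
      return { status: 201, body: created };
    }
    return { status: 405, body: { error: "Method not allowed" } };
  }

  if (method === "GET") {
    const found = events.get(id);
    return found === undefined
      ? { status: 404, body: { error: "No such event" } }
      : { status: 200, body: found };
  }
  if (method === "DELETE") {
    return events.delete(id)
      ? { status: 204, body: null }
      : { status: 404, body: { error: "No such event" } };
  }
  return { status: 405, body: { error: "Method not allowed" } };
}

console.log(handle("POST", "/events", { title: "Dentist" }).status); // prints 201
console.log(handle("GET", "/events/1").status); // prints 200
```

Because so many APIs share these conventions, a developer who has used one RESTful API can usually guess how the next one works before reading its docs.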

The many classifications of an API

As you're learning here, the acronym "API" is a loaded topic, and because of that, it's not a surprise that we have a hard time understanding it. On top of the standards an API can follow and the level of abstraction that you are speaking from, we can also group different APIs into various categories. Here are 3 relatively common and easy-to-understand classifications, though they are certainly not representative of everything you might find.

  1. Open APIs (e.g. Google Calendar API)
  2. Proprietary APIs (e.g. Salesforce API)
  3. Internal APIs (e.g. an engineer writes a custom API to connect a corporation's many software systems)

Rather than describing them, let's ask why someone would create each of these.

Why create an Open API?

Let's say that you're the CEO of Google for the day and you have to decide whether the Google Calendar API should be open or proprietary. If you make it proprietary, the user will get the experience that your company can provide and that's it. If you make it open/public, developers across the globe can integrate their products with Google Calendar.

For example, a productivity app might want to show its user their weekly schedule along with their todo list and their goals. If Google has an open API while Outlook does not, this productivity app will most likely integrate their product with Google. And when someone who uses the productivity app sees this integration, they are more likely to continue using or start using the Google Calendar service. It's a win-win for everyone, and in a sense, Google has outsourced developers for free!

Speaking of Open APIs, time for some textual analysis AI

We can talk about ways to classify APIs all day, but the best way to learn what an API does is to see a real example. What I am about to show you is just one example of an API, but it demonstrates the concepts we have talked about well.

If you remember from earlier, an API allows a developer to do something useful with 3rd party software without knowing how that software is implemented. A person driving a car does not need to know how the car's automatic transmission works to park the car.

In this case, we are going to use the Google Vision API, which is composed of thousands of lines of code that we never need to read through! We don't need to be Machine Learning experts--we just need to learn how to use the API. And when we're done, we will be able to recognize text from images:

reading image text

Overview: What is the Vision API?

The Google Vision API is a cloud service run by Google that allows you to do things like detect faces in a crowd, detect landmarks, read text from an image, detect logos in an image, and much more. If you're completely new to software engineering, the Vision API is an example of Machine Learning and Artificial Intelligence, a similar concept to how Alexa works in your home.

I'm not going to show you a bunch of code or every detailed step that I went through to get the Vision API working, but I will highlight the main actions that I took to get the final product.

Step 1: Read the docs

When starting out with any new API, the first thing that you will always do is check out their documentation. Remember how I said that a property of an API is good documentation? Well this is where that property comes into play. If the documentation for that API is good, you should be able to get up and running within a few minutes. The Google Vision API has a wonderful quickstart guide that I used to make my first API request within 20 minutes.

Step 2: Decide how you are going to integrate the API into your project

In my case, I wanted to create a simple web app like you saw in the GIF above. To do this, I used the following tools, which I already knew how to use. To use this API, I could have used several different coding languages and tools, but these just happen to be the ones that are most familiar to me.

  1. Express Framework (for the backend)
  2. Angular Framework (for the frontend)
  3. Angular Material Library (for styling)

I scaffolded out a new Angular project with the ng new command line tool, and used the Google Vision example along with the Reference docs to create the following function (don't worry about how this works if you're new to software):

// Imports the Google Cloud client library
const vision = require("@google-cloud/vision");

async function analyzeText(url) {
  // Creates a client
  const client = new vision.ImageAnnotatorClient();
  const request = {
    image: {
      source: { imageUri: url },
    },
  };

  // Performs text detection on the image at the given URL
  try {
    // The call resolves to an array whose first element is the response
    const [result] = await client.textDetection(request);
    return result;
  } catch (err) {
    console.error(err);
    return null;
  }
}

Believe it or not, this is the only code that I needed to write to get the Google Vision API working! All the other code that I wrote (here is my GitHub repo) was simply to get the Angular application running, talking to my backend server, and styling the app!
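Once `analyzeText` hands back a result, the app still has to pull the readable text out of it. Here is a hedged sketch of that step, not the code from my repo. The `extractFullText` helper and the two interfaces are hypothetical names, and they only describe the small slice of the response this sketch reads. I'm also assuming, as Google's text detection docs describe, that the first entry in `textAnnotations` holds the full text found in the image. The example runs against a hand-written mock, so no Google credentials are needed.

```typescript
// Only the part of the Vision response that this sketch reads (an assumption,
// not the full response type from the client library).
interface TextAnnotation {
  description?: string | null;
}

interface TextDetectionResult {
  textAnnotations?: TextAnnotation[] | null;
}

// Returns the detected text, or an empty string when nothing was found.
// Accepts null because analyzeText returns null when the request fails.
function extractFullText(result: TextDetectionResult | null): string {
  // Assumption: the first annotation contains the entire detected text,
  // and the entries after it are the individual words.
  const first = result?.textAnnotations?.[0];
  return first?.description?.trim() ?? "";
}

// A hand-written mock shaped like a successful response.
const mock: TextDetectionResult = {
  textAnnotations: [
    { description: "HELLO WORLD\n" },
    { description: "HELLO" },
    { description: "WORLD" },
  ],
};

console.log(extractFullText(mock)); // prints "HELLO WORLD"
console.log(extractFullText(null) === ""); // prints true
```

This is the same idea as the headlight: I don't need to know how Google found the words in the image. I only need to know where the documentation says the answer will be.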

Step 3: Add functionality

Although I kept this example simple, you could add tons of other features to your web app using the Google Vision API. After the first 1,000 API requests each month, Google will start charging you money, but that is not always the case with APIs! There are plenty of free APIs with no usage limits.

Big Picture Review

While this post has covered many different topics and even provided a real-world example of an API, just remember that an API is more of an idea than a specific thing. As you learn to code, keep this in mind and in due time, you will have a much better grasp of what an API is, and how it applies to your specific role.
