API Design: Keep it Small and Focused

Implement the 90-10% Cases
Avoid "Commons" APIs
- Use a Utility Web Page Instead

Jakob Jenkov
Last update: 2014-05-25

A small, focused API is almost always preferable to a big API that tries to address all kinds of problems. A smaller API means a smaller memory footprint and often shorter build times.

Smaller APIs are often also easier to learn. There is nothing more frustrating than spending a long time studying a certain feature in an API, only to discover that the feature does not work in your special case.

Therefore, for every feature you want to add to your API, you should give careful thought to whether that feature actually belongs in your API.

It is easy to feel tempted to add features to your API once your users send you emails with all kinds of suggestions. You should resist the temptation to implement a suggestion right away, unless you know for certain that the suggestion lies perfectly within the core problem domain addressed by your API.

Almost any API could have lots of little nice-to-have features added. Some of these features make the API easier to use, and are thus justified. Other features may seem like a good idea at first, but aren't really core features, or they only apply to a limited set of the total use cases within the domain they address. These should perhaps be left out. They may end up cluttering your API more than they improve it.

For instance, in Butterfly Persistence I once considered adding automatic transaction ID generation. Transaction IDs are useful when logging, to see which logged actions belong to the same transaction. I haven't implemented this feature yet though, for the following reason:

The user of Butterfly Persistence may want transaction IDs to contain more information than the API has available: information like server ID, HTTP session ID, user ID etc. Since Butterfly Persistence does not know whether it is being used in a desktop app, a standalone command line server app, or a web app, how can it include information like that in its transaction IDs? All it can really include is stuff like machine IP address and time.

Of course I could get around the lack of that information by allowing the user of Butterfly Persistence to plug in an ITransactionIDGenerator implementation. But in most cases it will actually be easier for the user to just implement that code outside of Butterfly Persistence, rather than having to worry about how to give such an ITransactionIDGenerator access to server IDs, user IDs, session IDs etc.
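To illustrate the "outside the API" option, here is a minimal sketch of a transaction ID helper that lives in the application itself. The class name TransactionIds, the method next() and the ID format are all assumptions made for this example; none of them are part of Butterfly Persistence.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical application-side helper. Not part of Butterfly Persistence.
final class TransactionIds {

    // Local sequence number so two IDs with the same context still differ.
    private static final AtomicLong COUNTER = new AtomicLong();

    private TransactionIds() {}

    // Combines context that only the application knows (server, session, user)
    // with the local sequence number. The format is an assumption for this sketch.
    static String next(String serverId, String sessionId, String userId) {
        return serverId + "-" + sessionId + "-" + userId + "-" + COUNTER.incrementAndGet();
    }
}
```

The application already has the server, session and user IDs at hand, so it can call TransactionIds.next(...) at the start of a transaction and put the result in its log statements, without the persistence API needing to know about any of it.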

Implement the 90-10% Cases

Determining which features are core features and which are not isn't an exact science. So, how do you judge which features should be implemented and which should be left out?

If a feature will be used by 90% of the users, or 90% of the time your API is used, it is most likely a core feature and should be added.

If on the other hand a feature is only used by 10% (or less) of the API users, or only in 10% (or less) of the use cases, it is not a core feature, and should probably be left out. It is probably better to just show a code example of how the user can implement this herself, outside the API.

Yet, if the feature can be implemented in a way that does not bother the users who aren't using it, perhaps you could still implement it.

For instance, Butterfly Persistence is a database API. Some users may need to import data from files into a database. Therefore, Butterfly Persistence could potentially have data import / export features built into it. But, almost any database has these features too, so would anyone actually use those features? They are perhaps better left out. Instead I could show a code example in the documentation of how to import / export data, which users of Butterfly Persistence can copy-paste and adjust to their needs.
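A documentation example of that kind might look roughly like the sketch below. It uses plain JDBC rather than Butterfly Persistence, and the class name CsvImport, the persons table and its name and email columns are hypothetical. The parser is deliberately naive (no quoting or escaping), which is exactly the sort of thing a user would adjust after copy-pasting.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical copy-paste example. Not part of Butterfly Persistence.
final class CsvImport {

    private CsvImport() {}

    // Splits simple comma-separated lines into trimmed fields.
    // Blank lines are skipped. Quoted or escaped commas are not supported.
    static List<String[]> parse(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String[] fields = line.split(",", -1);
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
            rows.add(fields);
        }
        return rows;
    }

    // Inserts the parsed rows as one JDBC batch.
    // The table and column names are assumptions made for this sketch.
    static void insertPersons(Connection connection, List<String[]> rows) throws SQLException {
        String sql = "insert into persons (name, email) values (?, ?)";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            for (String[] row : rows) {
                statement.setString(1, row[0]);
                statement.setString(2, row[1]);
                statement.addBatch();
            }
            statement.executeBatch();
        }
    }
}
```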

In contrast, in Butterfly Persistence I recently added the ability to create and upgrade databases. Again, most users will most likely not use this feature. They will create and upgrade the database outside of their application. But, for those who really need it this feature is really handy. For instance, if you are developing a desktop application which uses a local database, you cannot expect the user of that application to create or upgrade the database when upgrading to a new version of the desktop application. In this situation it can be really handy to let the application do the upgrade.

Should this feature really have been left out? Maybe. Like I said, determining which features are core features isn't an exact science. Creating and upgrading databases is not a core feature, but it is still very useful.

The create and upgrade feature was implemented in only 1 interface and 5 classes, meaning it does not add much clutter to the API. Non-users can barely see these features from the surface of the API, so they are hardly bothered by them at all. Therefore I chose to go ahead and add the features anyway, though I did think about it for a long time before I did so.

I've since found myself always choosing this way of creating and upgrading databases over direct access to the database, especially when my applications are working with embedded databases, where only one process can access the database at a time. Instead of having to shut down my application to reset the database via a separate application, I just have a component inside my application do the job instead.
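The general idea of letting the application upgrade its own database can be sketched as a list of versioned steps. This is not Butterfly Persistence's actual interface or classes; SchemaUpgrader, pendingSteps() and the example SQL are hypothetical, and the sketch only works out which statements still need to run, leaving execution to the application.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an in-application upgrade component.
// Not the actual create/upgrade API of Butterfly Persistence.
final class SchemaUpgrader {

    // Index i holds the statement that takes the schema from version i to version i + 1.
    private final List<String> steps;

    SchemaUpgrader(List<String> steps) {
        this.steps = new ArrayList<>(steps);
    }

    // The version a fully upgraded database ends up at.
    int latestVersion() {
        return steps.size();
    }

    // Returns, in order, the statements still to be run for a database
    // that is currently at the given version. Version 0 means an empty database.
    List<String> pendingSteps(int currentVersion) {
        if (currentVersion < 0 || currentVersion > steps.size()) {
            throw new IllegalArgumentException("Unknown schema version: " + currentVersion);
        }
        return new ArrayList<>(steps.subList(currentVersion, steps.size()));
    }
}
```

A desktop application could store the current version in the database, ask for pendingSteps(currentVersion) at startup, and execute them one by one, so the end user never has to touch the database.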

Avoid "Commons" APIs

Almost every application I have built in my career has had a "utility" class or package. This utility package contains small methods that are used from different components inside the application. Often utility methods are so small that they are just gathered inside the same utility class as static methods.

It is tempting to try to isolate these utility methods and classes into a general purpose "Utility Library", so they can be reused from application to application, a bit like the Apache Commons library. But I would advise you to think twice about doing that.

Such utility libraries have a tendency to end up as "garbage cans" of all kinds of unrelated utility methods. As the utility library grows, each of your applications tends to use only a small percentage of the total utility methods. This means that each application is dragging around lots of dead code. Dead code means unnecessarily long build times. Since unused classes aren't loaded into memory, you may not suffer from a larger memory footprint because of them. But as soon as you use just a single method from a class with static utility methods, the whole class is loaded into memory, and you'll get the memory footprint of all its methods, not just the methods used.

Use a Utility Web Page Instead

Instead of creating utility libraries that end up as big garbage cans, list these utilities on a web page of your own somewhere. Whenever you need any of these small methods you can easily copy-paste it into a utility class in your application.

By listing utility classes and methods on a web page you can still reuse the code from project to project. You can even modify the utility code inside each project without affecting other projects. And, you only add the methods to your project that are actually used.
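As an illustration, a project-local utility class built this way might contain nothing more than the two or three methods the project actually calls. The class StringUtil and its methods below are hypothetical examples of what could be copied from such a page.

```java
// Hypothetical project-local utility class. It holds only the methods
// this particular project uses, copied in and free to be adjusted locally.
final class StringUtil {

    private StringUtil() {}

    // True if the text is null, empty, or whitespace only.
    static boolean isBlank(String text) {
        return text == null || text.trim().isEmpty();
    }

    // Cuts the text down to at most maxLength characters.
    static String truncate(String text, int maxLength) {
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must not be negative: " + maxLength);
        }
        return text.length() <= maxLength ? text : text.substring(0, maxLength);
    }
}
```

Because the class belongs to the project, changing truncate() to append "..." for this one application, for example, cannot break any other application.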

Next: API Design: Don't Expose More than Necessary
