
April 15, 2021

Anatomy of a Go app with a clean architecture

At Kumojin, we can adapt to any technology our clients use or want to use. However, we have to help them choose the right one when they are building their product from scratch.

Golang is often the technology of choice for the backend part of the infrastructure, and most notably when implementing REST APIs.
We could write a whole post regarding the advantages and drawbacks of Go, but for the former, we think that simplicity, readability, and developer productivity are big factors.

We wrote several backend apps for our clients, and each time we learned something new and improved upon our previous iteration in our attempts at implementing a clean architecture pattern. The apps were mostly implementations of REST APIs, but we also built some additional tooling and, in one case, a GRPC API.

We came up with a layered architecture and loosely coupled components that make sense for us. One important thing though: keep in mind that we're talking about Golang apps that grow in size, think REST APIs with dozens, sometimes hundreds of endpoints. For simpler cases, that structure might be overkill, and you might as well just do what you have to do in your main.go file! Always keep the KISS principle at the center of your developer life!

Also, we will not be talking about any library we use here as it is not relevant in terms of structuring the Golang app. But it might be the subject of a separate post. Given the rich Golang ecosystem, there are plenty of excellent choices to explore.

What is a clean architecture?

No better way to explain this than this nice Wikipedia article.

The main goal of clean architecture is to have highly decoupled application components and to get the most maintainable code possible, where changing a line of code somewhere does not break everything else (we all want that, right?).

As for the vocabulary ("storage", "repositories", "models", "use cases", "ports"), we'll explain each term in the next paragraphs.

The structure of a Go app

We're going to study the structure of a simple Go app talking to a database and exposing a web server and a GRPC server. This backend exposes business processes for registering a new user and logging in, and we're now adding a new one for updating a user's avatar picture.

- cmd
  - main.go
- config
- db
  - cmd
    - main.go
  - migrations
  - migrate.go
- pkg
  - context
  - models
    - user.go
  - ports
    - http
      - user.go
  - storage
    - db
      - user.go
    - repositories.go
  - usecases
    - user
      - login.go
      - register.go
      - update_avatar.go
- grpc
  - cmd
    - main.go
  - server.go
- rest
  - cmd
    - main.go
  - server.go
- main.go

It might be overwhelming at first, but don't worry. In the next paragraphs, we'll explain why we structure it like this, what goes in each folder, and what the multiple cmd packages are about.

The entry point of a Go app

First of all, our root directory has only one file, main.go, the entry point of our Go app. You want to keep this as short as possible. Most of the time, the first thing you would expect your app to do is to read its configuration, whether it is the webserver port, database location, static files locations, etc.

In our case, we will delegate these operations to our main command parser in cmd/main.go.
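To give an idea, here is a minimal sketch of what the root main.go can look like. This is an illustration only: the cmd.Main signature and the "myapp" module path are assumptions made for the example.

package main

import (
	"log"
	"os"

	"myapp/cmd" // "myapp" is a placeholder module path
)

func main() {
	// Delegate everything to the command parser; main.go stays tiny.
	if err := cmd.Main(os.Args[1:]); err != nil {
		log.Fatal(err)
	}
}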

Parsing the command

The main parser

The cmd/main.go is responsible for two things:

  • locating the configuration file and reading it;
  • delegating the parsing of command-specific configuration to the proper subcommand parser.

So let's say our Go app accepts a configuration file. We're going to check whether we should use the default one, or read the one provided on the command line (--config=config.local.yml). Then we'll save the info in our configuration (the config package). Nothing fancy here: we expect it to be a bunch of structs available globally.

Then, we're going to look at what command we want to execute.
Say the subcommand to execute is rest: that means we want rest/cmd/main.go to take over.
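As an illustration, here is a minimal sketch of what cmd/main.go could do. The config.Load helper, the Main functions of the subcommand packages, and the "myapp" module path are assumptions for the example, not our actual code:

package cmd

import (
	"flag"
	"fmt"

	"myapp/config"
	dbcmd "myapp/db/cmd"
	grpccmd "myapp/grpc/cmd"
	restcmd "myapp/rest/cmd"
)

// Main reads the global configuration, then delegates the remaining
// arguments to the proper subcommand parser.
func Main(args []string) error {
	flags := flag.NewFlagSet("app", flag.ContinueOnError)
	configPath := flags.String("config", "config.yml", "path to the configuration file")
	if err := flags.Parse(args); err != nil {
		return err
	}

	// Save the configuration globally (the config package).
	if err := config.Load(*configPath); err != nil {
		return fmt.Errorf("reading configuration: %w", err)
	}

	sub := flags.Args()
	if len(sub) == 0 {
		return fmt.Errorf("expected a subcommand: db, grpc or rest")
	}

	// Each subcommand parser reads its own specific configuration.
	switch sub[0] {
	case "db":
		return dbcmd.Main(sub[1:])
	case "grpc":
		return grpccmd.Main(sub[1:])
	case "rest":
		return restcmd.Main(sub[1:])
	default:
		return fmt.Errorf("unknown subcommand %q", sub[0])
	}
}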

The subcommands parser

In our app structure, we can see multiple subcommand-related packages such as db/cmd, grpc/cmd and rest/cmd.

The grpc/cmd and rest/cmd subcommands will read their specific configuration and respectively start the GRPC or web server.

You can have as many subcommands as you want, supporting other protocols and/or tooling. The important part is that each subcommand is responsible for reading its own configuration.
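For instance, a hypothetical rest/cmd/main.go could be as simple as reading the REST-specific settings and handing them to the server code (config.REST, config.DatabaseURL and rest.Start are assumptions for the sake of the example):

package cmd

import (
	"myapp/config"
	"myapp/rest"
)

// Main reads the REST-specific configuration (listen address, etc.)
// and starts the web server implemented in rest/server.go.
func Main(args []string) error {
	cfg := config.REST()
	return rest.Start(config.DatabaseURL(), cfg.ListenAddr)
}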

You might wonder what db/cmd does… Do we spin up a database? Nope, no worries!

We work with up/down migration files to always have our database structure compatible with the application code we have, and be able to roll back if necessary. Think Liquibase or Flyway, but simpler.

So db/cmd will allow us to migrate our database up or down using migration files stored in the db/migrations folder.

Example:

# the following command expects the database to be migrated to version 34
go run . db migrate up 34
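For illustration purposes only, here is roughly what db/migrate.go could look like if it were built on the golang-migrate library; that's just one possible choice, not necessarily the one we use, and the PostgreSQL driver is an assumption:

package db

import (
	"github.com/golang-migrate/migrate/v4"
	_ "github.com/golang-migrate/migrate/v4/database/postgres" // assuming PostgreSQL
	_ "github.com/golang-migrate/migrate/v4/source/file"
)

// Migrate brings the database schema to the requested version,
// applying the up or down migration files from db/migrations as needed.
func Migrate(databaseURL string, version uint) error {
	m, err := migrate.New("file://db/migrations", databaseURL)
	if err != nil {
		return err
	}
	defer m.Close()

	return m.Migrate(version)
}

The db/cmd subcommand then only has to parse the direction and target version from the command line and call that function.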

Running a subcommand

When the command line is fully parsed and the configuration available, the last task of the invoked subcommand is to run the expected piece of code:

  • db/migrate.go will perform database migrations as previously mentioned;
  • grpc/server.go will start the GRPC server;
  • rest/server.go will start our REST API server.

The migration code is a bit of an exception as it does not need any business domain knowledge. The other commands will most likely interact with the business domain… So how do they do that?

Application code/business layer

Our business code is located in the pkg package. It contains a bunch of packages that we're going to go through.

The models

pkg/models contains our models, duh!

For example, we expect user.go to at least contain a User struct and/or a NewUser struct for creating a new user and performing registration.

Something like that:

type User struct {
	ID        string `db:"id" json:"id"`
	Email     string `db:"email" json:"email"`
	FirstName string `db:"first_name" json:"firstName"`
	LastName  string `db:"last_name" json:"lastName"`
	AvatarURL string `db:"avatar_url" json:"avatarURL"`
}
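The NewUser struct is not shown in this post; as a rough guess (the exact fields depend on your registration flow), it could look like this:

// NewUser carries the data needed to register a user. Unlike User,
// it holds the plain-text password received at registration time,
// which is never stored as-is nor exposed back to clients.
type NewUser struct {
	Email     string `json:"email"`
	FirstName string `json:"firstName"`
	LastName  string `json:"lastName"`
	Password  string `json:"password"`
}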

The repositories

pkg/storage contains our repositories, the implementation of our communication with the storage layer. Storage is a generic word as it may be a database, like we have here, but also an FTP server, a cloud service, and so on.

Regardless of our type of storage, pkg/storage/repositories.go contains a list of interfaces for reaching our data. Using interfaces defining sets of methods instead of structures here is crucial as we do not want any implementation to surface outside of the storage package.
It shouldn't matter if we store our users in a database or an FTP server (well, hopefully, it's not the latter…).

In our example, our business data is stored in a database, so we implement our UserRepository interface in pkg/storage/db/user.go, where we expect to find the usual CRUD methods for searching/getting/saving our users.

To better illustrate our previous point, let's say each user can have an avatar picture, and we decided to store user avatars in Amazon S3. We expect to have a proper interface defined in repositories.go for that.
And the implementation of our repository for saving a user avatar would be in the pkg/storage/s3 package for example.

Here are the interfaces we should see in repositories.go:

// UserRepository is for interacting with users
type UserRepository interface {
	FindByID(userID string) (*models.User, error)
	Create(user *models.User) error
	Update(user *models.User) error
}

// UserAvatarRepository is for interacting with user avatars (blobs).
// CreateOrUpdate stores the blob and returns its public URL.
type UserAvatarRepository interface {
	CreateOrUpdate(userID string, blob io.Reader) (*string, error)
}
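To make this more concrete, here is a sketch of what the database implementation of UserRepository in pkg/storage/db/user.go could look like, assuming PostgreSQL and the sqlx helper library mentioned later in this post; the table layout and the "myapp" module path are assumptions:

package db

import (
	"github.com/jmoiron/sqlx"

	"myapp/pkg/models"
	"myapp/pkg/storage"
)

type userRepository struct {
	db *sqlx.DB
}

// NewUserRepository returns the database-backed implementation of
// storage.UserRepository.
func NewUserRepository(db *sqlx.DB) storage.UserRepository {
	return userRepository{db: db}
}

func (r userRepository) FindByID(userID string) (*models.User, error) {
	var user models.User
	if err := r.db.Get(&user, "SELECT * FROM users WHERE id = $1", userID); err != nil {
		return nil, err
	}
	return &user, nil
}

func (r userRepository) Create(user *models.User) error {
	_, err := r.db.NamedExec(
		`INSERT INTO users (id, email, first_name, last_name, avatar_url)
		 VALUES (:id, :email, :first_name, :last_name, :avatar_url)`, user)
	return err
}

func (r userRepository) Update(user *models.User) error {
	_, err := r.db.NamedExec(
		`UPDATE users SET email = :email, first_name = :first_name,
		 last_name = :last_name, avatar_url = :avatar_url
		 WHERE id = :id`, user)
	return err
}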

Now, you might start to wonder: "If saving a user avatar requires us to save a blob in S3 and updating the avatar URL in the user table, someone needs to sequentially call the two different repositories, right?"

Absolutely! We have the use cases for that.

The use cases

pkg/usecases contains our business logic situated on top of the storage layer. At this point, whether the request comes from a REST endpoint, a GRPC call, or anywhere else, does not matter.

Now, whether it is "creating a new user", or "updating a user's avatar", each business operation/process should have its own use case, and each use case is going to control one or more repositories to do the job.

As we've seen above, our "create a user avatar" use case will require two repositories, one for physically storing the avatar blob somewhere, and one for saving that physical location into our user information. Notice that in the previous sentence I avoided using the "S3" and "database" words.
It's because we don't care: it's the responsibility of the repositories to communicate with the storage. At the use case level we just know that they are able to support our use case logic.

Here's what we can expect:

// In the parent usecases package: what the higher level components
// should see.
package usecases

type UserAvatarUsecase interface {
	CreateOrUpdate(userID string, blob io.Reader) error
}

// In the usecases/user package: the struct that will implement the
// use case. The implementation is based on the two repositories
// previously mentioned: one for uploading the blob, another for
// updating the user data.
package user

type defaultUserAvatarUsecase struct {
	user       storage.UserRepository
	userAvatar storage.UserAvatarRepository
}

// NewDefaultUserAvatarUsecase returns the default implementation of the use case.
func NewDefaultUserAvatarUsecase(
	user storage.UserRepository,
	userAvatar storage.UserAvatarRepository,
) usecases.UserAvatarUsecase {
	return defaultUserAvatarUsecase{
		user:       user,
		userAvatar: userAvatar,
	}
}

func (u defaultUserAvatarUsecase) CreateOrUpdate(userID string, blob io.Reader) error {
	// First we upload the blob and get back its public URL
	avatarURL, err := u.userAvatar.CreateOrUpdate(userID, blob)
	if err != nil {
		return err
	}

	// Then we find the user...
	user, err := u.user.FindByID(userID)
	if err != nil {
		return err
	}

	// ...and update the info
	user.AvatarURL = *avatarURL
	return u.user.Update(user)
}

The ports

The ports contain the implementation of our "endpoints" reachable from outside of the application.

For example, we might allow the user avatar to be uploaded with an HTTP call through the REST API, or with a GRPC call. If we didn't have the use case layer, we might have had to duplicate the use case logic in both implementations. But we do have it, so each endpoint would just need to call the use case to operate.

Is there more at that level? Yes! Each endpoint is based on a given protocol, for example HTTP for the REST API. So we expect our HTTP user avatar endpoint to perform specific checks:

  • validating the blob has the right mime type, maybe that the image does not exceed a certain size;
  • validating the JSON payload if there is one;
  • returning the right HTTP status code depending on the result of the operations, etc.

Here is what we expect for an HTTP port allowing a client to update a user's avatar:

package user

// Our HTTP port for updating a user avatar. Notice the `echo`
// package, it's because we use the echo library built on top of
// net/http. You may use any library you want, including the base
// net/http one.

type UserAvatarPort interface {
	CreateOrUpdate(ctx echo.Context) error
}

type defaultUserAvatarPort struct {
	userAvatar usecases.UserAvatarUsecase
}

// NewDefaultUserAvatarPort returns the default implementation of the user avatar port.
func NewDefaultUserAvatarPort(userAvatar usecases.UserAvatarUsecase) UserAvatarPort {
	return defaultUserAvatarPort{
		userAvatar: userAvatar,
	}
}

func (p defaultUserAvatarPort) CreateOrUpdate(ctx echo.Context) error {
	// We get the id of the user from the URL placeholder
	// Say: PUT /users/{userID}/avatar
	userID := ctx.Param("userID")

	// We'll try to get the image part of the multipart request
	formFile, err := ctx.FormFile("image")
	if err != nil {
		return ctx.String(http.StatusBadRequest, fmt.Sprintf("No image found in multipart request: %s", err.Error()))
	}
	file, err := formFile.Open()
	if err != nil {
		return err
	}
	defer file.Close()

	// We have everything, we can call the usecase to do the operation.
	err = p.userAvatar.CreateOrUpdate(userID, file)
	if err != nil {
		return ctx.String(http.StatusInternalServerError, err.Error())
	}

	return ctx.NoContent(http.StatusNoContent)
}

Now you might wonder: "OK, we have repositories, use cases, and ports… But who does the plumbing and connects everything?".

The application context

Borrowing a term that is often used in the Java/Spring world, we have an "application context" for plumbing everything together, though it's just going to be a bunch of methods rather than an object/global structure.

The pkg/context package is responsible for providing:

  • the storage/database connection to each repository;
  • the required repositories to each use case;
  • the use case to each port.

We applied the dependency inversion principle all along by depending on interfaces rather than implementations, and here we inject the required dependencies into every piece of code that needs them.

And to do that, we won't use any fancy library; we'll just expose regular functions. So whether it is a repository, a use case, or a port, this is the place to get one.

We usually separate the constructor functions into separate files for repositories, use cases, and ports, but for the sake of the example, here is what we should find in the context package:

// We use sqlx as our helper library for SQL database stuff,
// you can use anything you want. The goal here is to connect
// things, and ultimately one of the repositories used by our
// use case needs the database. 

// NewUserUsecase returns a new instance of the user use case
func NewUserUsecase(database *sqlx.DB) usecases.UserUsecase {
	return user.NewDefaultUserUsecase(
		db.NewUserRepository(database),
	)
}

// NewUserAvatarUsecase returns a new instance of the user avatar use case
func NewUserAvatarUsecase(database *sqlx.DB) usecases.UserAvatarUsecase {
	return user.NewDefaultUserAvatarUsecase(
		db.NewUserRepository(database),
		s3.NewUserAvatarRepository(),
	)
}

// NewUserPort returns a new HTTP user port
func NewUserPort(database *sqlx.DB) http.UserPort {
	return http.NewDefaultUserPort(
		NewUserUsecase(database),
	)
}

// NewUserAvatarPort returns a new HTTP user avatar port
func NewUserAvatarPort(database *sqlx.DB) http.UserAvatarPort {
	return http.NewDefaultUserAvatarPort(
		NewUserAvatarUsecase(database),
	)
}
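To close the loop, here is a hedged sketch of how rest/server.go could wire everything together with echo. The Start signature, the route path, the PostgreSQL driver, and the "myapp" module path are assumptions for the example:

package rest

import (
	"github.com/jmoiron/sqlx"
	"github.com/labstack/echo/v4"
	_ "github.com/lib/pq" // assuming a PostgreSQL driver

	"myapp/pkg/context"
)

// Start connects to the database, builds the ports through the
// application context, registers the routes, and starts the web server.
func Start(databaseURL, listenAddr string) error {
	database, err := sqlx.Connect("postgres", databaseURL)
	if err != nil {
		return err
	}

	e := echo.New()

	// Each port is obtained from the application context, which takes
	// care of injecting the use cases and repositories underneath.
	userAvatarPort := context.NewUserAvatarPort(database)
	e.PUT("/users/:userID/avatar", userAvatarPort.CreateOrUpdate)

	return e.Start(listenAddr)
}

In the same spirit, grpc/server.go would build its services on top of the exact same use cases.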

Conclusion

We've just seen how to structure our Go applications to have reusable business logic, allowing us to separate concerns and make that logic available through different endpoints/ports.

There are countless ways to implement a clean architecture in a Go app; this is just one of them. And as mentioned in the introduction, we are ourselves constantly improving upon our previous iteration, tweaking things here and there to get the most maintainable and readable code possible, while staying as simple as possible.

Nonetheless, we hope that our documented attempt(s) will help you find your own "right" way to implement that architectural pattern!

All the pieces of code we refer to in this post, with the "glue" between them, can be found on GitHub at https://github.com/LukaszKokot/go-clean-template.

Lukasz Kokot
