5. File Storage

Date: 2025-01-24

Status

Proposed

Context

Files must be stored reliably and be accessible from a highly concurrent microservice environment.

Even if the project is deployed in the cloud, it should stay independent of the cloud provider, for example so that it can migrate to another cloud if that becomes worthwhile at some point.

Decision

MinIO.

Consequences

The project gets robust, S3-style file storage even if it is not deployed in the cloud.

Options

MinIO

MinIO is a popular open-source, S3-compatible object storage server that can be self-hosted.

Pros and Cons

Pros
  • Supports replication.
  • Supports data encryption at rest.
  • If the project is hosted in the cloud: unlike the provider's native storage services, a self-hosted MinIO carries no usage fees of its own.
  • Exposes the same REST API as AWS S3, so the standard AWS S3 Java client can be used, which shortens the learning curve.
Cons
  • The REST API adds latency compared to direct file access (the same is true of S3, yet in the AWS cloud everyone uses S3 and nobody complains 🤷).
  • According to some reviews, MinIO slows down significantly under high load when it runs as a Docker container. In that case it should run without Docker on a separate VM.

Docker volume

A shared disk can be mounted as a volume into every Docker container that needs file access.

Pros and Cons

Pros
  • Maximum file access speed, which may be necessary for mission-critical tasks.
Cons
  • In a microservice environment, simultaneous reads and writes of the same file are risky: unlike a database, a shared volume does not coordinate access between services with locks or transactions by default.
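
One common way to reduce the read/write risk on a shared volume is to write each file under a temporary name and then rename it into place, so that readers see either the old version or the new one, never a half-written file. The sketch below illustrates that idea only; the class name `AtomicFileWriter`, the method `writeAtomically`, and the `.upload-` prefix are hypothetical. It assumes the volume's file system supports atomic renames and that a rename replaces an existing target, which Java leaves implementation-specific for `ATOMIC_MOVE`. It does not stop two writers from overwriting each other's results.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: publishes a file on a shared volume via write-then-rename. */
final class AtomicFileWriter {
    private AtomicFileWriter() {}

    static void writeAtomically(Path target, byte[] content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        // The temp file lives in the target directory so the rename stays on one file system.
        Path tmp = Files.createTempFile(dir, ".upload-", ".tmp");
        try {
            Files.write(tmp, content);
            // Assumption: the file system supports atomic moves; otherwise
            // AtomicMoveNotSupportedException is thrown and nothing is published.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the temp file after a failure.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("shared-volume-demo");
        Path report = dir.resolve("report.txt");
        writeAtomically(report, "version 1".getBytes(StandardCharsets.UTF_8));
        writeAtomically(report, "version 2".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(report), StandardCharsets.UTF_8));
    }
}
```

Readers need no changes with this approach, but lost updates between competing writers would still require coordination at a higher level, for example a single owning service per file.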

Cloud Provider Service

If the project is deployed in the cloud, a native solution is available, such as AWS S3.

Pros and Cons

Pros
  • Easy integration with other cloud resources.
Cons
  • Usage fees, charged either from the start or once a free-tier threshold is exceeded.
  • The project becomes dependent on the cloud provider (vendor lock-in). This can be partially mitigated in the source code by introducing a "File Storage" abstraction (a Java interface) that allows multiple implementations. This approach should be preferred for all cloud-native services, because the project may need to move to another cloud in the future. Such code is also more reusable and easier to test.
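
A minimal sketch of what such a "File Storage" abstraction could look like is shown here. The interface name `FileStorage`, its three methods, and both implementations are assumptions made for illustration, not an agreed design. An S3- or MinIO-backed implementation would implement the same interface; it is left out so the example needs no external dependencies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical storage abstraction; names and methods are illustrative. */
interface FileStorage {
    void put(String key, byte[] content);
    Optional<byte[]> get(String key);
    boolean delete(String key);
}

/** In-memory implementation, convenient for unit tests. */
final class InMemoryFileStorage implements FileStorage {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    @Override public void put(String key, byte[] content) { files.put(key, content.clone()); }

    @Override public Optional<byte[]> get(String key) {
        byte[] content = files.get(key);
        return content == null ? Optional.empty() : Optional.of(content.clone());
    }

    @Override public boolean delete(String key) { return files.remove(key) != null; }
}

/** Local-directory implementation, e.g. for development without any object store. */
final class LocalDirFileStorage implements FileStorage {
    private final Path root;

    LocalDirFileStorage(Path root) { this.root = root.toAbsolutePath().normalize(); }

    private Path resolve(String key) {
        Path path = root.resolve(key).normalize();
        // Reject keys such as "../x" that would point outside the storage root.
        if (!path.startsWith(root)) throw new IllegalArgumentException("Key escapes storage root: " + key);
        return path;
    }

    @Override public void put(String key, byte[] content) {
        Path path = resolve(key);
        try {
            Files.createDirectories(path.getParent());
            Files.write(path, content);
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    @Override public Optional<byte[]> get(String key) {
        try {
            return Optional.of(Files.readAllBytes(resolve(key)));
        } catch (NoSuchFileException e) {
            return Optional.empty();
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    @Override public boolean delete(String key) {
        try {
            return Files.deleteIfExists(resolve(key));
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }
}

final class FileStorageDemo {
    public static void main(String[] args) throws IOException {
        // Callers depend only on the interface, so the backend can be swapped freely.
        FileStorage[] backends = {
            new InMemoryFileStorage(),
            new LocalDirFileStorage(Files.createTempDirectory("file-storage-demo"))
        };
        for (FileStorage storage : backends) {
            storage.put("docs/hello.txt", "hello".getBytes(StandardCharsets.UTF_8));
            String text = storage.get("docs/hello.txt")
                    .map(bytes -> new String(bytes, StandardCharsets.UTF_8))
                    .orElse("<missing>");
            System.out.println(storage.getClass().getSimpleName() + ": " + text);
        }
    }
}
```

Because business code sees only the interface, moving between MinIO, a cloud provider's service, and a local directory would come down to choosing a different implementation at wiring time, and tests can run against the in-memory variant.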