Quobyte is a new alternative to parallel file systems or software-defined storage for archiving.  Here we outline the current design highlights for building a high-performance cluster with Quobyte.

Why Quobyte?

Quobyte’s value is that it is a new-generation storage system.  It offers unified storage, simplified operations, and a complete set of management tools.  It is a software-defined storage solution that separates hardware from software, making data independent of the individual storage devices.  Decoupling the logical and physical levels simplifies infrastructure management and lets hardware resources be used much more flexibly.  It is virtualization-ready and scales per disk and per node, with as dense a footprint as the latest commodity servers allow.

Because it is a full-featured POSIX file system, all UNIX applications can share the same storage – databases (like MySQL), email (like Dovecot), VMs, etc.  Interfaces for Hadoop, S3, NFS, and OpenStack are included.  It supports parallel, sequential, and small-block random I/O workloads with near-hardware performance, which keeps virtualization costs low.  Because all data is available on any server or VM (shared storage), no extra NFS or CIFS servers are needed.  Management and reconfiguration run transparently in the background.

Operations are simplified because Quobyte can manage all storage devices with one system.  All applications share the same resources, which enables oversubscription and dynamic reassignment and improves utilization.  It decouples storage from host hardware and devices, with full split-brain-safe fault tolerance, fast automatic failover, and end-to-end checksums.  Virtualization features yield low-touch operations because individual hardware failures are not relevant to the overall operation.  If needed, broken servers and hard disks can be replaced when convenient (even days or weeks later, given enough spare capacity), and any lost or corrupt replica is regenerated automatically, quickly, and in parallel.  There are simply no strings attached: switch off any server at any time!  Because Quobyte runs on shared Linux servers, no special hardware or hardware redundancy features are required, and there is no RAID setup because plain formatted disks are used.

Along with the system comes a complete set of management tools:

  • Full system scrub that verifies checksums
  • Background rebalancer
  • Backup interface that integrates with any file-based backup solution
  • Convenient access via web console
  • Integrated monitoring and alerting
  • Fully programmable via JSON API and command line tools
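To make the idea of a checksum-verifying scrub and automatic replica regeneration concrete, here is a minimal conceptual sketch.  It is not Quobyte's implementation or API: the function names, the use of CRC32, and the in-memory list of replicas are all assumptions chosen purely for illustration.

```python
import zlib


def checksum(data: bytes) -> int:
    """CRC32 of a block; a stand-in for whatever checksum a real system uses."""
    return zlib.crc32(data)


def scrub(replicas: list[bytearray], expected: int) -> list[int]:
    """Verify each replica against the stored checksum and repair bad ones.

    Returns the indices of the replicas that had to be regenerated.
    (Hypothetical helper, not part of any Quobyte interface.)
    """
    good = [i for i, r in enumerate(replicas) if checksum(bytes(r)) == expected]
    if not good:
        raise RuntimeError("no healthy replica left to repair from")

    source = replicas[good[0]]
    repaired = []
    for i in range(len(replicas)):
        if i not in good:
            # Regenerate the corrupt replica from a verified copy.
            replicas[i] = bytearray(source)
            repaired.append(i)
    return repaired


# Three replicas of one block; the checksum is recorded at write time.
block = b"hello quobyte"
stored = checksum(block)
replicas = [bytearray(block) for _ in range(3)]

replicas[1][0] ^= 0xFF  # simulate silent corruption on one device
repaired = scrub(replicas, stored)
print("repaired replicas:", repaired)
```

The point of the sketch is that the checksum stored at write time, rather than any single device, decides which copy is trusted, which is why a corrupt replica can be rebuilt from any healthy one.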

To scale out Quobyte appliances, you can mix and match nodes.  For performance: 2 GB/s, NVMe and SSD disk options with NVMe/SSD for metadata.  For general purpose: SAS disk options with SSD for metadata.  For archiving: NL-SAS disk options in dense chassis.
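The node options above can be written down as a small lookup table, which may help when planning a mixed cluster.  This is only a sketch: the `NodeTier` class, the tier keys, and the `tier_for` helper are hypothetical names, and the media lists simply restate the options listed in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeTier:
    """Media choices for one class of node (illustrative only)."""
    data_media: tuple[str, ...]
    metadata_media: tuple[str, ...]


# Tier names are assumptions; media types are taken from the text above.
TIERS = {
    "performance": NodeTier(("NVMe", "SSD"), ("NVMe", "SSD")),
    "general": NodeTier(("SAS",), ("SSD",)),
    # The article does not state a metadata medium for archive nodes.
    "archive": NodeTier(("NL-SAS",), ()),
}


def tier_for(workload: str) -> NodeTier:
    """Return the node tier for a workload class, or raise ValueError."""
    try:
        return TIERS[workload]
    except KeyError:
        raise ValueError(f"unknown workload class: {workload!r}") from None


print(tier_for("general"))
```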

Here is an example of the underlying hardware architecture – hybrid:

Here is an example of the underlying hardware architecture – All-flash:

DST has provided expertise, problem resolution, and architecture design for some of the largest private and public HPC clusters in the world.  Need more insight?  We can help sort out your options in parallel file systems or software-defined storage.  Contact DST at dst@datainscience.com.
