Ziplip Scalable Enterprise Email Archiving

ZL's Unique Grid Architecture


Grid Computing and the Effects on Archive Scalability

Today's market is filled with a variety of email compliance, archival, storage management, knowledge management, and retention management solutions, some old, some new. All claim a wide range of functionalities for email, instant messaging, Bloomberg, files, and other data.

However, to be a successful enterprise email management solution in today's market, a solution must not only claim comprehensive capabilities but also deliver on several key criteria:

  • Scalability
  • Flexibility
  • Integration

The first and most difficult of those three criteria is Scalability.

The claim:  "SCALABLE? OF COURSE WE ARE SCALABLE."

If one were to accept the claims of most solutions in the marketplace, each one is "highly scalable" and has no problem meeting the demands of today's email environment. Any concerns about scalability arising from expanding email volumes and retention times would be dismissed as hallucination or alarmist fantasy.

Beyond making the claim, however, no vendor goes to any great length to describe exactly how its architecture delivers scalability.

So what exactly does it mean to be scalable?

A scalable solution should meet the following criteria:

  1. Designed originally as a server-side technology, not a desktop client
  2. Deploys equally well across a local or networked cluster environment
  3. Possesses no single point of failure
  4. Deploys as a single implementation, leveraging any number of servers or processors
  5. Deploys across one or more processors, independent of operating system or platform
  6. Deploys across any number of servers, blades, or mixed environment systems
  7. Supports multi-tenancy and any number of departments, policies, or hierarchies of control
  8. Search and indexing leverages a clustered network, supporting billions of records
  9. Stores data locally or across the network to any media type or device (on, near, and offline) applying retention and lifecycle management to all data
  10. Uses hybrid storage model enabling short, intermediate, and long term storage of data
  11. Provides built-in clustering, high availability, failover, DR, and multiple-geo operation
  12. Deploys with comprehensive profiling tools, woven throughout application and platform
  13. Proven in carrier deployments >500,000 mailboxes, millions of emails/day
  14. Refined through many customer-years of deployments, load testing, and production use

THE RIGHT ARCHITECTURE

To achieve the degree of scaling necessary to meet the demands of today's enterprise email systems, the right architecture must be designed from the ground up: one capable of supporting carrier-class loads over the regulatory life of the data.

Unlike most major email archival solutions, which were designed in the early 1990s for corporate email storage management (maximum 3,000 to 4,000 mailboxes), the ZL Technologies platform was engineered to support carrier-class email volumes (2M+ mailboxes).

Moreover, because the ZL Technologies platform was designed to support 24x7x365 carrier environments, the architecture also required high availability, out-of-the-box clustering, and distributed processing for maximum throughput and failover.

Fundamentals of Scalable Architecture

Strong                    Weak
------------------------  ---------------------------
100% Java                 C/C++
Modular                   Monolithic
Open Standard             Proprietary
Platform                  Point solution
Carrier                   Enterprise
Clustered Servers         Single, Dedicated Server
Web/Application Arch.     Client/Server Architecture
Platform Independent      Platform Dependent
Multi-Tenancy             Single Tenancy



Taken together, the above elements combine to form a powerful and scalable architecture for large organizations. But even with the freshest ingredients, an award-winning recipe can still be poorly executed. Similarly, much of the scalability and strength of the ZL Technologies architecture comes from how it has been implemented and developed over time.

HISTORY - THE MAILSTORE: BREAKING THE UNBREAKABLE

In 1999, the use of Java for server-side applications was unproven, uncommon, and somewhat radical. However, unlike Java applets and client-side Java applications, server-side Java applications appeared to ZL Technologies to possess all the elements of a highly scalable development environment, which would enable rapid development of network capable email applications.

As shown by the runaway success of WebSphere and the entire J2EE development community, the decision to use Java as its development environment would later prove prescient, and the first of many sound architectural decisions.

Another such decision was the use of a database to drive the messaging engine of the email management platform.

In 1999, ZL Technologies developed a metadata concept that is now common practice in the industry. The use of metadata dramatically increases the scalability of the system. Instead of storing raw data, the email management system places consistently sized metadata records into the database. Each metadata record in turn points to the actual, highly variable raw mail data (body and attachments), which is stored in a standard file system. Variability in the size of mail and files is to be expected, and a file system is well suited to storing it.

By combining the concept of metadata with an industrial database, ZL Technologies was able to implement a highly flexible and scalable mail storage architecture.
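The metadata approach described above can be illustrated with a minimal sketch. It is not ZL Technologies' implementation: SQLite stands in for the industrial database, and the table name `mail_meta`, its columns, and the helper functions are all hypothetical. The point is only that the database row stays small and uniform while the variable-size raw mail lives in the file system.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_store():
    """Create an in-memory metadata database (assumed schema, for illustration)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE mail_meta ("
        "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, "
        "size_bytes INTEGER, blob_path TEXT)"
    )
    return db


def store_message(db, blob_dir, sender, subject, raw):
    """Write raw mail to the file system and a small metadata row to the database."""
    digest = hashlib.sha256(raw).hexdigest()
    blob_path = Path(blob_dir) / digest
    blob_path.write_bytes(raw)  # variable-size body and attachments live on disk
    cur = db.execute(
        "INSERT INTO mail_meta (sender, subject, size_bytes, blob_path) "
        "VALUES (?, ?, ?, ?)",
        (sender, subject, len(raw), str(blob_path)),
    )
    db.commit()
    return cur.lastrowid


def load_message(db, message_id):
    """Follow the metadata pointer back to the raw mail data."""
    row = db.execute(
        "SELECT blob_path FROM mail_meta WHERE id = ?", (message_id,)
    ).fetchone()
    return Path(row[0]).read_bytes()
```

Queries, policy changes, and annotations touch only the small metadata rows; the raw message is read from disk only when it is actually retrieved.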

MTA MEETS GRID COMPUTING: circa 1999

At about the same time that ZL Technologies was overcoming the traditional limitations of email storage architectures, it was also redesigning the primary processing engine of email, the MTA (mail transfer agent).

Like many mail vendors in 1999 and even today, ZL Technologies built its original solution around the Sendmail MTA. Open-source and scalable, the Sendmail MTA seemed a good fit for ZL Technologies' carrier-class requirements. Why re-invent the wheel if it was already there?

It did not take long for ZL Technologies to realize the limitations of Sendmail, which curiously remains at the core of many existing solutions. ZL Technologies quickly decided that Sendmail could not be an effective long term solution.

The reason was simple: Sendmail had limited real-world flexibility.

Although it could be customized through its own scripting language, only administrators versed in the encyclopedic volumes devoted to Sendmail scripting could successfully manipulate its many cogs and levers. To use Sendmail to perform complex workflow and multiple processes such as spam, virus, indexing, content filtering, lexical scanning, and other sophisticated mail management tasks was impractical. As such, its utility was marginal at best and unsuitable for ZL Technologies' need for flexibility.

The other problem was that Sendmail was designed around old ideas and assumptions about email processing. An email is picked up, processed, and sent. Very simple. However, ZL Technologies realized that email was evolving rapidly and would need an engine capable of performing multiple processes and tasks quickly and in parallel. Moreover, the engine would have to be able to take advantage of resources across an entire network and not be limited to the processing power of a single server or CPU.

Ultimately, no matter how fast an engine could process mail on a single server, it would never be faster than multiple engines processing across the network on many servers simultaneously.

Such was the strength of the Java platform, and ZL Technologies intended to use it.

Consequently, in 2000, ZL Technologies designed a 100% Java-based MTA. It was in fact a general-purpose task management and processing engine, capable of processing mail, files, and virtually any other type of data rapidly and in parallel, using resources across the entire network.

As a result, ZL Technologies' mail management system represented one of the first Grid Computing architectures ever developed in the market.
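The idea of a general-purpose task engine that runs several processing steps per message and spreads the work across many workers can be sketched as follows. This is an illustration only, not the product's design: a local thread pool stands in for servers on a network, and the task functions and the `PIPELINE` list are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_spam(msg):
    """Hypothetical task: flag a message using a trivial keyword check."""
    msg["spam"] = "lottery" in msg["body"].lower()
    return msg


def index_terms(msg):
    """Hypothetical task: extract sorted unique terms for indexing."""
    msg["terms"] = sorted(set(msg["body"].lower().split()))
    return msg


# Assumed task chain; a real system would add virus scanning, filtering, etc.
PIPELINE = [scan_spam, index_terms]


def process(msg):
    """Run every task in the pipeline over one message."""
    for task in PIPELINE:
        msg = task(msg)
    return msg


def process_all(messages, workers=4):
    """Fan messages out across a pool of workers; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, messages))
```

Because each message is processed independently, adding workers (or, in a networked design, adding servers) increases throughput without changing the tasks themselves.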

ONE STORAGE DOES NOT FIT ALL

Due to cost considerations, most enterprises today utilize a variety of storage media and device types, ranging from online to near-line to off-line solutions. Depending upon the type of data and whether it requires short, intermediate, or long term storage, a different type of media or device is used.

Unlike most data, email has a complex and multi-stage life cycle. Consequently, at each stage, email requires a different type of storage solution. Unfortunately, most email management solutions are limited in their ability to function across multiple storage solutions simultaneously and are instead relegated to a single type of storage media.

The ZL Technologies platform is designed to support a complete range of storage types, which is ideal for complex organizations composed of different departments with different storage requirements and a wide variety of policies and associated regulations.

This is also ideal for long term storage of email data, because it allows firms to leverage low cost long term storage solutions such as tape or magneto-optical at appropriate times, freeing resources from high cost online storage devices like direct attached, NAS, or SANs.

ONE STORAGE ARCHITECTURE DOES NOT FIT ALL

In addition to the physical storage devices that are employed by the architecture, ZL Technologies is also designed around a hybrid storage architecture, which combines the strength and advantages of database technology with search index technology as email passes through its various life cycle stages.

Email goes through different stages, from when it is first captured into the system to when it is finally discarded after its retention date has passed. These stages can be described by the actions taken on email at a particular time:

  • Stage 1: Capture - Processing, Scanning, Workflow, Review, Compliance Assessment
  • Stage 2: Short term Retention - Fast Retrieval, Workflow, Retention Extension
  • Stage 3: Long term Retention - Slow Retrieval, Retention Extension
  • Stage 4: End of Life - Deletion
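The four stages listed above can be modeled as a simple classification. The sketch below is illustrative only: the source does not say how long the short term stage lasts, so the 90-day `short_term` window, the `processed` flag, and the function name are assumptions.

```python
from datetime import date, timedelta
from enum import Enum


class Stage(Enum):
    CAPTURE = 1
    SHORT_TERM = 2
    LONG_TERM = 3
    END_OF_LIFE = 4


def stage_for(captured, retain_until, today, processed=True,
              short_term=timedelta(days=90)):
    """Classify a message into a life cycle stage.

    short_term is an assumed window; the real boundary would be set by policy.
    """
    if today > retain_until:
        return Stage.END_OF_LIFE  # retention date has passed: eligible for deletion
    if not processed:
        return Stage.CAPTURE      # still being scanned, reviewed, and assessed
    if today - captured < short_term:
        return Stage.SHORT_TERM
    return Stage.LONG_TERM
```

A retention extension corresponds to moving `retain_until` later, which keeps a message out of the end of life stage.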


Depending upon the stage, email experiences a different level of action or processing. During Stage 1: Capture, many actions are taken upon email. These actions evaluate the contents of the email, assign policy and attach metadata or value to the email. All of these actions require extensive auditing, tracking, and workflow and are highly dynamic. Many of these dynamic actions continue into Stage 2: Short term Retention.

For stages with the most dynamic actions and email processing, the ZL Technologies Hybrid Storage Architecture stores email metadata using database technology. This provides the greatest flexibility for dynamic data, enabling fast changes and modifications to the email metadata. For example, compliance reviews, annotations to the email, classification of the email and re-classification of the email significantly change the email metadata. This is best accommodated by an industrial strength relational database.

(Note: At the same time that it performs the above processes, ZL Technologies scans every email using a full-text search engine and stores this static content data into a search index.)

However, as time goes on and data reach Stage 3: Long term Retention and Stage 4: End of Life, two things happen. First, the email data grows increasingly static. The actions taken upon email are fewer and are non-invasive. That is to say there are fewer changes and any changes that do occur to the email metadata are simpler. Second, and more important, the amount of stored data increases tremendously.

Whereas databases are ideal for storing moderately large volumes of data (hundreds of millions of records), particularly records with dynamic, ongoing changes, they are not strong at storing extremely large volumes of data (tens of billions of records). Long term records are also largely static in Stage 3 and Stage 4. As a result, they do not need the dynamic flexibility of a database.

For Stages 3 and 4, the architecture must focus on a different aspect of the email life cycle. The most important criteria are now scalability sufficient to store billions of records and the ability to retrieve email data quickly. Search engines are ideally suited for such a task. Consequently, when data enters Stage 3 and Stage 4, the ZL Technologies Hybrid Storage Architecture moves its metadata from its database component to its search engine component.

Using the search engine component of the architecture for long term storage enables tremendous scalability for long term data, along with rapid search and retrieval, without burdening the database.

By using both components, the ZL Technologies Hybrid Storage Architecture is able to accomplish the broad range of tasks necessary to manage email across its entire life cycle.
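As a rough illustration of the hybrid model, the sketch below keeps dynamic metadata in one tier and moves it to another once the record becomes static. Plain dictionaries stand in for the relational database and the search index, and the class and method names are hypothetical rather than part of the product.

```python
class HybridStore:
    """Toy model: mutable metadata in a 'database', static metadata in an 'index'."""

    def __init__(self):
        self.database = {}      # stand-in for a relational database (Stages 1-2)
        self.search_index = {}  # stand-in for a search index (Stages 3-4)

    def capture(self, msg_id, metadata):
        """Stage 1: new metadata starts in the database tier."""
        self.database[msg_id] = dict(metadata)

    def annotate(self, msg_id, key, value):
        """Dynamic changes (reviews, reclassification) happen in the database tier."""
        self.database[msg_id][key] = value

    def age_out(self, msg_id):
        """Stage 3: move a record's metadata from the database to the index tier."""
        self.search_index[msg_id] = self.database.pop(msg_id)

    def lookup(self, msg_id):
        """Return the tier name and metadata, wherever the record currently lives."""
        if msg_id in self.database:
            return "database", self.database[msg_id]
        return "search_index", self.search_index[msg_id]

    def expire(self, msg_id):
        """Stage 4: delete the record from whichever tier holds it."""
        self.database.pop(msg_id, None)
        self.search_index.pop(msg_id, None)
```

The database tier stays bounded by the volume of recent, actively changing mail, while the index tier absorbs the much larger body of static long term records.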

This enables enterprises to perform sophisticated tracking, auditing, and compliance review of emails as required by law, as well as retain email for the number of years (3, 5, 7, 20, etc.) required by regulators or their own corporate policy.

PROVEN PRODUCTION DEPLOYMENTS

ZL Technologies measures its success in scalability and as an overall platform by the number of ongoing deployments in customer-years. The number of proven, production deployments in terms of customer-years has given ZL Technologies unique insight into the development of a scalable platform.

No matter how much test data or how many simulated deployments are run against a system, there is no substitute for real customer deployments in production environments using real-world data.

ZL Technologies' first major customer deployment was for an international service provider, a carrier with 600,000 mailboxes. In 1999, ZL Technologies deployed and operated its own ASP, which today supports over 400,000 mailboxes. ZL Technologies' largest domestic deployment is with one of the largest Fortune 500 firms with approximately 150,000 employees. All deployments use the same ZL Technologies platform and architecture, leveraging the same design innovations and advantages.

But mailbox counts do not tell the entire story. In today's market, many businesses, particularly in certain industries, have very high email volumes relative to their total mailbox counts. For example, securities firms such as broker-dealers tend to produce ten times the amount of email that an average firm would with the same number of mailboxes. So one need not have the most mailboxes to face the most scaling hurdles.

Supporting email volumes at this level in production environments is not easy. Even with all of its precautions, ZL Technologies experienced growing pains in its early stages. In fact, growing pains are a guaranteed fact of life when it comes to scaling. The question is not whether you will experience growing pains, but whether the system you are using is designed to accommodate and overcome scalability issues. Unless a solution is designed with comprehensive profiling, complete code control, and the proper architecture, then no matter how smoothly it runs today, breakage is likely to bring very painful, very costly outages that can take many weeks to recover from. For most of ZL Technologies' partner-customers, even a few hours of outage is unacceptable.

By focusing on innovative architectural designs, leveraging its existing distributed processing architecture, and focusing on a philosophy of scalability profiling and processes, ZL Technologies' email management platform has proven itself to effectively meet the growing volume of enterprise email data and regulatory retention requirements of even the largest global firms.

 