High availability is a prerequisite for edge computing. This article explains what availability is, how you can increase it, and why high availability is essential for edge computing.
What is availability?
Have you ever had trouble with your PC? When that happens, you cannot use the PC for its original purpose until it is repaired. A PC that spends little of its time in a usable state is said to have “low availability”. Conversely, a system that spends most of its time in operation is called “highly available”, either because it rarely breaks and keeps running stably, or because it can be repaired immediately when it does fail. In other words, you can think of availability as how much of the time you can use your system when you want to use it.
Note that there are terms with similar meanings, such as “reliability” and “maintainability”. “Reliability” is primarily a measure of how hard a system is to break, that is, how long it can run without trouble once it is started. Reliability is generally expressed on a time basis using an indicator called the mean time between failures (MTBF). The higher the MTBF, the longer the interval between failures, and the more reliable the system.
“Maintainability”, by contrast, represents ease of maintenance and repair. In other words, how long does it take to restore the system after trouble occurs? This is also usually expressed on a time basis, using an indicator called the mean time to recovery (MTTR). Unlike MTBF, a lower MTTR is better: the shorter the time to recovery, the more maintainable the system.
Availability is also commonly called the “operating rate”, and it is calculated as MTBF divided by the sum of MTBF and MTTR. In other words, improving reliability (a longer MTBF) or maintainability (a shorter MTTR) also improves availability.
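As a rough illustration, availability is commonly computed as MTBF / (MTBF + MTTR). The sketch below assumes both values are given in hours; the function names and the sample figures (1,000 hours between failures, 10 hours to recover) are hypothetical and chosen only to show the arithmetic.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(avail: float, hours_per_year: float = 24 * 365) -> float:
    """Expected downtime per year for a system that runs around the clock."""
    return (1.0 - avail) * hours_per_year


# Hypothetical figures, for illustration only.
base = availability(mtbf_hours=1000, mttr_hours=10)
faster_repair = availability(mtbf_hours=1000, mttr_hours=5)

print(f"baseline availability:      {base:.4f}")           # 0.9901
print(f"with MTTR halved:           {faster_repair:.4f}")  # 0.9950
print(f"baseline downtime per year: {downtime_hours_per_year(base):.1f} h")  # 86.7 h
```

With these assumed numbers, halving the repair time raises availability without any change in how often the system fails, which is why maintainability matters as much as reliability.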
The availability of any system, including production systems, should be as high as cost allows, so various measures have been devised to increase it. Let’s look at some specific ones.
How can I increase availability?
- Provide redundancy
A common way to increase availability is to give the system “redundancy”. One example of redundancy is having two identical systems. If you run the two systems in turn, you can continue on the other if one stops. Only one system is running at a time, but being able to keep using the system at all times is a big advantage. As a familiar example, airliners always have two or more engines to ensure redundancy and availability.
- In the case of information systems
One way is to use stable versions of the OS and software that have a proven track record. Vendors of such software respond quickly to problems, and even if there is a bug or trouble, it is often resolved immediately, so relatively stable operation can be expected.
- In the case of production systems
Improving serviceability is another way to increase availability. These days, sensors are often installed in manufacturing equipment with the aim of reducing maintenance time and predicting failures. By detecting abnormal behavior of products and equipment with sensors, processing the data on edge servers, and quickly notifying repair personnel, it is possible to shorten the time between a failure and the start of repair (remote monitoring). In addition, if the edge server has learning capability, it can predict failures. This is called predictive maintenance, and edge computing is considered a technology that contributes greatly to realizing it.
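To put a number on the redundancy approach described above, here is a minimal sketch. It assumes that the units fail independently of each other and that the service stays up as long as at least one unit is working; real systems may not satisfy either assumption (for example, a shared power supply). The function name and the 0.99 per-unit availability are hypothetical.

```python
def parallel_availability(unit_availability: float, units: int) -> float:
    """Availability of `units` redundant copies, assuming independent
    failures and that one working copy is enough to keep running."""
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    # The whole system is down only when every unit is down at once.
    return 1.0 - (1.0 - unit_availability) ** units


# Hypothetical per-unit availability of 0.99.
for n in (1, 2, 3):
    print(n, f"{parallel_availability(0.99, n):.6f}")
# 1 0.990000
# 2 0.999900
# 3 0.999999
```

Under these assumptions, a second identical unit cuts the expected downtime by a factor of 100, which is the effect the two-system example relies on.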
The need for high availability in edge computing
Based on the basics of availability covered so far, let’s consider a semiconductor production line as a concrete setting where high availability is required. The main enemies of a semiconductor production line are dust and dirt. The reason is that semiconductors are designed at the nanometer scale (a nanometer is one billionth of a meter), and manufacturing them naturally involves extremely fine processing. For this reason, even dust that is invisible to the human eye can have a big impact in the nanometer world.
Dust travels through the air, so it spreads when the airflow is disturbed. For this reason, the airflow must be controlled at all times, and in clean rooms the air always flows from top to bottom, which is called downflow. Furthermore, since even opening and closing a door disturbs the air, clean room entrances and exits use double doors to minimize the effect.
However, humans are the element that breaks these strict mechanisms. Humans inevitably move, which disturbs the air. In addition, sweat and breath can contaminate semiconductor wafers. These human-borne contaminants are inevitably produced as long as people are working, and cannot be stopped.
For this reason, clean rooms are often run unattended, keeping people out as much as possible except for line start-up and maintenance. Remote monitoring with edge computing, as mentioned earlier, is considered very effective in these environments.
Another reason high availability is required is that the total price of the line is extremely high. Because semiconductor manufacturing involves very fine processing, precision close to the limit is often required, and it is not uncommon for a line to cost enormous sums, in some cases more than 100 billion yen. In addition, because demand in this market swings widely, it is common to make the capital investment and launch a production line all at once. To recover the capital investment as soon as possible, the line operates 24 hours a day, 365 days a year. In such an environment, it is natural that manufacturing equipment is required to be highly available.
What if you deploy low-availability edge computing on these highly available production lines? The function required of an edge server is not only to collect product data but also to monitor manufacturing equipment for sudden outages and defects and notify the department in charge. In other words, it must constantly watch for defects and troubles that can occur at any time. If edge servers that need to monitor at all times stop frequently, they cannot fulfill their essential role of catching manufacturing equipment troubles. In other words, high availability is a fundamental requirement for edge servers.
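The monitor-and-notify role described above can be sketched as follows. This is only an illustration under stated assumptions, not how any particular edge product works: the class name, the rolling-window size, the three-sigma threshold, and the `notify` callback are all hypothetical, and a real deployment would read from actual sensors and use a proper alerting channel.

```python
from collections import deque
from statistics import mean, stdev


class EquipmentMonitor:
    """Flags sensor readings that deviate strongly from recent normal values."""

    def __init__(self, window=20, threshold=3.0, min_samples=5, notify=print):
        self.readings = deque(maxlen=window)  # recent normal readings only
        self.threshold = threshold            # allowed deviation, in standard deviations
        self.min_samples = min_samples        # readings needed before judging
        self.notify = notify                  # hypothetical alert hook

    def observe(self, value: float) -> bool:
        """Process one reading; return True if it was flagged as abnormal."""
        if len(self.readings) >= self.min_samples:
            mu = mean(self.readings)
            sigma = stdev(self.readings)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                self.notify(f"abnormal reading {value:.2f} (recent mean {mu:.2f})")
                return True  # keep abnormal values out of the baseline
        self.readings.append(value)
        return False


# Hypothetical vibration readings followed by a sudden spike.
alerts = []
monitor = EquipmentMonitor(notify=alerts.append)
for reading in [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 5.0]:
    monitor.observe(reading)
print(len(alerts))  # 1
```

A monitor like this only helps while the process running it is alive: any reading that arrives while the edge server is down is never checked, which is the reason the edge server itself needs high availability.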
As far as cost allows, you should choose highly available edge servers.
These days, the sales life cycle of many products has become shorter. As a result, quickly launching production lines and recovering the capital investment all at once is no longer unusual in manufacturing. This also means that unattended operation and 24/7 operation are becoming more and more commonplace. To achieve the high availability required in these environments, you might want to deploy edge computing in your production systems. However, the edge server itself must also be highly available to match the production system. Therefore, as far as cost allows, it is best to choose highly available edge servers.