The future of storage according to Phison SSD


Phison already offers on-the-fly encryption on our Opal and FIPS 140-2 SSD products. As mentioned above, this works because it is a capability that can operate on data that is already going to the SSD. Compression is easy to accommodate on the SSD and aligns with the streaming model concept, but it provides limited benefit given that most bulk data (photos, video or music) is already fully compressed. There are large data sets that can benefit from compression, but the use-case is relatively uncommon, so it tends to be delegated to dedicated server appliances.
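The point about already-compressed media can be illustrated with a small Python sketch. It is not a measurement of any Phison product: random bytes stand in for compressed photos or video (an assumption, since both have little redundancy left), a repeated log line stands in for a compressible data set, and the helper name `compression_ratio` is hypothetical.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Assumption: random bytes approximate already-compressed media,
# because neither has exploitable redundancy left.
media_like = os.urandom(1 << 20)

# Assumption: a highly repetitive log stands in for a compressible data set.
log_like = b"2024-01-01 INFO request served in 12 ms\n" * 25_000

print(f"media-like ratio: {compression_ratio(media_like):.3f}")
print(f"log-like ratio:   {compression_ratio(log_like):.3f}")
```

The media-like buffer comes out at roughly its original size, while the repetitive buffer shrinks to a small fraction of it, which is why in-line compression only pays off for the less common compressible workloads.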
Dedupe breaks the streaming model for two reasons:
1) It requires a huge amount of memory to track the hashes for each sector.
2) SSDs are already fully tasked in datacenter environments, so any work spent searching is taken away from host IO.
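To put a rough number on the memory point, here is a back-of-the-envelope sketch in Python. All figures are assumptions chosen for illustration: a hypothetical 4 TB drive, 4 KiB sectors, one 32-byte (SHA-256 sized) hash per sector, and no index overhead. The DRAM comparison uses the roughly 1000:1 NAND to DDR ratio this article cites for enterprise SSDs.

```python
def dedupe_index_bytes(capacity_bytes: int,
                       sector_bytes: int = 4096,
                       hash_bytes: int = 32) -> int:
    """Memory needed to keep one hash per sector, ignoring index overhead."""
    sectors = capacity_bytes // sector_bytes
    return sectors * hash_bytes


capacity = 4 * 10**12            # hypothetical 4 TB drive (assumption)
index = dedupe_index_bytes(capacity)
typical_dram = capacity // 1000  # DRAM implied by a 1000:1 NAND to DDR ratio

print(f"hash index:   {index / 10**9:.2f} GB")        # 31.25 GB
print(f"typical DRAM: {typical_dram / 10**9:.2f} GB")  # 4.00 GB
print(f"index is {index / typical_dram:.1f}x the DRAM on the drive")
```

Under these assumptions the hash index alone is several times larger than the DRAM the drive would normally carry, before counting any lookup structure on top of it.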
The only real benefit of having the SSD perform the search is a slight reduction in PCIe bus transfer time and a reduced load on the host CPU. Conversely, the SSD's cost has to go up due to higher computational requirements and additional DRAM, and its active power necessarily goes up as well. Dedupe is better implemented using spare system resources, particularly overnight when people are sleeping, instead of adding 10-20% to the cost of the SSD.
One type of computational hybrid device exists today and is very successful: the Smart NIC. It combines a high-speed NIC (typ. 10 GB/s) with a powerful CPU or FPGA. Though this combination works for a NIC, it does not work as well for storage. The reason is fairly straightforward: the Smart part of the NIC is processing data that is already passing through the NIC to the host. The Smart NIC works well when it can process data as it streams through, or when it is capable of servicing a request by directly accessing resources within the chassis.
The typical value proposition for Computational Storage is presented as follows: the SSD is closer to the data, it frees up bus bandwidth and it offloads the host CPU. At face value Computational Storage appears to be an easy sell, but it hasn’t turned out that way.
First, the SSD today is already using 100% of its resources and power budget to service its primary function. In many cases, high-density enterprise SSDs have to limit performance to avoid exceeding their power or cooling budget. Second, SSDs typically use small CPU cores that are nowhere near what the host CPU or a GPU can do. Third, this experiment was already tried before Computational Storage was a buzzword. One company attempted to combine a GPU and an SSD, but the solution ended up degrading both technologies. To meet the GPU requirements, the SSD had to run very fast, which added a significant heat load to the GPU. The GPU runs much hotter than an SSD and created substantial retention stress on the NAND. Lastly, an SSD is a consumable item with finite write endurance, whereas a GPU can run indefinitely until it becomes obsolete.
Taking a different approach, we could add a more powerful CPU directly on the SSD. Then we run into the RAM problem. Today most enterprise SSDs maintain a 1000:1 NAND to DDR ratio. The SSD only needs to pull a few bytes for every 4K LBA translation, so the DDR bandwidth is relatively low. This means the SSD can use slower-grade DRAM, which lowers the cost of the entire module. Adding a larger guest CPU to the SSD along with more DDR for applications decreases the power available for the SSD’s primary role of providing IO to the main host. It also increases the SSD cost, but does not provide a proportional gain in compute power.
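One way to see where a ratio near 1000:1 can come from is to size a flat logical-to-physical mapping table. The Python sketch below assumes one 4-byte entry per 4 KiB LBA and a hypothetical 4 TB drive; the article only says "a few bytes", so the entry size and the function name `mapping_table_bytes` are illustrative assumptions, not a description of any particular controller.

```python
def mapping_table_bytes(capacity_bytes: int,
                        lba_bytes: int = 4096,
                        entry_bytes: int = 4) -> int:
    """Size of a flat table holding one translation entry per LBA."""
    entries = capacity_bytes // lba_bytes
    return entries * entry_bytes


capacity = 4 * 10**12                  # hypothetical 4 TB drive (assumption)
table = mapping_table_bytes(capacity)  # DRAM needed for the table
ratio = capacity / table               # NAND bytes per DRAM byte

print(f"mapping table: {table / 10**9:.2f} GB")
print(f"NAND to DRAM ratio: {ratio:.0f}:1")  # 1024:1
```

With these assumed sizes the ratio is 4096 / 4 = 1024, which is close to the 1000:1 figure. Each host IO touches only one small entry, which is consistent with the low DRAM bandwidth requirement described above.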
Then there is the problem of how storage is deployed today. Data is usually aggregated into multi-unit RAID sets, so no one SSD will ever see the full data set. We could change the way storage is used, ensuring each SSD always sees complete data elements and using full replication to ensure redundancy. This is not likely to take hold because this model does a poor job of sharing storage bandwidth if one SSD contains more data than is currently needed. RAID stripes address this problem by staggering the accesses so that each subsequent client starts shortly after the current client. We could extend the model where each SSD has a full copy of a data set by implementing replication across multiple units, but then we have to add a lookup and load-sharing mechanism. Duplication also has a much higher storage footprint than simple RAID5 or RAID6. Simply put, the way we use storage today is cost effective, easy to deploy and works well for most scenarios. Completely changing the storage infrastructure for what amounts to adding a few server CPUs is hard to justify.
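The two claims in this paragraph, that striping scatters an object across drives and that replication costs more capacity than parity RAID, can be sketched in a few lines of Python. The layout is a simplified RAID 0 style round-robin without parity rotation, and the chunk size, drive counts and function names are assumptions chosen only for illustration.

```python
def stripe_layout(length_bytes: int, chunk_bytes: int, drives: int) -> list[int]:
    """Bytes of one contiguous object that land on each drive (no parity)."""
    per_drive = [0] * drives
    offset = 0
    while offset < length_bytes:
        chunk_index = offset // chunk_bytes
        # Take the rest of the current chunk, or whatever is left of the object.
        take = min(chunk_bytes - offset % chunk_bytes, length_bytes - offset)
        per_drive[chunk_index % drives] += take
        offset += take
    return per_drive


def usable_fraction_raid(drives: int, parity_drives: int) -> float:
    """Fraction of raw capacity left for data in a parity RAID set."""
    return (drives - parity_drives) / drives


def usable_fraction_replication(copies: int) -> float:
    """Fraction of raw capacity left for data with full replication."""
    return 1 / copies


# A 1 MiB object striped in 64 KiB chunks over 4 drives: each sees a quarter.
print(stripe_layout(1 << 20, 64 << 10, 4))  # [262144, 262144, 262144, 262144]

print(usable_fraction_raid(8, 1))           # RAID5 on 8 drives: 0.875
print(usable_fraction_raid(8, 2))           # RAID6 on 8 drives: 0.75
print(usable_fraction_replication(2))       # two full copies:   0.5
```

In this sketch no single drive holds the whole object, so an on-drive processor would only ever see fragments, while switching to full copies halves the usable capacity compared with losing one or two drives' worth to parity.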
Despite the downsides of general-purpose Computational Storage, there are specific cases where it does make sense. These occur when the storage use-case mirrors the winning case for the Smart NIC. That is to say, the SSD only has to process the data once as it moves through the device. We can associate encryption and compression with Computational Storage, but that’s a stretch. It is more accurate to define these two use-cases as in-line or streaming data processing using a very simple algorithm.
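As a rough picture of what "process the data once as it moves through" means, the Python sketch below compresses a stream chunk by chunk using the standard library's `zlib.compressobj`. It is a host-side analogy rather than SSD firmware, and the helper names `stream_compress` and `chunked` as well as the 4 KiB chunk size are assumptions.

```python
import zlib
from typing import Iterable, Iterator


def stream_compress(chunks: Iterable[bytes], level: int = 6) -> Iterator[bytes]:
    """Compress chunks in a single pass, keeping only the compressor's state."""
    compressor = zlib.compressobj(level)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer input and emit nothing yet
            yield out
    yield compressor.flush()  # emit whatever is still buffered


def chunked(data: bytes, size: int) -> Iterator[bytes]:
    """Split a buffer into fixed-size pieces, like writes arriving at a device."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


payload = b"sensor=42;status=ok\n" * 10_000
compressed = b"".join(stream_compress(chunked(payload, 4096)))

# Each chunk was visited exactly once, and the result still round-trips.
assert zlib.decompress(compressed) == payload
print(len(payload), "->", len(compressed), "bytes")
```

The function never looks back at earlier chunks or needs the whole data set in memory, which is the property that makes in-line encryption and compression a natural fit for the data path.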
Phison and one of our customers have found a Computational Storage application that is well suited to the SSD. It does not require a large amount of memory or CPU power and does not interfere with the primary purpose of the SSD, which is storage IO. We are developing a security product that uses machine learning to look for signs that the data is being attacked. It can identify ransomware and other unauthorized activities with no measurable impact on SSD performance.