OS and Hard Drives

Computer operating systems should be divested of direct control of their mass storage devices. Their control should be indirect.

During the early stages of computer and mass storage development, storage devices were small in capacity and quite expensive. In comparison to the computers, they were quite complex. Now the situation has changed. Storage devices have increased in capacity by several orders of magnitude, to levels unthinkable in the early days of computers. Relative to modern computers, and with the complexity of the controllers invested in but a few components (microprocessors), the mass storage devices of today are much simpler than before (from the user's perspective, that is).

Yet we continue to drive them with extensive amounts of code from within the operating system of each computer. I propose that all low level control of the mass storage devices be extracted from the Operating Systems and moved onto the storage device.

Rather than searching the disk structure for a given file, the OS (Operating System) should send a message to the mass storage device that effectively says, for example:
Please open file C:\boot_loader.exe for reading and put the contents in {this address}.
The disk controller would find the file, open it, and DMA the data to the location specified. Under this scheme, the OS does not know or care where the file is located; it just asks for it and gets it.
Further, the OS no longer cares how much capacity the mass storage device contains. It has no use for platter count, head count, sector size, CRC, bad sectors, or any of the myriad of details that it currently deals with.
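
A rough sketch of what such a request might look like follows. Everything in it is an assumption made for illustration: the names ReadRequest, StorageController, and handle_read are hypothetical, a dictionary stands in for the drive's private on-disk layout, and a bytearray stands in for host memory that a real controller would reach by DMA.

```python
from dataclasses import dataclass


@dataclass
class ReadRequest:
    """Hypothetical message the OS sends to the drive."""
    path: str          # file name exactly as the OS knows it
    dest_address: int  # offset in host memory where the data should land


class StorageController:
    """Stand-in for the drive's on-board controller."""

    def __init__(self, files):
        # A real drive would keep its own on-disk layout; a dict hides it here.
        self._files = dict(files)

    def handle_read(self, request, host_memory):
        """Copy the file into host memory (simulated DMA); return byte count."""
        data = self._files[request.path]
        end = request.dest_address + len(data)
        if end > len(host_memory):
            raise ValueError("file does not fit at the requested address")
        host_memory[request.dest_address:end] = data
        return len(data)


# The OS side: name the file and the destination, nothing else.
host_memory = bytearray(64)
drive = StorageController({r"C:\boot_loader.exe": b"BOOT"})
count = drive.handle_read(ReadRequest(r"C:\boot_loader.exe", 16), host_memory)
```

Note that the request carries no platter, head, or sector information. Only the controller knows where the bytes actually live.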

The mass storage controller can offload this computing burden from the OS.

But wait, there’s more!


Once the mass storage device handles all the details of managing the data, it can do much more. If the OS asks for file C:\user\mydata\some_file.txt the controller can translate the pathname into any format it wants. The OS no longer cares. When I connect the mass storage device to another computer (logically, virtually, or physically), the controller adapts. If the OS now asks for C:user/mydata/some_file.txt, it is trivial for the controller to translate that name into whatever internal format it uses, open the file, and provide the contents. Based on the format of the pathname, the controller can send the data to the requestor with or without CR or LF or TAB or any other character that the OS might or might not be expecting.
Under this scheme, the OS is relieved of a huge amount of processing. Under this scheme, mass storage devices can be connected to different computers and will work just fine. Format compatibility problems disappear.
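
To make the translation idea concrete, here is a minimal sketch. It assumes the controller stores each path as a tuple of components and each text file as lines without terminators. The function names and the rule of guessing the line ending from the separator style are my own illustrative choices, not an existing standard.

```python
def normalize_path(path):
    """Turn an OS-specific path into a tuple of components."""
    # Drop a drive prefix such as "C:" if one is present.
    if len(path) >= 2 and path[1] == ":":
        path = path[2:]
    # Treat both separator styles alike and ignore empty components.
    parts = path.replace("\\", "/").split("/")
    return tuple(p for p in parts if p)


def line_ending_for(path):
    """Guess the line ending the requester expects from its path style."""
    # Assumption: backslash paths imply CR LF, anything else implies LF.
    return "\r\n" if "\\" in path else "\n"


def deliver_text(stored_lines, requested_path):
    """Join stored lines with the line ending suited to the requester."""
    ending = line_ending_for(requested_path)
    return "".join(line + ending for line in stored_lines)


# Both spellings from the text resolve to the same internal name.
same = (normalize_path(r"C:\user\mydata\some_file.txt")
        == normalize_path("C:user/mydata/some_file.txt"))
```

With this in place, the same stored file can be handed to one host with CR LF line endings and to another with LF only, and neither host needs to know the difference.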

Indeed, there is no need for the OS to format a drive any more. Tell the drive to format itself. It can do so much faster and more efficiently all by itself. Every drive can have a true erase function. And if you are in a hurry, any number of drives can format or erase themselves at the same time. As the OS no longer has any concern with the low level, or any level, of disk formatting, that is a task it can dispense with.
This can be a significant advantage if you work in an environment that requires strict control over security. Adding a single button to a mass storage device can cause it to perform a pre-approved secure erasure function. If you are really worried, bury a small battery inside the drive. Once the erase function is started, nothing can stop it, not even power removal. But that’s another topic.

And the List Continues

I propose that each mass storage device have a hardware switch that can be used to enable and disable writing to selected files.

When writing is enabled, the entire device is writable. While writable, the OS can create and edit a list of protected files kept on the device. When the physical switch is set to the protect mode, the protected files cannot be modified. (Obviously, the list itself would be protected.) This concept would be implemented in the ROM code of the mass storage controller. It must be a hardware switch. If it is under software control, someone, somewhere, will write some software that can defeat it. If it is a hardware switch that I must reach over and flip, no amount of software will ever defeat it. (Obviously the controller software must be well implemented.)
When I am updating my OS or installing critical software, I simply flip the switch and go. Then I return it to the protect mode until the next update.
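
The rules above can be sketched in a few lines. This is a toy model under stated assumptions: the class and method names are hypothetical, and the switch is a plain attribute here, whereas on real hardware it would be a physical input that no command from the host could change.

```python
class ProtectedDrive:
    """Toy model of a drive controller with a hardware protect switch."""

    def __init__(self):
        # Stands in for the physical switch. In this model it is a plain
        # attribute; on a real drive only a person's hand could change it.
        self.switch_in_protect_mode = False
        self._files = {}
        self._protected = set()

    def set_protected(self, path, protected=True):
        """Add or remove a file from the protected list."""
        if self.switch_in_protect_mode:
            raise PermissionError("the protected list is locked")
        if protected:
            self._protected.add(path)
        else:
            self._protected.discard(path)

    def write(self, path, data):
        """Store data unless the file is protected and the switch is set."""
        if self.switch_in_protect_mode and path in self._protected:
            raise PermissionError(path + " is write protected")
        self._files[path] = data

    def read(self, path):
        """Reading is always allowed."""
        return self._files[path]


drive = ProtectedDrive()
drive.write("kernel.bin", b"v1")
drive.set_protected("kernel.bin")
drive.switch_in_protect_mode = True   # the operator flips the switch
drive.write("notes.txt", b"still writable")
try:
    drive.write("kernel.bin", b"tampered")
    blocked = False
except PermissionError:
    blocked = True
```

Unprotected files stay writable in protect mode, while both the protected files and the list that names them are frozen until the switch is flipped back.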

Implementation

When first writing this page, it seemed that the idea was certain to be rejected. Who (which OS producer) would take the first step, and which hard drive manufacturer would provide the first round of hardware?

As is often the case, it is much simpler than originally thought.

Begin with USB and the mass data transfer mode. The host requests a large amount of data and the device provides it in bursts of up to 480 megabits per second. If the host asks for the boot_record, then the USB device can provide it. The host doesn’t care where the boot_record is stored on the device, just that it gets it.

I have a book on USB design but it doesn’t really tell me about the low level nature of the mass storage transfers that already exist. For now, I think it safe to assume that when fetching data from a thumb drive, the host asks for the file by name rather than for the data found on platter X, using head Y and sector Z. This is the crux of my concept. Keep that data within the mass storage device and relieve the host of dealing with it. As I recall, some Operating Systems can boot from USB. We may be halfway there.

Regarding real hard drives, working with them may be a tiny bit more difficult, but well within the realm of the home computer enthusiast.

Many small computers can be purchased with 100 Mbit or gigabit Ethernet and the standard IDE (or other) drive interface. Load a standard Linux OS on one and attach some disk drives. Now write some code to accept commands from the local LAN to serve up the data on the drives. Write the code to recognize any type of path name and file name syntax, fetch the data from the drives, and send it back to the requestor in the same format. Single board computers with an IDE interface are available in the $100 to $200 range, probably less.
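
As a starting point, the server code might look something like the sketch below, which uses Python's standard socketserver module. The one-line "GET <path>" command, the "OK <length>" and "ERR" replies, and the FILES dictionary standing in for the attached drives are all assumptions I made up for illustration. A real server would read from the disks and would need authentication and error handling.

```python
import socket
import socketserver
import threading

# Hypothetical in-memory stand-in for the files on the attached drives.
FILES = {("user", "mydata", "some_file.txt"): b"hello from the drive\n"}


def split_path(path):
    """Accept either separator style and an optional drive prefix."""
    if len(path) >= 2 and path[1] == ":":
        path = path[2:]
    return tuple(p for p in path.replace("\\", "/").split("/") if p)


class FileRequestHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection: a single line of the form "GET <path>".
        line = self.rfile.readline().decode("utf-8").strip()
        command, _, path = line.partition(" ")
        data = FILES.get(split_path(path)) if command == "GET" else None
        if data is None:
            self.wfile.write(b"ERR\n")
        else:
            self.wfile.write(b"OK %d\n" % len(data) + data)


def fetch(address, path):
    """Client side: ask the server for a file by name and return its bytes."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(("GET " + path + "\n").encode("utf-8"))
        reply = conn.makefile("rb")
        status = reply.readline().split()
        if status[0] != b"OK":
            raise FileNotFoundError(path)
        return reply.read(int(status[1]))


# Demo on the loopback interface; port 0 lets the system pick a free port.
server = socketserver.TCPServer(("127.0.0.1", 0), FileRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
contents = fetch(server.server_address, r"C:\user\mydata\some_file.txt")
server.shutdown()
server.server_close()
```

The host names the file and receives the bytes. Which drive, partition, or sector they came from never crosses the wire.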

Now that you have an Ethernet based server you can install any type of RAID service and keep it completely transparent to the host or multiple hosts. It’s not trivial, but one could use USB, Firewire, or any other communications method to serve the hard drives.

Back to USB

There is an extension to USB 2.0 called OTG or On The Go. OTG was originally implemented to allow things like a cell phone to talk to your computer as a device, and at other times, to talk to your printer as a host. It can switch roles. Now it’s even easier to get your new disk farm server communicating via USB. Cameras, cell phones, and other devices can plug in and send their data directly to your disk server.

Side note: There are OTG disk drives available to purchase that may be able to do this for you right now. You could probably rig up a battery pack for one of these and have all the picture space you might need in the field.

Summary

The individual hardware components are available to relieve the OS of the trouble of managing its mass storage. It’s time to take the step and implement it.

If you like this idea, please link, write, email, or blog about it. Please send your comments to the various OS writers and mass storage makers. I would like to hear from you at email address webmail at bkelly dot ws

Thank you.
December 2007