Sunday, September 30, 2018

Practical thoughts on MS-DOS source code

I recently wrote that Microsoft released the source code to MS-DOS v1.25 and v2.0 on GitHub. This is a huge step for Microsoft, and I congratulate them for releasing these older versions of MS-DOS under the open source MIT license!

This source code release was significant because it resolved an issue from Microsoft's previous attempt to open the source code to older MS-DOS. In 2014, Microsoft released the source code to MS-DOS 1.1 and 2.0 via the Computer History Museum. Unfortunately, the license used in the Museum release was a "look but do not touch" license.

My understanding from lawyers who have explained it to me (I am not a lawyer) is that you can be "tainted" by knowledge of proprietary source code, under US law and under similar laws agreed to by partner countries. So anyone who read or studied the source code to MS-DOS 1.1 or 2.0 as it was previously released via the Computer History Museum license was not allowed to contribute to FreeDOS afterwards. We posted several notices to this effect on the FreeDOS website and elsewhere.

But this source code release of MS-DOS 1.25 and 2.0 uses the MIT License, which is not only a recognized open source software license but also compatible with the GNU GPL. This means the "taint" concern is effectively lifted.

While this is great, there's a practical side to the source code release. Note that these are very old versions of MS-DOS, and FreeDOS has already surpassed them in functionality and features. For example, MS-DOS 2.0 was the first version to support directories and redirection. But these versions of MS-DOS did not yet include more advanced features like networking, CD-ROM support, and '386 support such as EMM386.

It's great to see Microsoft open-source these old versions of MS-DOS, but what will be the practical impact on FreeDOS? I think Tom E. answered this well:
“Frankly, not so much. the relevant facts about MSDOS like internal structures, memory layout aso. have been re-engineered/disassembled, documented and commented by Andrew Schulman, Mike Podanowsky, and MANY others, and merged in an almost complete (and almost correct) documented DOS API by Ralph Brown. thanks to them, and there is close to nothing to be learned by studying old MSDOS sources.”

Eric A. adds a similar comment:
“Well, this is mostly interesting for historical research, MS DOS 1.25 had almost no features and 2.0 also is very far away from running most "normal" DOS software.”

So FreeDOS would not be able to reuse this code for any modern features anyway. But for basic behavior, such as odd edge cases or compatibility with specific applications, developers may be able to reference this code to improve FreeDOS.

Set your expectations appropriately. Thanks to Microsoft for releasing this source code under an open source software license (MIT) but don't expect this to have much impact on FreeDOS. We've already advanced well beyond MS-DOS 1.25 and 2.0.

Microsoft open-sources old versions of MS-DOS

Microsoft recently released the source code to MS-DOS v1.25 and v2.0 via a repo on GitHub. This is a huge step for Microsoft, and I congratulate them for releasing these older versions of MS-DOS under a recognized open source software license!

This source code release uses the MIT License (also called the Expat License). From Microsoft's LICENSE.md file on GitHub:
[MS-DOS 1.25 & 2.0 Source]
Copyright (c) Microsoft Corporation
All rights reserved.
MIT License
Permission is hereby granted, freeof charge, to any person obtaining a copy of this software and associateddocumentation files (the Software), to deal in the Software withoutrestriction, including without limitation the rights to use, copy, modify,merge, publish, distribute, sublicense, and/or sell copies of the Software, andto permit persons to whom the Software is furnished to do so, subject to thefollowing conditions:

The above copyright notice andthis permission notice shall be included in all copies or substantial portionsof the Software.

THE SOFTWARE IS PROVIDED *AS IS*,WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TOTHE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE ANDNONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLEFOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE ORTHE USE OR OTHER DEALINGS IN THE SOFTWARE.
(typos are from original; copied 9/30/2018)

This is the same as the MIT License recognized by the Open Source Initiative:
Copyright <YEAR> <COPYRIGHT HOLDER>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

And the same as the Expat License recognized by the Free Software Foundation:
Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

The Free Software Foundation (via GNU) says the Expat License (aka "MIT License") is compatible with the GNU GPL. Specifically, GNU describes the Expat License as:
"This is a lax, permissive non-copyleft free software license, compatible with the GNU GPL. It is sometimes ambiguously referred to as the MIT License."

Also according to GNU, when they say a license is compatible with the GNU GPL, "you can combine code released under the other license [MIT/Expat License] with code released under the GNU GPL in one larger program."

That means this source code release of MS-DOS 1.25 and 2.0 removes the concern of "taint" that we had with the previous MS-DOS source code, released via the Computer History Museum in March 2014. Longtime FreeDOS users may recall that Microsoft posted the source code to MS-DOS 1.1 and 2.0 under a "look but do not touch" license that limited what you could do with the source code. Under the Museum license, users were barred from re-using the source code in other projects, or using concepts from the source code in other projects:
You may use, copy, compile, and create Derivative Works of the software, and run the software and Derivative Works on simulators or hardware solely for non-commercial research, experimentation, and educational purposes. Examples of non-commercial uses are teaching, academic research, public demonstrations, and personal experimentation. “Derivative Works” means modifications to the software, in source code or object code form, made by you pursuant to this agreement.
  • You may copy and refer to any documentation provided as part of the software.
  • You may not distribute or publish the software or Derivative Works.
  • You may not use or test the software to provide a commercial service unless Microsoft permits you to do so under another agreement.
  • You may publish and present papers or articles on the results of your research, and while distribution of all or substantial portions of the software is not permitted, you may include in any such publication or presentation an excerpt of up to fifty (50) lines of code for illustration purposes.
(emphasis mine)

I am not a lawyer, but even I can see this license does not allow users to re-use the MS-DOS source code, especially in open source software projects like FreeDOS. We saw this as a potential risk to FreeDOS; developers who had viewed the MS-DOS source code might "taint" FreeDOS if they later contributed to FreeDOS. To avoid this taint risk, we posted several announcements on the FreeDOS email lists and on the FreeDOS website, including on our FreeDOS History page, to warn FreeDOS developers that they should not view the MS-DOS source code. Anyone who did view the MS-DOS source code could not contribute to FreeDOS:
"Please note: if you download and study the MS-DOS source code, you should not contribute code to FreeDOS afterwards. We want to avoid any suggestion that FreeDOS has been "tainted" by this proprietary code."

But Microsoft's adoption of the MIT License is a significant change. The new MIT License is compatible with the GNU GPL. Therefore, the risk of taint seems to be removed. Congratulations to Microsoft for releasing MS-DOS 1.25 and 2.0 under an open source license!

Saturday, September 8, 2018

Code review: Using catgets/kitten to support different languages

This Code Review article is a repeat from last year, about how to use the Cats and Kitten libraries to support different spoken languages in your programs.

When you write a new program, you probably don't think about spoken languages other than your own. I am a native English speaker, so when I write a new program, all of my error messages and other output are in English. And that works well for the many people who have English as their native language, or who know enough English as a second language to get by. But what about others who don't speak English, or who only know a little English? They can't understand what my programs are saying.

The standard Unix solution is a set of C library functions built around language "catalogs." A catalog is just a file that contains all the error messages and other printed text from a program. In the Unix method, you have a different catalog for every language: English, German, Italian, Spanish, French, and so on.

The FreeDOS Cats library was a stripped-down implementation of the Unix library, using a very simple method. Every time you want to print some text in the user's preferred language, you first look up the message string from the catalog using the catgets() function—so named because it will get a string from a message catalog.

In Unix, you use catgets() this way:

  string = catgets(cat, set, num, "Hello world");

This fetches message string number num from message set set, from language catalog cat. The organization of messages into sets allows developers to group status messages into one set (say, set 1), error messages into another set (such as set 7), and so on.

Before calling catgets(), you need to open the appropriate language catalog with a previous call to catopen(). Typically, you have one catalog per language, so you have a different language file for English, another for Spanish, etc. Before your program exits, you close any message catalogs with calls to catclose().

If the string doesn't exist in the message catalog, catgets() returns a default string that you passed to it; in this case, the default string was "Hello world."
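
As a minimal sketch of that whole sequence on a POSIX system, the helper below opens a catalog, looks up one message, and closes the catalog again. Only catopen(), catgets(), and catclose() from <nl_types.h> are the real interface; the name fetch_message and the copy-into-a-buffer approach are illustrative assumptions. The message is copied out because the pointer returned by catgets() may not remain valid after catclose().

```c
#include <assert.h>
#include <nl_types.h>
#include <string.h>

/* Hypothetical helper (not part of any standard): copy message `num` of
   set `set` from the named catalog into buf, or copy `fallback` if the
   catalog or the message cannot be found. Returns buf. */
const char *fetch_message(const char *catalog, int set, int num,
                          const char *fallback, char *buf, size_t buflen)
{
    nl_catd cat;
    const char *msg = fallback;
    size_t n;

    assert(catalog != NULL && fallback != NULL && buf != NULL && buflen > 0);

    /* catopen() returns (nl_catd) -1 when the catalog cannot be opened */
    cat = catopen(catalog, NL_CAT_LOCALE);
    if (cat != (nl_catd) -1)
        msg = catgets(cat, set, num, fallback);

    /* copy (and truncate if needed) before the catalog is closed */
    n = strlen(msg);
    if (n >= buflen)
        n = buflen - 1;
    memcpy(buf, msg, n);
    buf[n] = '\0';

    if (cat != (nl_catd) -1)
        catclose(cat);
    return buf;
}
```

Calling fetch_message("test", 1, 1, "Hello world", buf, sizeof buf) when no "test" catalog is installed leaves the default string "Hello world" in buf.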

I implemented a simplified version of these functions in a FreeDOS Internationalization library called Cats. To save on memory, Cats supported only one open catalog at a time. If you tried to open a second message catalog, the call to catopen() would return an error (-1).

Message catalogs were very simple under Cats. They were plain text files, and Cats loaded the entire message catalog into memory at run-time. In this way, you didn't need to recompile the program just to support other languages; you just added another message catalog file for the new language. An English message catalog for a simple program might look like this:

  1.1:Hello world
  7.4:Failure writing to drive A:

The same message catalog in Spanish might look like this:

  1.1:Hola mundo
  7.4:Fallo al escribir en la unidad A:

For example, the string "Failure writing to drive A:" is message number 4 in set 7.
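
To make the file format concrete, here is a minimal sketch of splitting one catalog line into its set number, message number, and text. This is not the actual Cats parsing code; parse_catalog_line is a hypothetical helper, and it assumes every message line has exactly the set.num:text form shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse a line such as "7.4:Failure writing to drive A:"
   into set, num, and text. Returns 1 on success, 0 if the line is malformed. */
int parse_catalog_line(const char *line, int *set, int *num,
                       char *text, size_t textlen)
{
    int consumed = 0;

    assert(line != NULL && set != NULL && num != NULL);
    assert(text != NULL && textlen > 0);

    /* %n records how many characters were read through the first ':';
       it stays 0 if the colon is missing */
    if (sscanf(line, "%d.%d:%n", set, num, &consumed) != 2 || consumed == 0)
        return 0;

    /* everything after the first colon is the message, colons included */
    snprintf(text, textlen, "%s", line + consumed);
    text[strcspn(text, "\r\n")] = '\0';   /* drop the line ending, if any */
    return 1;
}
```

Because only the first colon is treated as a separator, a message that itself ends in a colon, like the drive A: example, comes through intact.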

The Cats library was a simple way for developers to add support for different languages to programs written in C. And because Cats implemented a Unix standard, it made porting Unix tools to FreeDOS much easier. Once you added the calls to catgets(), all you needed to support other languages was a message catalog that someone translated to a different language. And I kept the Cats message catalogs very simple; they were plain text files.

Cats was a neat innovation, but loading the messages into memory was cumbersome because it used streams. Other FreeDOS developers improved on Cats to optimize the loading of catalogs, reduce memory footprint, and add other enhancements. The new library was noticeably smaller, so we renamed it Kitten.

Because of the optimizations, Kitten used a slightly different API. Since Cats only supported one message catalog at a time anyway, Kitten removed the cat catalog identifier. Once you open a message catalog with kittenopen(), all calls to kittengets() assume that message catalog. You only need to make a single call to kittenclose() before you end the program.

Using Kitten made it much easier to support different spoken languages in FreeDOS programs. Here's a trivial example to put it all together:

  /* test.c */

  #include <stdio.h>
  #include <stdlib.h>
  #include "kitten.h"

  int
  main(void)
  {
    char *s;
 
    kittenopen("test");
 
    s = kittengets(7, 4, "Failure writing to drive A:");
    puts(s);
 
    kittenclose();
    exit(0);
  }

This loads a message catalog "test" into memory, then retrieves message 4 from set 7 into a string pointer s. If message 4 in set 7 isn't found, kittengets() returns the default string "Failure writing to drive A:" into s. The program prints the message to the user, then closes the message catalog before exiting.
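
The lookup-with-default behavior can be sketched without the library at all. The code below is not Kitten's implementation; demo_catalog and demo_gets are hypothetical names, and the catalog is a hard-coded table rather than a file loaded from disk. It only illustrates the idea of a single implicit catalog searched by set and message number.

```c
#include <assert.h>
#include <stddef.h>

/* One entry of a hypothetical in-memory catalog */
struct demo_msg {
    int set;
    int num;
    const char *text;
};

/* Stand-in for a loaded Spanish catalog (strings from the example above) */
static const struct demo_msg demo_catalog[] = {
    { 1, 1, "Hola mundo" },
    { 7, 4, "Fallo al escribir en la unidad A:" },
};

/* Return the catalog text for (set, num), or the caller's default string
   when the catalog has no such message. */
const char *demo_gets(int set, int num, const char *fallback)
{
    size_t i;

    assert(fallback != NULL);
    for (i = 0; i < sizeof demo_catalog / sizeof demo_catalog[0]; i++) {
        if (demo_catalog[i].set == set && demo_catalog[i].num == num)
            return demo_catalog[i].text;
    }
    return fallback;
}
```

With this table, demo_gets(7, 4, "Failure writing to drive A:") returns the Spanish text, while a request for a message that is not in the table returns the English default unchanged.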

Typically, you name the message catalog after the program's name. So the message catalog for the FreeDOS CHOICE program is "choice". Kitten searches for the language file in a few locations on disk, and always appends the value of the %LANG% environment variable, which is typically set to the two-letter language abbreviation: "en" for English or "es" for Spanish. The DOS filename for the English version of the "choice" language catalog is CHOICE.EN, and the Spanish language version is CHOICE.ES.
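
The naming rule can be sketched as a small function. This is an illustration under stated assumptions, not Kitten's actual search code: catalog_filename is a hypothetical helper, it only builds the file name and does not model the directories Kitten searches, and it assumes the language value is a short abbreviation such as "en". A real program would pass in the result of getenv("LANG").

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a DOS-style catalog name such as "CHOICE.EN"
   from a program name and a language abbreviation. Returns 1 on success,
   0 if no language is set or the result would not fit an 8.3 file name. */
int catalog_filename(const char *prog, const char *lang,
                     char *out, size_t outlen)
{
    size_t i;
    int n;

    assert(prog != NULL && out != NULL && outlen > 0);

    if (lang == NULL || lang[0] == '\0')
        return 0;               /* no language set: caller keeps defaults */
    if (strlen(prog) > 8 || strlen(lang) > 3)
        return 0;               /* DOS names are 8 characters plus 3 */

    n = snprintf(out, outlen, "%s.%s", prog, lang);
    if (n < 0 || (size_t) n >= outlen)
        return 0;               /* output buffer too small */

    for (i = 0; out[i] != '\0'; i++)
        out[i] = (char) toupper((unsigned char) out[i]);
    return 1;
}
```

For example, catalog_filename("choice", "es", name, sizeof name) fills name with "CHOICE.ES".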

A limitation of Cats and Kitten is that they can only support the single-byte character sets supported by DOS. So Cats and Kitten cannot support Chinese or Japanese, but should do fine with most other languages.

You can find Cats and Kitten at the FreeDOS files archive on ibiblio, under devel/libs/cats/

We made three revisions to Kitten, so the latest version is Kitten revision C, which you can download directly as kitten-c.zip.

FreeDOS contributor Mateusz "Fox" Viste wrote a similar implementation for Pascal programs, called Cubs. You can also find it on ibiblio, under devel/libs/cats/cubs/

FreeDOS developer Eric Auer created a command-line version of Kitten, named Localize, so you can provide internationalization support for DOS Batch (BAT) files. Find it on ibiblio, under devel/libs/cats/localize/