We ❤️ Open Source
A community education resource
How I built a Markdown-to-HTML tool on a 5MB FreeDOS system
Here’s what bootstrapping looks like on a floppy-sized operating system.

There’s an old saying in some circles that “if I could have only one program, it would be a compiler and I’ll build the rest myself.” And I agree with that! With a good compiler, you can do a lot, including creating your own tools. That’s how I got “into” programming, long ago: With a bit of programming knowledge, I learned I could create programs that fit my specific use case and did things how I needed to do them.
I wanted to demonstrate how small a system I can start with and still build useful tools. For this experiment, I started with FreeDOS, using just a simple editor and a C compiler. From there, the challenge became: what is the first tool I need to create so I can do useful work? I do most of my writing in Markdown, so I set a personal goal to create a minimal markup system that converts Markdown-like syntax into HTML.
A minimal test system
I needed a minimal environment to get started. I don’t need a lot of space for this experiment; a few megabytes will be enough. To create a 5 MB virtual disk for my experiment, I ran this QEMU command:
$ qemu-img create -f qcow2 tiny.qcow2 5M
After that, I needed to install a minimal version of FreeDOS. I could have booted the FreeDOS 1.4 LiveCD and used the “live” environment to partition the virtual disk, format it with a filesystem, and install a C compiler. But FreeDOS 1.4 includes a lot of tools, too many to fit on just the LiveCD. The development tools (including the C compiler) are available only from the BonusCD.
Instead, I took a shortcut: I booted from an existing FreeDOS system and used the command line to set up the new tiny virtual disk. I used QEMU to boot a minimal virtual machine, with my existing FreeDOS disk as the first drive (hda) and my new tiny disk as the second drive (hdb):
$ qemu-system-i386 -hda freedos.qcow2 -hdb tiny.qcow2
After the virtual machine booted to FreeDOS, I used the command line to partition the tiny virtual disk. Normally, I might use the fdisk program in “interactive” mode, using the menus to select the second disk and create the partition. But all I needed was to define a single partition that used the full space on the tiny disk, and to mark it as “Active” so it would be bootable. The fdisk program supports an /auto command line option to do all of that for you:
C:\>fdisk 2 /auto
Like any DOS, FreeDOS reads the disk partition information exactly once, at boot time. You need to reboot before FreeDOS can “see” the new partition.
After rebooting, I used the format command to create a DOS filesystem on the tiny disk. The /v option defines a “Volume label” for the new disk, and the /s option transfers the “System” files so the new disk will boot. While fdisk uses numbers to identify each disk, DOS uses letters: A and B are always reserved for the floppy drives, even if they are not present. Hard disk partitions are C and D and so on. To format the new partition on the second drive, specify it as the D drive:
C:\>format D: /v:TINY /s
After creating the filesystem, format will transfer a copy of the FreeDOS kernel and command interpreter to the tiny drive. This is minimally what you need to boot a DOS system:
C:\>dir D:
Volume in drive D is TINY
Volume Serial Number is 1A34-1801
Directory of D:\
KERNEL SYS 46,256 04-02-25 2:22p
COMMAND COM 87,772 02-21-25 9:06a
2 file(s) 134,028 bytes
0 dir(s) 4,968,448 bytes free
To do my experiments, I’ll need to have an editor. Edlin is a very simple line editor, similar to ed on Linux and Unix, although different in the details. I copied Edlin to the new drive like this:
C:\>copy C:\freedos\bin\EDLIN.EXE D:
Finally, I copied a C compiler to the tiny disk. FreeDOS includes several C compilers to choose from, such as the Open Watcom C compiler, which is quite nice. We also provide an IA-16 version of the GNU C Compiler, which provides the same C compiler experience as Linux systems. But both of these C compilers are quite large, too big for the tiny 5 MB hard disk in this experiment. Instead, I used a copy of BCC, or “Bruce’s C Compiler.” This is a minimal C compiler that supports most ANSI C features. And at less than 1 MB in size, it’s perfect for this tiny disk. To copy everything from BCC to the new disk, use the xcopy command:
C:\>xcopy /s C:\devel\bcc\ D:\bcc\
Booting the tiny system
With the new disk set up with a minimal DOS environment, I was ready to get started! But the new system doesn’t have any DOS tools, other than the editor and C compiler. It also doesn’t have any drivers to support memory beyond the first 640 kilobytes; DOS needs a driver to support extended or expanded memory (both give programs access to additional memory, but through different memory management models). Since I can’t address more than 640 KB with just the tools I’ve installed on the tiny system, I don’t need to define the virtual machine with much memory. I also don’t need to include other machine options for a more modern system; using a very old “ISA” PC definition will be enough. The smallest memory size I can define for a QEMU virtual machine is 1 MB, so I booted the new system with this QEMU command:
$ qemu-system-i386 -m 1 -machine isapc -hda tiny.qcow2
When any DOS system boots up, the kernel first looks for a file called CONFIG.SYS that lists certain parameters that DOS can use to define the system. Without it, DOS will assume certain defaults, such as where to find the command shell (usually COMMAND.COM). The command shell reads a “batch” file called AUTOEXEC.BAT to set up the environment. If this file doesn’t exist, the shell will prompt the user for the date and time.

To keep this from happening every time I boot the new system, I used the Edlin editor to write a new AUTOEXEC.BAT file. To get started, I just needed to set the path where the shell looks for programs: the Edlin editor is in the root directory (\) and the C compiler is at \BCC\BIN.
The Edlin editor is a line editor, similar to ed on Linux. Edlin supports similar commands as ed, although they are slightly different. If you use the ? command in Edlin, you will see these supported commands:
#                 edit a single line       [#],[#],#m        move
a                 append                   [#][,#]p          page
[#],[#],#,[#]c    copy                     q                 quit
[#][,#]d          delete                   [#][,#][?]r$,$    replace
e<>               end (write & quit)       [#][,#][?]s$      search
[#]i              insert                   [#]t<>            transfer
[#][,#]l          list                     [#]w<>            write
Using a will always append at the end of a file. When you are adding new lines, Edlin changes the prompt from * to : to indicate you should enter text into the file. Use . on a line by itself to tell Edlin that you are done entering new text. The w command writes the file back to the disk, and q exits the editor. Edlin also supports “backslash” sequences, such as \t to enter a tab character; to type a literal “backslash” character, use \\ instead.
C:\>edlin autoexec.bat
autoexec.bat: New file.
*a
: PATH \\;\\bcc\\bin
: .
*w
autoexec.bat: 1 line written
*q
Really quit (Y/N)? y
With this AUTOEXEC.BAT file, FreeDOS will always set up the search path for programs, and it won’t prompt me for the date and time.
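The doubled backslashes in that session are easy to misread, so here is a minimal C sketch of the decoding step. It is not Edlin's actual code: the function name decode_escapes is hypothetical, and it handles only the two sequences mentioned above (\\ and \t), copying everything else unchanged.

```c
/* Decode Edlin-style backslash sequences from `in` into `out`.
   Hypothetical helper: only \\ and \t are handled here; any other
   character (including a lone backslash) is copied unchanged.
   `out` must have room for at least strlen(in) + 1 bytes. */
void decode_escapes(const char *in, char *out)
{
  while (*in) {
    if (in[0] == '\\' && in[1] == '\\') { *out++ = '\\'; in += 2; }
    else if (in[0] == '\\' && in[1] == 't') { *out++ = '\t'; in += 2; }
    else { *out++ = *in++; }
  }
  *out = '\0';
}
```

Passing the text typed at the Edlin prompt, PATH \\;\\bcc\\bin, yields PATH \;\bcc\bin, which is the line that ends up in AUTOEXEC.BAT.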
Setting up the environment
The FreeDOS command shell (called “FreeCOM”) includes a number of built-in commands. These “internal” commands already provide the core features that you need to be productive in a minimal environment. Type the ? command at the prompt to see the internal commands you can use:
C:\>?
Internal commands available:
ALIAS BEEP BREAK CALL CD CHDIR CDD CHCP
CLS COPY CTTY DATE DEL DIR DIRS DOSKEY
ECHO ERASE EXIT FOR GOTO HISTORY IF LFNFOR
LH LOADHIGH LOADFIX MEMORY MD MKDIR PATH PAUSE
PROMPT PUSHD POPD RD REM REN RENAME RMDIR
SET SHIFT TIME TITLE TRUENAME TYPE VER VERIFY
VOL ? WHICH
Features available:
[aliases] [enhanced input] [history] [filename completion] [last dir] [long file
names] [XMS swap] [installable commands] [DOS NLS] [directory stack (PUSHD)]
For example, the alias command defines an alias from one command to another. If you don’t like to type DIR every time to see the list of files, and instead prefer the Linux ls command, you can define an alias to DIR that uses all lowercase (/l) and a “bare” format (/b) that doesn’t display the disk volume label or free space, with files displayed in a wide format (/w) and sorted by file name and extension (/o:ne):
C:\>alias ls=DIR /l /b /w /o:ne
C:\>ls
autoexec.bat [bcc] command.com edlin.exe kernel.sys
The odd spacing is because DOS file names can only be 8 characters long, plus 3 characters for the file extension. With this assumption, the wide listing assumes each “column” of output can only be 12 characters long, plus a few spaces between each to make it readable. That means a wide listing will show only five entries on each line.
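The column arithmetic can be checked with a few lines of C. This is only a sketch of the reasoning: the function name and the gap value are assumptions for illustration, not taken from FreeCOM's source.

```c
/* Number of names that fit on one row of a wide listing.
   name_width: widest possible name (8 + '.' + 3 = 12 for DOS 8.3 names)
   gap: spaces between columns (an assumed value, not from FreeCOM) */
int entries_per_row(int screen_width, int name_width, int gap)
{
  int column = name_width + gap;

  if (column <= 0) { return 0; }
  return screen_width / column;
}
```

On an 80-column screen with 12-character names, a gap of either 3 or 4 spaces gives five entries per row, which agrees with the listing above.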
The internal commands also include cls to clear the screen, copy to copy files, erase or del to delete files, type to display the contents of a file (like cat on Linux), rename or ren to rename files, mkdir or md to create directories, rmdir or rd to remove directories, cd or chdir to change into a new directory, pause to wait for the user to press a key (useful in batch files), and other commands.
To set up the tiny environment to look a bit more like Linux, add a few aliases to the AUTOEXEC.BAT file. These lines add aliases for several common Linux commands:
C:\>edlin autoexec.bat
autoexec.bat: 1 line read
*a
: alias ls=DIR /l /b /w /o:ne
: alias rm=DEL
: alias cat=TYPE
: alias cp=COPY
: .
*w
autoexec.bat: 5 lines written
*q
Really quit (Y/N)? y
You don’t have to reboot for these changes to take effect; just run the AUTOEXEC.BAT file and it will define everything as though you had rebooted the system. You will also see each line from AUTOEXEC.BAT printed on the command line as the shell “runs” each line in the file:
C:\>autoexec
C:\>PATH \;\bcc\bin
C:\>alias ls=DIR /l /b /w /o:ne
C:\>alias rm=DEL
C:\>alias cat=TYPE
C:\>alias cp=COPY
Writing my first program
With this environment, I can start writing programs. I’ll do all of my work in a new directory called src so I can keep my files together. For that, I’ll need to make a new directory with the mkdir command:
C:\>mkdir src
One useful program that isn’t provided as an internal command is a file viewer, so let’s write that as my first program. The simple way to write a file viewer is to read one character at a time from the input, and print each one to the output. As the program prints each character, it should track how many characters it has printed on a line; after printing 80 characters, the screen will effectively “roll” to the next line, so that should count as printing a new line to the screen. The DOS screen is 80 columns wide and 25 lines long, so the program should prompt the user after printing 24 lines.
Let’s start by writing a program that just reads from the input and prints to the output:
C:\SRC>edlin more.c
more.c: New file.
*a
: #include <stdio.h>
:
: int main()
: {
: int c;
:
: while ((c=getchar()) != EOF) {
: putchar(c);
: }
:
: return 0;
: }
: .
*w
more.c: 12 lines written
*q
Really quit (Y/N)? y
Compile this program using the BCC compiler. BCC is a very early C compiler that actually converts C programs to assembly, then uses an assembler to generate “COM” programs for DOS. The compiler assumes the original “K&R” C program syntax by default, but you can use ANSI C prototypes using the -ansi command line option; this program doesn’t use prototypes, but let’s use that option anyway so the compiler will be more familiar to use. Add the -o option to save the compiled program to a specific output file name:
C:\SRC>bcc -ansi -o more.com more.c
This program only reads from the input and prints to the output, without pausing, which is basically no different from a very simple cat command:
C:\SRC>more < more.c
#include <stdio.h>
int main()
{
int c;
while ((c=getchar()) != EOF) {
putchar(c);
}
return 0;
}
Let’s add some extra code to track the number of lines printed, and the number of characters on each line. In Edlin, list the contents of the file with the l command, then use the i command to insert new text at a specific line. Avoid using the a command; this always appends new lines at the end of the file.
C:\SRC>edlin more.c
more.c: 12 lines read
*l
1:*#include <stdio.h>
2:
3: int main()
4: {
5: int c;
6:
7: while ((c=getchar()) != EOF) {
8: putchar(c);
9: }
10:
11: return 0;
12: }
*6i
: int line=1,col=1;
: .
*l
1: #include <stdio.h>
2:
3: int main()
4: {
5: int c;
6:* int line=1,col=1;
7:
8: while ((c=getchar()) != EOF) {
9: putchar(c);
10: }
11:
12: return 0;
13: }
*10i
: if (c=='\\n') { line++; col=1; }
: else if (++col>80) { line++; col=1; }
:
: if (line==25) {
: puts("MORE");
: }
: .
*l
5: int c;
6: int line=1,col=1;
7:
8: while ((c=getchar()) != EOF) {
9: putchar(c);
10: if (c=='\n') { line++; col=1; }
11: else if (++col>80) { line++; col=1; }
12:
13: if (line==25) {
14: puts("MORE");
15: }
16:* }
17:
18: return 0;
19: }
*w
more.c: 19 lines written
*q
Really quit (Y/N)? y
I’ve tried to save a bit of space on lines 10 and 11 by writing out the if and else if statements as single lines. This makes my file more manageable when editing using a line editor like Edlin.
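The line-tracking rule is easier to test when it is pulled out into a function that works on a string instead of the screen. The sketch below is a hypothetical helper (rows_completed is not part of more.c) that applies the same two rules: a newline ends a row, and so does printing a character in the last column.

```c
/* Count how many screen rows have been completed after printing `text`
   on a display that is `width` columns wide. Hypothetical helper that
   mirrors the viewer's counters: `col` is the next column to print in,
   and `line` is the row currently being filled. */
int rows_completed(const char *text, int width)
{
  int line = 1, col = 1;

  while (*text) {
    if (*text == '\n') { line++; col = 1; }
    else if (++col > width) { line++; col = 1; }
    text++;
  }

  return line - 1;
}
```

A text of 24 short lines gives 24, which is the point where the viewer's line counter reaches 25 and the prompt appears.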
The updated program will display a MORE prompt after 24 lines, but it won’t actually wait for the user to press a key. To add that feature, we can use the getch function from the conio set of functions; this requires including the conio.h header file. BCC’s implementation of getch uses the INT 16h BIOS interrupt to get a keystroke from the keyboard. This returns the BIOS scan code in AH (the upper byte) and the ASCII character in AL (the lower byte). DOS is a 16-bit operating system, so the int return value for getch is 16 bits, or two bytes. We can verify the values by writing a quick program to read a single keystroke and print the values:
C:\SRC>edlin key.c
key.c: New file.
*i
: #include <stdio.h>
: #include <conio.h>
: int main()
: {
: int k,ah,al;
: k = getch();
: al = k & 255; /* lower byte */
: ah = (k>>8) & 255; /* masking with 255 shouldn't be needed,
: but I like to play it safe anyway */
: printf("AH=%d, AL=%d\\n", ah, al);
: return 0;
: }
: .
*w
key.c: 12 lines written
*q
Really quit (Y/N)? y
Compile this program, run it, and press Esc on the keyboard to see its value. The Escape key is always ASCII value 27.
C:\SRC>bcc -ansi -o key.com key.c
C:\SRC>key
AH=1, AL=27
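The same split can be written as two small helper functions. This is a sketch with hypothetical names (key_scan and key_ascii); it assumes the keystroke value has the layout described above, with the scan code in the upper byte and the ASCII character in the lower byte.

```c
/* Split a 16-bit keystroke value into its two bytes.
   Hypothetical helpers; the names are not part of conio. */
int key_scan(int k)  { return (k >> 8) & 255; }  /* AH: BIOS scan code */
int key_ascii(int k) { return k & 255; }         /* AL: ASCII character */
```

For the Esc key, the two bytes 1 and 27 combine to 1 × 256 + 27 = 283 (0x011B), so key_scan(0x011B) is 1 and key_ascii(0x011B) is 27, matching the output of the key program.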
We can use this to make a final modification to the more program to read a keystroke from the keyboard, and abort the program if the user presses the Esc key:
C:\SRC>edlin more.c
more.c: 19 lines read
*l
1:*#include <stdio.h>
2:
3: int main()
4: {
5: int c;
6: int line=1,col=1;
7:
8: while ((c=getchar()) != EOF) {
9: putchar(c);
10: if (c=='\n') { line++; col=1; }
11: else if (++col>80) { line++; col=1; }
12:
13: if (line==25) {
14: puts("MORE");
15: }
16: }
17:
18: return 0;
19: }
*2i
: #include <conio.h>
: .
*10,$l
10: putchar(c);
11: if (c=='\n') { line++; col=1; }
12: else if (++col>80) { line++; col=1; }
13:
14: if (line==25) {
15: puts("MORE");
16: }
17: }
18:
19: return 0;
20: }
*15
15:* puts("MORE");
15: fputs("--MORE--", stdout);
*16i
: fflush(stdout);
: if ((getch() & 255) == 27) { return 0; }
: .
*16,$l
16: fflush(stdout);
17: if ((getch() & 255) == 27) { return 0; }
18:* }
19: }
20:
21: return 0;
22: }
*18i
: putchar('\\n');
: line = 1;
: .
*l
9: while ((c=getchar()) != EOF) {
10: putchar(c);
11: if (c=='\n') { line++; col=1; }
12: else if (++col>80) { line++; col=1; }
13:
14: if (line==25) {
15: fputs("--MORE--", stdout);
16: fflush(stdout);
17: if ((getch() & 255) == 27) { return 0; }
18: putchar('\n');
19: line = 1;
20:* }
21: }
22:
23: return 0;
24: }
*a
: /* done */
: .
*w
more.c: 25 lines written
*q
Really quit (Y/N)? y
I added the /* done */ comment at the end to make my source file exactly 25 lines. That way, I can test the more program by viewing its source. It should display the first 24 lines of the program, wait for a keystroke, then display the comment on the last line.
C:\SRC>bcc -ansi -o more.com more.c
C:\SRC>more < more.c
#include <stdio.h>
#include <conio.h>
int main()
{
int c;
int line=1,col=1;
while ((c=getchar()) != EOF) {
putchar(c);
if (c=='\n') { line++; col=1; }
else if (++col>80) { line++; col=1; }
if (line==25) {
fputs("--MORE--", stdout);
fflush(stdout);
if ((getch() & 255) == 27) { return 0; }
putchar('\n');
line = 1;
}
}
return 0;
}
--MORE--
/* done */
Getting to work
Open source is all about creating your own tools to do useful work. If a tool doesn’t exist to do the job you need to do, or if you need a slightly different tool to do a new job, you can make your own tools that work the way you need them to.
And that’s how I approached using this minimal DOS environment. The challenge I set myself was to work out what tools I needed to build to do real work. When I write, I like to start the first draft in Markdown, so I needed to write a simple version of Markdown, or at least a minimal markup system that was very Markdown-like.
A system like Markdown is nontrivial, so it requires some planning before writing any code. I decided to tackle this project by only implementing headings and paragraphs, plus bold and italic text. This keeps the program very simple and easy to write when using a line editor like Edlin.
At a high level, this “Markdown-like” processor reads one line at a time and transforms it according to a set of rules. Reading one line at a time makes this easier, at the expense of requiring a bit more memory and limiting some other features that I might add, but it was enough to reach my goal of writing a tool to do real work.
After reading a line, the program would evaluate it to determine what kind of line it is. A line that starts with one or more # is a heading, from # for Heading 1 (<h1>) to ###### for Heading 6 (<h6>). Otherwise, any line that contains text is a paragraph. The function eval_line scans a line to determine whether or not it is empty; non-empty lines are either headings or paragraphs:
typedef enum { NONE, H1, H2, H3, H4, H5, H6, PARA } tag_t;

tag_t eval_line()
{
  /* determine what kind of text this is */
  if (isempty_line()) { return NONE; }
  if (Line[0] == '#') {
    if (strncmp(Line, "# ", 2) == 0) { return H1; }
    if (strncmp(Line, "## ", 3) == 0) { return H2; }
    if (strncmp(Line, "### ", 4) == 0) { return H3; }
    if (strncmp(Line, "#### ", 5) == 0) { return H4; }
    if (strncmp(Line, "##### ", 6) == 0) { return H5; }
    if (strncmp(Line, "###### ", 7) == 0) { return H6; }
  }
  return PARA; /* default */
}
I don’t usually write if statements like this, but when using a line editor like Edlin on an 80-column 25-line display, it helps to put single-line if-then statements on one line, so the program doesn’t require too many lines. That’s why I wrote instructions like:
if (strncmp(Line, "# ", 2) == 0) { return H1; }
…instead of using three lines to write the if-then as a block:
if (strncmp(Line, "# ", 2) == 0) {
  return H1;
}
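An alternative way to express the same rule is to count the leading # characters in a loop. The following is a sketch, not the version used in the program: heading_level is a hypothetical function that takes the line as a parameter instead of reading the global Line, and returns 0 for anything that is not a heading.

```c
/* Return 1..6 for a heading line, or 0 if the line is not a heading.
   A heading is one to six '#' characters followed by a space.
   Hypothetical alternative to the chain of strncmp tests. */
int heading_level(const char *line)
{
  int n = 0;

  while (line[n] == '#') { n++; }
  if (n >= 1 && n <= 6 && line[n] == ' ') { return n; }
  return 0;
}
```

The text of the heading then starts at offset level + 1, which is the same value text_offset returns in the full listing.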
Blank lines indicate a break between paragraphs. Empty lines are detected using a new function called isempty_line:
int isempty_line()
{
  char *s;
  s = Line;
  while (s[0]) {
    if (!isspace(s[0])) { return 0; /* not a space */ }
    s++;
  }
  return 1; /* all spaces */
}
When printing the output, the program also tracks bold and italic text as inline formatting. The print_line function uses * to start and stop bold formatting, and _ to begin and end italic formatting. This is not actually how Markdown works (it’s closer to how AsciiDoc formats text) but it was the easiest method to implement.
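The toggle logic is small enough to try on a single string. The sketch below is a hypothetical, string-based variant (format_inline is not part of mkdn.c) that writes into a caller-supplied buffer; unlike the program's print_line, it does not carry the bold and italic state from one line to the next.

```c
#include <string.h>

/* Copy `in` to `out`, replacing each '*' with <b> or </b> and each '_'
   with <i> or </i>, alternating between the opening and closing tag.
   Returns 0 on success, or -1 if `out` (of `outsize` bytes) is too small.
   Hypothetical helper for illustration only. */
int format_inline(const char *in, char *out, size_t outsize)
{
  int bold = 0, ital = 0;
  size_t used = 0, len;
  const char *piece;
  char one[2];

  if (outsize == 0) { return -1; }
  one[1] = '\0';

  for (; *in; in++) {
    if (*in == '*') { piece = bold ? "</b>" : "<b>"; bold = !bold; }
    else if (*in == '_') { piece = ital ? "</i>" : "<i>"; ital = !ital; }
    else { one[0] = *in; piece = one; }

    len = strlen(piece);
    if (used + len + 1 > outsize) { return -1; } /* keep room for '\0' */
    memcpy(out + used, piece, len);
    used += len;
  }

  out[used] = '\0';
  return 0;
}
```

Because every * and _ is treated as a toggle, a literal asterisk or underscore in the text will also start or stop formatting, which is the same trade-off the program makes.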
Writing this program using Edlin required careful coding and debugging, but I wrote the entire Markdown-like processor in about 170 lines:
/* simplified version of markdown */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

char *Line;
#define LINESIZE 100

typedef enum { NONE, H1, H2, H3, H4, H5, H6, PARA } tag_t;
typedef enum { false, true } BOOL;
struct { BOOL bold; BOOL ital; } format;

void start_block(tag_t tag, FILE *out)
{
  switch (tag) {
  case PARA: fputs("<p>", out); break;
  case H1: fputs("<h1>", out); break;
  case H2: fputs("<h2>", out); break;
  case H3: fputs("<h3>", out); break;
  case H4: fputs("<h4>", out); break;
  case H5: fputs("<h5>", out); break;
  case H6: fputs("<h6>", out); break;
  }
}

void close_block(tag_t tag, FILE *out)
{
  switch (tag) {
  case PARA: fputs("</p>", out); break;
  case H1: fputs("</h1>", out); break;
  case H2: fputs("</h2>", out); break;
  case H3: fputs("</h3>", out); break;
  case H4: fputs("</h4>", out); break;
  case H5: fputs("</h5>", out); break;
  case H6: fputs("</h6>", out); break;
  }
}

int isempty_line()
{
  char *s;
  s = Line;
  while (s[0]) {
    if (!isspace(s[0])) { return 0; /* not a space */ }
    s++;
  }
  return 1; /* all spaces */
}

tag_t eval_line()
{
  /* determine what kind of text this is */
  if (isempty_line()) { return NONE; }
  if (Line[0] == '#') {
    if (strncmp(Line, "# ", 2) == 0) { return H1; }
    if (strncmp(Line, "## ", 3) == 0) { return H2; }
    if (strncmp(Line, "### ", 4) == 0) { return H3; }
    if (strncmp(Line, "#### ", 5) == 0) { return H4; }
    if (strncmp(Line, "##### ", 6) == 0) { return H5; }
    if (strncmp(Line, "###### ", 7) == 0) { return H6; }
  }
  return PARA; /* default */
}

short text_offset(tag_t tag)
{
  switch (tag) {
  case H1: return 2;
  case H2: return 3;
  case H3: return 4;
  case H4: return 5;
  case H5: return 6;
  case H6: return 7;
  }
  return 0;
}

void print_line(short offset, FILE *out)
{
  char *s;
  s = Line;
  s += offset;
  /* print <b> and <i> */
  while (s[0]) {
    switch (s[0]) {
    case '*':
      if (format.bold == false) {
        fputs("<b>", out);
        format.bold = true;
      }
      else {
        fputs("</b>", out);
        format.bold = false;
      }
      break;
    case '_':
      if (format.ital == false) {
        fputs("<i>", out);
        format.ital = true;
      }
      else {
        fputs("</i>", out);
        format.ital = false;
      }
      break;
    default:
      fputc(s[0], out);
    }
    s++;
  }
}

void mkdown(FILE *in, FILE *out)
{
  tag_t tag = NONE;
  tag_t new_tag;
  format.bold = false;
  format.ital = false;
  while (fgets(Line, LINESIZE, in)) {
    new_tag = eval_line();
    if (tag != new_tag) {
      if (format.bold == true) {
        fputs("</b>", out);
        format.bold = false;
      }
      if (format.ital == true) {
        fputs("</i>", out);
        format.ital = false;
      }
      close_block(tag, out);
      tag = new_tag;
      start_block(tag, out);
    }
    print_line(text_offset(tag), out);
  }
  close_block(tag, out); /* done */
}

int main(int argc, char **argv)
{
  Line = calloc(LINESIZE, sizeof(char));
  if (Line == NULL) {
    fputs("out of memory error\n", stderr);
    return 1;
  }
  mkdown(stdin, stdout);
  /* done */
  free(Line);
  return 0;
}
I saved this program as mkdn.c and compiled it using BCC without errors:
C:\SRC>bcc -ansi -o mkdn.com mkdn.c
The mkdn program reads from standard input and writes to standard output, which is enough to do real work using a simple Markdown-like processor:
C:\SRC>cat hello.md
# Heading 1
Hello world!
This is a test file to see if my simple implementation of
Markdown works as expected.
Files can use *bold* and _italic_ text.
## Heading 2
This is a subheading, which should use H2.
Here's another line.
And the end of file.
C:\SRC>mkdn < hello.md > hello.htm
C:\SRC>cat hello.htm
<h1>Heading 1
</h1>
<p>Hello world!
This is a test file to see if my simple implementation of
Markdown works as expected.
</p>
<p>Files can use <b>bold</b> and <i>italic</i> text.
</p>
<h2>Heading 2
</h2>
<p>This is a subheading, which should use H2.
Here's another line.
And the end of file.
</p>
The markup itself is correct, although it lacks the <html> and <body> tags that are needed to make this technically valid HTML. But the file displays correctly in a web browser, such as when I copied this file to my Linux system and viewed it in Firefox.
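If a complete document is needed, the fragment can be wrapped after the fact. The following sketch is an assumption about how that might look, not part of mkdn.c: wrap_html is a hypothetical filter function, the title is a placeholder supplied by the caller, and the title text is assumed to contain no characters that need HTML escaping.

```c
#include <stdio.h>

/* Write a complete HTML page to `out`, copying the already-converted
   fragment from `in` into the <body>. Hypothetical helper; `title`
   is written as-is, so it must not contain '<', '>' or '&'. */
void wrap_html(FILE *in, FILE *out, const char *title)
{
  int c;

  fputs("<!DOCTYPE html>\n<html>\n", out);
  fprintf(out, "<head><title>%s</title></head>\n", title);
  fputs("<body>\n", out);

  while ((c = getc(in)) != EOF) { putc(c, out); }

  fputs("</body>\n</html>\n", out);
}
```

Called as wrap_html(stdin, stdout, "Hello"), this would act as a second filter that runs on the output of the converter.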

Bootstrapping your own tools
This was a demonstration of what you can do with a few simple tools. With only an editor and a compiler, you can create your own tools to do the work that you need to do. I created a simple file viewer and a more complicated Markdown-like text processor using only a line editor and a C compiler.
With more effort, I could have added more features to the Markdown-like system, such as support for block quotes and preformatted text. If I needed them, I could have written more interesting tools to help me with my work, including a program to check for typos in a text file.
Programming should be fun, but it should also be productive. Writing your own programs opens up the world to you, and knowing a little programming can carry you a long way. That’s also at the core of open source software: you can make your own programs and “scratch your own itch.”
The opinions expressed on this website are those of each author, not of the author's employer or All Things Open/We Love Open Source.