B
B is an imperative, low-level, typeless, memory-unsafe, general-purpose programming language, now obsolete, created by Ken Thompson and Dennis Ritchie at Bell Laboratories in 1969 as a direct descendant of BCPL. B was designed as a minimalist language for system implementation, particularly for early UNIX development on the DEC PDP-7 and later the DEC PDP-11. Its main purpose was to provide a portable, higher-level alternative to assembly for writing operating system components and userspace utilities.
B is notable for being the immediate predecessor to C and for introducing a number of core concepts that became central to C and its descendants: pointer arithmetic, expression-based syntax, and a compact set of control structures.
Unlike most modern languages, B is completely typeless: all data is represented as machine words and there is no compile-time type checking. This makes B flexible and close to the hardware, but also error-prone and difficult to debug, especially by today's standards.
B's standard library (libb.a) was minimal and tailored for system development, offering functions for string manipulation, input/output and filesystem operations, time tracking, process and TTY management, as well as permission handling.
Syntax
B's syntax is very bare-bones, providing only the simplest and most straightforward operations for flow control, declarations, and arithmetic.
Comments
In B, only block comments /* ... */ exist. Inline comments // ... are a feature of C99 and above.
Strings
String literal syntax in B differs notably from C:
- Strings in B are enclosed in double quotes ("..."). This distinguishes them from character literals, which are enclosed in single quotes ('...').
- A string literal consists of zero or more characters, including letters, digits, whitespace, and escape sequences.
- Internally, B appends a special end-of-string marker '*e' (ASCII EOT, value 4) to every string literal. It plays the role that the null terminator plays in C, signaling the end of the string in memory.
- On the PDP-11 (a 16-bit word machine), B packs two characters into a single machine word.
- Escape sequences are written as *<?>, where <?> is a single character.
It is mainly the escape sequence syntax that differs, because B uses the asterisk (*) rather than the backslash (\) to introduce escapes:
Escape Sequence | Represents |
---|---|
*0 | NULL |
*e | end-of-file (automatically appended by the compiler) |
*( | { (opening curly brace) |
*) | } (closing curly brace) |
*t | Horizontal tab character |
** | Asterisk character (*) |
*' | Single quote (') |
*" | Double quote (") |
* (asterisk + space) | A space character |
*n | A newline character. |
Other characters that are non-standard escape sequences are treated literally, for instance *w would simply become w.
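The escape rules above can be sketched as a small translation routine. The example is written in C, the language this page compares B against, and is only an illustration: b_unescape and B_EOT are hypothetical names, and the value 4 for '*e' is an assumption based on ASCII EOT.

```c
#include <stddef.h>

/* Assumption: '*e' maps to ASCII EOT (4). */
#define B_EOT 4

/*
 * Translate the body of a B string literal (the text between the quotes)
 * into raw bytes and append the '*e' terminator.
 * Returns the number of bytes written, terminator included,
 * or 0 if dst (capacity cap) is too small.
 */
size_t b_unescape(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    while (*src != '\0') {
        char c = *src++;
        if (c == '*' && *src != '\0') {
            char e = *src++;
            switch (e) {
            case '0': c = '\0';  break;
            case 'e': c = B_EOT; break;
            case '(': c = '{';   break;
            case ')': c = '}';   break;
            case 't': c = '\t';  break;
            case 'n': c = '\n';  break;
            default:  c = e;     break; /* **, *', *", "* " and unknown escapes */
            }
        }
        if (n + 1 >= cap)               /* keep one byte free for the terminator */
            return 0;
        dst[n++] = c;
    }
    if (n >= cap)
        return 0;
    dst[n++] = B_EOT;
    return n;
}
```

Calling b_unescape("Hi*n", buf, sizeof buf) with a large enough buf stores H, i, a newline and the terminator, and returns 4.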
Other important things to note when working with strings:
- External variables can be initialised with string literals directly in their definition: greeting "Hello, world";
- A string literal evaluates to the address of its vector of packed characters, so it can be assigned to any variable: auto buffer; buffer = "Hello, world";
- The subscript operator [] indexes whole machine words, not characters: buffer[0] is the first word of the string, which holds several packed characters. Use the library function char to fetch an individual character: char(buffer, 0) (first character).
- Pointer arithmetic likewise steps word by word rather than character by character, unlike a char pointer in C.
- A string literal must fit on a single line and cannot be joined across multiple lines like in C; use *n to embed a newline. This is because there is no pre-processing in B like there is in C.
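When characters are packed several to a word, reading one character means picking a word and then a byte inside it, which is the job of the library's character access functions. A minimal sketch in C follows. The names b_word, b_get_char and b_put_char are hypothetical, and the layout (16-bit words, two 8-bit characters per word, character 0 in the low-order byte) is an assumption rather than a documented fact about any particular B compiler.

```c
#include <stdint.h>

/* Assumed layout: 16-bit words, two 8-bit characters per word,
 * character 0 stored in the low-order byte. */
typedef uint16_t b_word;

/* Fetch the i-th character of a packed string. */
int b_get_char(const b_word *s, int i)
{
    b_word w = s[i / 2];                 /* word that holds character i */
    return (i % 2 == 0) ? (w & 0xFF) : (w >> 8);
}

/* Store c as the i-th character of a packed string. */
void b_put_char(b_word *s, int i, int c)
{
    b_word *w = &s[i / 2];
    if (i % 2 == 0)
        *w = (b_word)((*w & 0xFF00) | (c & 0xFF));          /* replace low byte */
    else
        *w = (b_word)((*w & 0x00FF) | ((c & 0xFF) << 8));   /* replace high byte */
}
```

Under this layout s[0] is a whole word containing the first two characters, which is why a plain subscript is not enough to read a single character.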
Operators
Group | Operator(s) | Description |
---|---|---|
Unary | * | Dereference: Dereferences a pointer (indirection) |
Unary | & | Address of: Returns an address to a value. |
Unary | - | Negation: Negates the value, interpreting it as an integer. |
Unary | ! | Logical NOT: Interprets the value as an integer and returns 0 if operand is non-zero and 1 if the operand is zero. |
Unary | ++ , -- | Increment/decrement: May be used in prefix or postfix form, increments/decrements the value, and returns either the value before or after the operation (same as C behaviour) |
Multiplicative | * | Arithmetic multiplication. |
Multiplicative | / | Division. Truncated toward zero if operands are positive. Undefined otherwise. |
Multiplicative | % | Modulo. Valid if operands are positive. Undefined otherwise. |
Additive | + , - | Arithmetic add/subtract of two values. |
Binary shift | >> , << | Vacated bits filled with zeros. Undefined for invalid shift counts. |
Relational | < , <= , > , >= | Checks the relation (less than, less than or equal to, greater than, greater than or equal to) of two values. |
Equality | == , != | Checks if values are equal or not. |
Bitwise | \| , & | Operands are treated as bit patterns; result is the bitwise AND or OR of both |
Ternary | a ? b : c | Same as if (a) { b } else { c } |
Assignment | =+ , =- , =* , =/ , =% , =<< , =>> , =& , =\| | Compound assignments. Equivalent to x = x op y . |
Relational assignments | === , =!= , =< , =<= , => , =>= | Compound relational assignments. Equivalent to x = x op y . |
Notes:
- Yes, it's =+ and not +=, and likewise for the other compound assignments.
- Regarding the non-standard relational assignments, an operator like === means x = (x == y).
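The compound forms can be spelled out in C to make the "x = x op y" reading concrete. The helper names below are hypothetical and exist only for illustration. Each one takes the address of x, applies the operation, and returns the new value of x.

```c
/* B: x =+ y;    C: x += y; */
long b_assign_add(long *x, long y)
{
    *x = *x + y;
    return *x;
}

/* B: x === y;   C: x = (x == y); */
long b_assign_eq(long *x, long y)
{
    *x = (*x == y);
    return *x;
}

/* B: x =< y;    C: x = (x < y); */
long b_assign_lt(long *x, long y)
{
    *x = (*x < y);
    return *x;
}
```

Note that the relational forms overwrite x with 0 or 1, so the original value of x is lost after the comparison.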
Keywords
Keyword | Description |
---|---|
auto | Declares automatic (stack/local) variables |
extrn | Declares external (global) variables/functions |
if | Conditional statement |
else | Alternative branch for if |
while | Loop statement |
switch | Multi-way branch (jump table) |
case | Label for switch statement |
goto | Unconditional jump |
return | Exit from function, optionally with a value |
Return
return is a special keyword: a returned value must always be enclosed in parentheses:
return a + b; /* wrong */
return (a + b); /* correct */
Declarations
In B there are three types of declarations: automatic (auto), external (extrn), and internal.
Automatic declarations
Automatic declarations are stack-allocated local variables that are created anew on every function call. They are denoted by the auto keyword, just like in C, and can also be initialised as follows:
auto x, y, z; /* declare x, y, and z with undefined values */
auto meow 16; /* declare meow with default value 16 */
auto ari_lt 69, collab_vm 124; /* ari_lt = 69, collab_vm = 124 */
Automatic declarations must not use the assignment operator (=); the initial value follows the name directly.
External declarations
External declarations bring external symbols, such as those from linked libraries like libb (B's standard library), into local scope. The symbols are resolved before the program is even run. An extrn declaration cannot give a symbol a default value:
extrn printf; /* load the printf libb function */
extrn char, lchar; /* load both char and lchar libb functions */
...
It is important to note that every function defined in your B source is exported.
Internal declarations
Internal declarations are static variables that are local to a function and available only to that function. Internal is the default storage type in B and is denoted by an identifier without any keyword:
name; /* static int name; */
Conditionals
The syntax is the same as in C:
if (condition) statement else ...;
/* or */
if (condition) {
statement;
statement;
...
} else {
...
}
switch statements
Jump tables in B work the same as in C, except that there is no 'default' label (just fall-through) and there is no break, so you have to simulate it yourself with goto:
switch (...) {
case 1:
    ...;
    goto after;
case 2:
    ...;
    goto after;
...
}
after:
...
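The same goto-as-break pattern can be tried out in C, which also accepts it. This is a sketch with a hypothetical function name (describe) and made-up case values. The label sits after the switch so that the jump leaves it, and a case without a goto falls through into the next one.

```c
/* Written in B's style: no default label, no break,
 * and an explicit goto to leave the switch. */
int describe(int n)
{
    int result = 0;          /* value kept when no case matches */

    switch (n) {
    case 1:
        result = 10;
        goto done;
    case 2:
        result = 20;
        /* fall through */
    case 3:
        result = result + 30;
        goto done;
    }
done:
    return result;
}
```

Here describe(1) yields 10, describe(2) yields 50 because case 2 falls through into case 3, describe(3) yields 30, and any other argument yields 0.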
while loops
Loop syntax is exactly the same as in C. There are no for loops, but you can achieve the same effect with a while loop:
auto x 0;
while (x < 10) {
printf("Hello*n");
++x;
}
/* Same as, in C: */
for (int x = 0; x < 10; ++x) {
printf("Hello\n");
}
Curly braces are optional just like in conditionals.
Vectors (arrays)
Vectors in B have the following form:
name[size]{a, b, c, ...}
For instance:
auto nums[5] {1, 2, 3, 4, 5};
Vector operations work the same as in C, by using the [] operator to access zero-indexed elements:
nums[2]; /* returns the 3rd element */
Labels
Labels in B are used together with the goto keyword to jump unconditionally to a place in the code. They are defined as follows:
some_label:
...
...
goto some_label; /* jumps back up to some_label */
Functions
In B, functions are defined using the following syntax:
function_name(parameter_list) {
/* Function body */
[return (value);] /* Optional, depending on the function */
}
Functions in B do not declare a return type. A function returns a value if a return statement with a value is executed; otherwise, it returns an undefined value. A function name can be any valid identifier. The parameter list is a comma-separated list of parameter names (no types). Parameters are always passed by value. If a function takes no arguments, the parameter list is left empty:
myfunc() {
/* ... */
}
Functions with and without parameters, and with and without a return value, are all valid:
add(a, b) {
return (a + b);
}
hello() {
    extrn printf; /* So printf() is present */
    printf("Hello, world!*n");
}
Note that the curly braces are optional when the body is a single statement, just like in while loops and conditionals.
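Because a parameter list names its parameters without saying what they hold, the same word can be treated as a number in one function and as an address in another. The C sketch below imitates this with a single integer type wide enough to hold an address. The names word and first are hypothetical, and the casts mark the places where C demands a type that B never asks for.

```c
#include <stdint.h>

/* A machine word: wide enough to hold either an integer or an address. */
typedef intptr_t word;

/* B: add(a, b) { return (a + b); } */
word add(word a, word b)
{
    return a + b;
}

/* B: first(v) { return (*v); }
 * Nothing checks that v really is an address; the caller must get it right. */
word first(word v)
{
    return *(word *)v;
}
```

Passing an ordinary number to first would compile just as happily, which is the kind of mistake B's lack of type checking lets through.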
main() function
The main() function is one of the two functions that B calls implicitly. Every program has a hidden sequence of:
main(); exit();
Unlike in C, the main function has no arguments. The vector argv (CLI arguments) and argc (the size of argv) are accessed through an external declaration:
extrn argv, argc;
Standard Library
The standard library of B is very simple:
Function | Prototype / Usage | Description | Return Value / Notes |
---|---|---|---|
char | c = char(string, i); | Returns the i-th character of the string | Character at position i |
chdir | error = chdir(string); | Changes current directory to string | Negative on error |
chmod | error = chmod(string, mode); | Changes file mode (permissions) | Negative on error |
chown | error = chown(string, owner); | Changes file owner | Negative on error |
close | error = close(file); | Closes open file descriptor | Negative on error |
creat | file = creat(string, mode); | Creates or truncates file, opens for writing | Negative on error |
ctime | ctime(time, date); | Converts system time vector to human-readable date string | Fills 8-word vector date |
execl | execl(string, arg0, arg1, ..., 0); | Replaces current process with executable, passing arguments | Returns on error |
execv | execv(string, argv, count); | Replaces current process with executable, passing argument vector | Returns on error |
exit | exit(); | Terminates current process | Does not return |
fork | error = fork(); | Creates child process | 0 in child, child's PID in parent, negative on error |
fstat | error = fstat(file, status); | Gets file status info into 20-word vector | Negative on error |
getchar | char = getchar(); | Reads next character from standard input | Returns *e on EOF |
getuid | id = getuid(); | Returns user ID of current process | User ID |
gtty | error = gtty(file, ttystat); | Gets teletype modes into 3-word vector | Negative on error |
lchar | error = lchar(string, i, char); | Stores char into i-th character of string | Negative on error |
link | error = link(string1, string2); | Creates link string2 to existing file string1 | Negative on error |
mkdir | error = mkdir(string, mode); | Creates directory with given mode | Negative on error |
open | file = open(string, mode); | Opens file for reading (mode=0) or writing (mode≠0) | File descriptor or negative on error |
printf | printf(format, arg1, ...); | Formatted output to standard output | - |
printn | printn(number, base); | Prints number in specified base | - |
putchar | putchar(char); | Writes character to standard output | - |
read | nread = read(file, buffer, count); | Reads count bytes into buffer from file | Number of bytes read or negative on error |
seek | error = seek(file, offset, pointer); | Sets file pointer relative to beginning (0), current (1), or end (2) | Negative on error |
setuid | error = setuid(id); | Sets user ID of current process | Negative on error |
stat | error = stat(string, status); | Gets file status info into 20-word vector | Negative on error |
stty | error = stty(file, ttystat); | Sets teletype modes from 3-word vector | Negative on error |
time | time(timev); | Gets current system time into 2-word vector | - |
unlink | error = unlink(string); | Removes link specified by string | Negative on error |
wait | error = wait(); | Suspends process until child terminates; returns child's PID | Negative on error |
write | nwrite = write(file, buffer, count); | Writes count bytes from buffer to file | Number of bytes written or negative on error |
Notes:
- The argv vector is predefined and contains the command-line arguments passed to the program.
- Many of these functions return negative values to indicate errors, following Unix conventions.
- printf and printn provide formatted and numeric output, respectively.
- File descriptors are integers returned by open and creat, and used by read, write, close, etc.
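To illustrate what a routine like printn(number, base) has to do, here is one plausible recursive approach in C. It is not libb's actual source. The name printn_to is hypothetical, the function writes into a caller-supplied buffer instead of calling putchar so the result is easy to inspect, and it assumes a non-negative number and a base between 2 and 10.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the digits of n in the given base into buf, starting at index pos,
 * and return the index just past the last digit written.
 * The caller must make sure buf is large enough; no terminator is added.
 */
size_t printn_to(char *buf, size_t pos, unsigned n, unsigned base)
{
    unsigned rest;

    assert(base >= 2 && base <= 10);    /* only digits '0'..'9' are produced */

    rest = n / base;
    if (rest != 0)
        pos = printn_to(buf, pos, rest, base);   /* leading digits first */
    buf[pos++] = (char)('0' + n % base);         /* then the last digit */
    return pos;
}
```

The recursion emits the most significant digits before the least significant one, so no temporary buffer or digit reversal is needed.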
Compilers
- https://github.com/Spydr06/BCause - a compiler for the old B programming language for modern systems written in C and C++.
- https://github.com/tsoding/b - B Programming language compiler implemented in Crust.
- https://cpjsmith.uk/b - a B compiler from 2014 written in C.
- https://github.com/aap/abc - a B Compiler for x86 written in C and assembly.
- https://github.com/AlexCeleste/ybc - Yasha's B Compiler for x86 (32-bit).
- https://github.com/sergev/bcpl-compiler - B language compiler based on legacy BCPL sources.
Examples
Hello World Loop
main() {
    extrn printf;
    auto x 0;
    while (x < 5) {
        printf("Hello, world!*n");
        ++x;
    }
}
Summing a Vector
main() {
    extrn printf;
    auto nums[5] {1, 2, 3, 4, 5};
    auto idx 0, sum 0;
    while (idx < 5) {
        sum =+ nums[idx]; /* add the current element to the running total */
        ++idx;
    }
    printf("Sum is %d*n", sum);
}
String Length
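strlen(s) {
    extrn char;
    auto idx 0;
    /* characters are packed into words, so read them with char() */
    while (char(s, idx) != '*e')
        ++idx;
    return (idx);
}
main() {
    extrn printf, strlen;
    auto mystr "B language!"; /* the compiler appends *e automatically */
    auto len;
    len = strlen(mystr);
    printf("Length: %d*n", len);
}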