B

From Computernewb Wiki

B is an imperative, low-level, typeless, memory-unsafe, general-purpose programming language, now obsolete, created by Ken Thompson and Dennis Ritchie at Bell Laboratories in 1969 as a direct descendant of BCPL. B was designed as a minimalist language for system implementation, particularly for early UNIX development on the DEC PDP-7 and, later, the DEC PDP-11. Its main purpose was to provide a portable, higher-level alternative to assembly for writing operating system components and userspace utilities.

B is notable for being the immediate predecessor to C and for introducing a number of core concepts that became central to C and its descendants: pointer arithmetic, expression-based syntax, and a compact set of control structures.

Interestingly, unlike most modern languages, B is completely typeless, which means all data is represented as machine words and there is no compile-time type checking. This makes B highly flexible and close to hardware, but also error-prone and difficult to debug, especially by today's standards.
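
To make "typeless" a little more concrete, the following C sketch models memory as an array of machine words in which the same word can serve as a number or as an address. It is only an analogy: the word type, the mem array, and all helper names are assumptions made for illustration and are not part of B.

```c
#include <stdint.h>

/* Hypothetical model: every value is one 16-bit machine word. */
typedef int16_t word;

/* A tiny word-addressed memory; an "address" is just an index into it. */
word mem[16];

/* Dereference: treat any word as an address and fetch the word stored there. */
word deref(word address) {
    return mem[address];
}

/* Vector access v[i] modelled as "fetch the word at address v + i". */
word vector_get(word base, word index) {
    return deref((word)(base + index));
}

/* The same word is used first as an address and then as a number;
   the model, like B, performs no checking at all. */
word demo_typeless(void) {
    mem[3] = 42;                  /* store a number in cell 3 */
    mem[0] = 3;                   /* store the "address" 3 in cell 0 */
    word p = mem[0];              /* p holds 3, a plain word */
    return (word)(deref(p) + p);  /* 42 (p as address) + 3 (p as number) */
}
```

Nothing in the model distinguishes p the pointer from p the integer, which is the flexibility and the hazard described above.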

B's standard library (libb.a) was minimal and tailored for system development, offering functions for string manipulation, input/output and filesystem operations, time tracking, process and TTY management, as well as permission handling.

Syntax

B's syntax is bare-bones, providing only the simplest and most straightforward constructs for flow control, declarations, and arithmetic.

Comments

In B, only block comments /* ... */ exist. Line comments // ... were only added to C in C99.

Strings

String literal syntax in B notably differs from C:

  • Strings in B are enclosed in double quotes ("..."). This distinguishes them from character literals that are enclosed in single quotes ('...').
  • A string literal consists of zero or more characters, including letters, digits, whitespace, and escape sequences.
  • Internally, B appends a special end-of-string marker '*e' (ASCII EOT, value 4) to every string literal. It plays the role that the null terminator plays in C, signaling the end of the string in memory.
  • Characters are packed into machine words. On the PDP-11 (a 16-bit word machine), two 8-bit characters fit into a single word.
  • Escape sequences are written as an asterisk followed by a single character (*<char>).

It is mainly the escape sequence syntax that differs, because B uses the asterisk (*) character instead of the backslash (\) character to introduce escapes:

Escape Sequence | Represents
*0 | NULL
*e | End-of-file (automatically appended by the compiler)
*( | { (opening curly brace)
*) | } (closing curly brace)
*t | Horizontal tab character
** | Asterisk character (*)
*' | Single quote (')
*" | Double quote (")
* (asterisk + space) | A space character
*n | A newline character

Other characters that are non-standard escape sequences are treated literally, for instance *w would simply become w.
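
The escape rules above can be sketched as a small C routine that turns B-style escapes into raw bytes. This is an illustration only, not how any B compiler is actually written: the name b_unescape and its interface are hypothetical, and the value 4 for *e assumes the ASCII EOT reading of that marker.

```c
#include <stddef.h>

/* Translate B-style escapes ("*n", "*t", ...) in src into raw bytes in dst.
   Returns the number of bytes written, not counting the '\0' that is added
   only for C's convenience. dst must have room for strlen(src) + 1 bytes. */
size_t b_unescape(const char *src, char *dst) {
    size_t out = 0;
    while (*src != '\0') {
        char c = *src++;
        if (c == '*' && *src != '\0') {
            char e = *src++;
            switch (e) {
            case '0': c = 0;    break;  /* null */
            case 'e': c = 4;    break;  /* assumed: ASCII EOT */
            case '(': c = '{';  break;
            case ')': c = '}';  break;
            case 't': c = '\t'; break;
            case 'n': c = '\n'; break;
            default:  c = e;    break;  /* **, *', *", "* " and unknown escapes */
            }
        }
        dst[out++] = c;
    }
    dst[out] = '\0';
    return out;
}
```

The default branch covers both the escapes that stand for themselves and the "treated literally" rule, so *w comes out as w.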

Other important things to note when working with strings:

  • External variables can be initialised with string literals where they are defined, at the top level of the source file: greeting "Hello, world"; (the compiler appends *e itself).
  • A string literal evaluates to the address of its first word, so it can be stored in any variable: auto s; s = "Hello, world"; This copies the address, not the characters.
  • Because characters are packed into words, the subscript operator [] yields whole words: s[0] is the first word of the string, not its first character. Individual characters are read with char(s, i) and stored with lchar(s, i, c).
  • Pointer arithmetic on strings differs from C for the same reason: incrementing a pointer advances word by word, not character by character.
  • A string literal must fit on a single source line. Unlike in C, adjacent literals are not joined, and B has no preprocessor. Use *n for a line break inside the text.
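
How packed strings and character accessors fit together can be sketched in C. The sketch assumes 8-bit characters stored two per 16-bit word with the first character in the low byte, and a terminator value of 4 (ASCII EOT); both are assumptions for illustration. The helpers b_char and b_lchar are hypothetical stand-ins modelled on libb's char and lchar, and b_pack and b_strlen exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define B_EOT 4  /* assumed value of the *e terminator (ASCII EOT) */

/* Read the i-th character of a packed string: two 8-bit characters per
   16-bit word, first character in the low byte (an assumption). */
int b_char(const uint16_t *s, size_t i) {
    uint16_t w = s[i / 2];
    return (i % 2 == 0) ? (w & 0xFF) : (w >> 8);
}

/* Store character c as the i-th character of a packed string. */
void b_lchar(uint16_t *s, size_t i, int c) {
    uint16_t *w = &s[i / 2];
    if (i % 2 == 0)
        *w = (uint16_t)((*w & 0xFF00u) | (uint16_t)(c & 0xFF));
    else
        *w = (uint16_t)((*w & 0x00FFu) | (uint16_t)((c & 0xFF) << 8));
}

/* Pack a C string into words and append the terminator.
   dst needs room for strlen(src) / 2 + 1 words. Returns the words used. */
size_t b_pack(const char *src, uint16_t *dst) {
    size_t i = 0;
    for (; src[i] != '\0'; ++i) {
        if (i % 2 == 0)
            dst[i / 2] = 0;  /* clear each word before its first use */
        b_lchar(dst, i, (unsigned char)src[i]);
    }
    if (i % 2 == 0)
        dst[i / 2] = 0;
    b_lchar(dst, i, B_EOT);
    return i / 2 + 1;
}

/* Length in characters, scanning with b_char rather than with s[i]. */
size_t b_strlen(const uint16_t *s) {
    size_t n = 0;
    while (b_char(s, n) != B_EOT)
        ++n;
    return n;
}
```

In this model, packing "Hi!" takes two words, and indexing the word array directly returns two characters at once, which is why a character-level accessor is needed.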

Operators

Group | Operator(s) | Description
Unary | * | Dereference (indirection): fetches the value a pointer refers to.
Unary | & | Address of: returns the address of a value.
Unary | - | Negation: negates the value, interpreting it as an integer.
Unary | ! | Logical NOT: interprets the value as an integer and returns 0 if the operand is non-zero and 1 if the operand is zero.
Unary | ++, -- | Increment/decrement: may be used in prefix or postfix form; increments/decrements the value and returns the value either after (prefix) or before (postfix) the operation, as in C.
Multiplicative | * | Arithmetic multiplication.
Multiplicative | / | Division. Truncated toward zero if operands are positive. Undefined otherwise.
Multiplicative | % | Modulo. Valid if operands are positive. Undefined otherwise.
Additive | +, - | Arithmetic addition/subtraction of two values.
Binary shift | >>, << | Vacated bits are filled with zeros. Undefined for invalid shift counts.
Relational | <, <=, >, >= | Checks the relation (less than, less than or equal to, greater than, greater than or equal to) of two values.
Equality | ==, != | Checks whether two values are equal or not.
Bitwise | |, & | Operands are treated as bit patterns; the result is the bitwise OR or AND of both.
Ternary | a ? b : c | Evaluates to b if a is non-zero, otherwise to c; the expression form of if (a) { b } else { c }.
Assignment | =+, =-, =*, =/, =%, =<<, =>>, =&, =| | Compound assignments. x =op y is equivalent to x = x op y.
Relational assignments | ===, =!=, =<, =<=, =>, =>= | Compound relational assignments. x =op y is equivalent to x = (x op y).

Notes:

  • Compound assignment puts the = first: it is =+, not +=, and likewise for the other operators.
  • Regarding non-standard relational assignments, an operator like === would mean x = (x == y).
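
The meaning of the compound forms can be spelled out in C. The functions below are hypothetical helpers written only to show what each B operator expands to; C itself has no =+ or === operators.

```c
/* B: x =+ y;    same effect as: x = x + y; */
int assign_add(int *x, int y) {
    *x = *x + y;
    return *x;
}

/* B: x =<< y;   same effect as: x = x << y; */
int assign_shl(int *x, int y) {
    *x = *x << y;
    return *x;
}

/* B: x === y;   same effect as: x = (x == y); x becomes 1 or 0 */
int assign_eq(int *x, int y) {
    *x = (*x == y);
    return *x;
}

/* B: x =< y;    same effect as: x = (x < y); x becomes 1 or 0 */
int assign_lt(int *x, int y) {
    *x = (*x < y);
    return *x;
}
```

One consequence of the =op spelling is that spacing matters: x=-1 reads as x =- 1, which subtracts 1 from x rather than assigning -1.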

Keywords

Keyword | Description
auto | Declares automatic (stack/local) variables
extrn | Declares external (global) variables/functions
if | Conditional statement
else | Alternative branch for if
while | Loop statement
switch | Multi-way branch (jump table)
case | Label for switch statement
goto | Unconditional jump
return | Exit from function, optionally with a value

Return

return is a special keyword: a returned value must always be enclosed in parentheses:

return a + b; /* wrong */
return (a + b); /* correct */

Declarations

In B there are three types of declarations: automatic (auto), external (extrn), and internal.

Automatic declarations

Automatic declarations are stack-allocated local variables that are created anew on every function call. They are denoted by the auto keyword, just like in C, and can also be initialised as follows:

auto x, y, z; /* declare x, y, and z with undefined values */
auto meow 16; /* declare meow with default value 16 */
auto ari_lt 69, collab_vm 124; /* ari_lt = 69, collab_vm = 124 */

Automatic declarations must not use the assignment operator (=); the initial value simply follows the name.

External declarations

External declarations bring external symbols, such as those from linked libraries like libb (B's standard library), into local scope. The symbols are resolved before the program is even run. An extrn declaration cannot give the symbol an initial value:

extrn printf; /* bring libb's printf function into scope */
extrn char, lchar; /* bring both the char and lchar libb functions into scope */
...

It is important to note that every function defined in your B source is exported.

Internal declarations

Internal declarations are static local variables that belong to a single function and are only available inside it. This is the default storage type in B, denoted by an identifier alone, without any keyword:

name; /* like C's: static int name; */
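
Since the comparison is to C's static locals, a short C sketch may help show what "static" implies in practice: the variable keeps its value between calls, unlike an automatic one. The function names are made up for this example.

```c
/* Counts how many times it has been called. */
int next_id(void) {
    static int counter;   /* zero-initialised once, keeps its value between calls */
    counter = counter + 1;
    return counter;
}

/* For contrast: an automatic variable starts afresh on every call. */
int always_one(void) {
    int counter = 0;
    counter = counter + 1;
    return counter;
}
```

Calling next_id() repeatedly yields 1, 2, 3, and so on, while always_one() returns 1 every time.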

Conditionals

The syntax is the same as in C:

if (condition) statement else ...;

/* or */

if (condition) {
    statement;
    statement;
    ...
} else {
    ...
}

switch statements

Jump tables in B work the same as in C, except that there is no 'default' label (execution just falls through) and you have to simulate break yourself by jumping to a label placed after the switch:

switch (...) {
    case 1:
        ...;
        goto after;
    case 2:
        ...;
        goto after;
    ...
}
after:
...;
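
The goto-based exit can be tried out in C, where goto, labels, and case fall-through behave the same way. The sketch below is a C analogue with made-up function names, not B code; it contrasts jumping to a label placed after the switch with letting the cases fall through.

```c
/* Returns 10 for 1, 20 for 2, and -1 for anything else. */
int classify(int n) {
    int result = -1;   /* value kept when no case matches */
    switch (n) {
    case 1:
        result = 10;
        goto done;     /* plays the role of C's break */
    case 2:
        result = 20;
        goto done;
    }
done:
    return result;
}

/* Without the jumps, case 1 runs on into case 2. */
int classify_fallthrough(int n) {
    int result = -1;
    switch (n) {
    case 1:
        result = 10;
        /* fall through */
    case 2:
        result = 20;
    }
    return result;
}
```

Here classify(1) gives 10, while classify_fallthrough(1) gives 20 because the first case continues into the second. The exit label has to come after the switch; a label placed before it would re-enter the switch instead of leaving it.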

while loops

Loop syntax is exactly the same as in C. There are no for loops, but you can emulate one with a while loop:

auto x 0;

while (x < 10) {
    printf("Hello*n");
    ++x;
}

/* Same as this C code: */

for (int x = 0; x < 10; ++x) {
    printf("Hello\n");
}

Curly braces are optional just like in conditionals.

Vectors (arrays)

Vectors in B have the following form:

name[size]{a, b, c, ...}

For instance:

auto nums[5] {1, 2, 3, 4, 5};

Vector operations work the same as in C, by using the [] operator to access zero-indexed elements:

nums[2]; /* returns the 3rd element */

Labels

Labels mark a place in a function that the goto keyword can jump to unconditionally. They are defined as follows:

some_label:
...
...
goto some_label; /* jumps back up to some_label */

Functions

In B, functions are defined using the following syntax:

function_name(parameter_list) {
    /* Function body */
    [return (value);] /* Optional, depending on the function */
}

Functions in B do not declare a return type. By convention, a function returns a value if a return statement with a value is executed; otherwise, the value it returns is undefined. A function name is any valid identifier. The parameter list is a comma-separated list of parameter names (no types), and parameters are always passed by value. If a function takes no arguments, the parameter list is left empty:

myfunc() {
    /* ... */
}

Functions with and without a return value are both valid:

add(a, b) {
    return (a + b);
}

hello() {
    extrn printf; /* So printf() is present */
    printf("Hello, world!*n");
}

Note that, just like in while loops and conditionals, the curly braces are optional when the body is a single statement.

main() function

The main() function is one of the two functions that B calls implicitly; every program has a hidden sequence of:

main(); exit();

Unlike in C, the main function takes no arguments. The vector argv (CLI arguments) and argc (the size of argv) are accessed through an external declaration:

extrn argv, argc;

Standard Library

The standard library of B is very simple:

Function | Prototype / Usage | Description | Return Value / Notes
char | c = char(string, i); | Returns the i-th character of the string | Character at position i
chdir | error = chdir(string); | Changes current directory to string | Negative on error
chmod | error = chmod(string, mode); | Changes file mode (permissions) | Negative on error
chown | error = chown(string, owner); | Changes file owner | Negative on error
close | error = close(file); | Closes open file descriptor | Negative on error
creat | file = creat(string, mode); | Creates or truncates file, opens for writing | File descriptor or negative on error
ctime | ctime(time, date); | Converts system time vector to human-readable date string | Fills 8-word vector date
execl | execl(string, arg0, arg1, ..., 0); | Replaces current process with executable, passing arguments | Returns only on error
execv | execv(string, argv, count); | Replaces current process with executable, passing argument vector | Returns only on error
exit | exit(); | Terminates current process | Does not return
fork | error = fork(); | Creates child process | 0 in child, child's PID in parent, negative on error
fstat | error = fstat(file, status); | Gets file status info into 20-word vector | Negative on error
getchar | char = getchar(); | Reads next character from standard input | Returns *e on EOF
getuid | id = getuid(); | Returns user ID of current process | User ID
gtty | error = gtty(file, ttystat); | Gets teletype modes into 3-word vector | Negative on error
lchar | error = lchar(string, i, char); | Stores char into i-th character of string | Negative on error
link | error = link(string1, string2); | Creates link string2 to existing file string1 | Negative on error
mkdir | error = mkdir(string, mode); | Creates directory with given mode | Negative on error
open | file = open(string, mode); | Opens file for reading (mode=0) or writing (mode≠0) | File descriptor or negative on error
printf | printf(format, arg1, ...); | Formatted output to standard output |
printn | printn(number, base); | Prints number in specified base |
putchar | putchar(char); | Writes character to standard output |
read | nread = read(file, buffer, count); | Reads count bytes into buffer from file | Number of bytes read or negative on error
seek | error = seek(file, offset, pointer); | Sets file pointer relative to beginning (0), current (1), or end (2) | Negative on error
setuid | error = setuid(id); | Sets user ID of current process | Negative on error
stat | error = stat(string, status); | Gets file status info into 20-word vector | Negative on error
stty | error = stty(file, ttystat); | Sets teletype modes from 3-word vector | Negative on error
time | time(timev); | Gets current system time into 2-word vector |
unlink | error = unlink(string); | Removes link specified by string | Negative on error
wait | error = wait(); | Suspends process until child terminates | Child's PID or negative on error
write | nwrite = write(file, buffer, count); | Writes count bytes from buffer to file | Number of bytes written or negative on error

Note:

  • The argv vector is predefined and contains the command-line arguments passed to the program.
  • Many of these functions return negative values to indicate errors, following Unix conventions.
  • printf and printn provide formatted and numeric output, respectively.
  • File descriptors are integers returned by open, creat, and used by read, write, close, etc.
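
A routine like printn has to emit the most significant digit first, and recursion is a common way to do that. The C sketch below illustrates the idea under stated assumptions: it is not libb's source, the name sketch_printn and its buffer parameters are hypothetical, and it only handles non-negative numbers in bases 2 to 10. It writes into a buffer instead of calling putchar so that the result can be inspected.

```c
#include <stddef.h>

/* Append the digits of n in the given base to buf, starting at *pos.
   Assumes 2 <= base <= 10; the caller provides enough room in buf. */
void sketch_printn(unsigned n, unsigned base, char *buf, size_t *pos) {
    unsigned rest = n / base;
    if (rest != 0)
        sketch_printn(rest, base, buf, pos);  /* higher-order digits first */
    buf[(*pos)++] = (char)('0' + n % base);   /* then this call's own digit */
    buf[*pos] = '\0';
}
```

Each call peels off the lowest digit but stores it only after the recursive call has handled the digits above it, so the digits come out in reading order without a reversal step.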

Compilers

Examples

Hello World Loop

main() {
    extrn printf;
    auto x 0;

    while (x < 5) {
        printf("Hello, world!*n");
        ++x;
    }
}

Summing a Vector

main() {
    extrn printf;
    auto nums[5] {1, 2, 3, 4, 5};
    auto idx 0, sum 0;

    while (idx < 5) {
        sum =+ nums[idx]; /* index with idx, the loop counter */
        ++idx;
    }

    printf("Sum is %d*n", sum);
}

String Length

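strlen(s) {
    extrn char;
    auto idx 0;
    /* characters are packed into words, so read them with char() */
    while (char(s, idx) != '*e')
        ++idx;
    return (idx);
}

main() {
    extrn printf;
    auto mystr "B language!"; /* the compiler appends *e */
    auto len;

    len = strlen(mystr);
    printf("Length: %d*n", len);
}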

Sources