The worst mistake of computer science

Uglier than a Windows backslash, odder than ===, more common than PHP, more unfortunate than CORS, more disappointing than Java generics, more inconsistent than XMLHttpRequest, more confusing than a C preprocessor, flakier than MongoDB, and more regrettable than UTF-16, the worst mistake in computer science was introduced in 1965.


I call it my billion-dollar mistake…At that time, I was designing the first comprehensive type system for references in an object-oriented language. My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
– Tony Hoare, co-designer of ALGOL W.

In commemoration of the 50th anniversary of Sir Tony Hoare’s null, this article explains what null is, why it is so terrible, and how to avoid it.

What is wrong with NULL?

The short answer: NULL is a value that is not a value. And that’s a problem.

It has festered in the most popular languages of all time and is now known by many names: NULL, nil, null, None, Nothing, Nil, nullptr. Each language has its own nuances.

Some of the problems caused by NULL apply only to a particular language, while others are universal; a few are simply different facets of a single issue.

NULL…

  1. subverts types
  2. is sloppy
  3. is a special case
  4. makes poor APIs
  5. exacerbates poor language decisions
  6. is difficult to debug
  7. is non-composable

1. NULL subverts types

Statically typed languages check the uses of types in a program without actually executing it, providing certain guarantees about program behavior.

For example, in Java, if I write x.toUpperCase(), the compiler will inspect the type of x. If x is known to be a String, the type check succeeds; if x is known to be a Socket, the type check fails.

Static type checking is a powerful aid in writing large, complex software. But for Java, these wonderful compile-time checks suffer from a fatal flaw: any reference can be null, and calling a method on null produces a NullPointerException. Thus,

  • toUpperCase() can be safely called on any String…unless the String is null.
  • read() can be called on any InputStream…unless the InputStream is null.
  • toString() can be called on any Object…unless the Object is null.

Java is not the only culprit; many other type systems have the same flaw, including, of course, ALGOL W.

In these languages, NULL is above type checks. It slips through them silently, waiting until runtime to finally burst free in a shower of errors. NULL is the nothing that is simultaneously everything.
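The flaw is easy to demonstrate. Below is a minimal, self-contained sketch (the class and method names are invented for illustration): the call type-checks for every String, yet null also inhabits String, so the failure surfaces only at runtime.

```java
public class NullSubvertsTypes {
    // Type-checks for any String argument...
    static String shout(String s) {
        return s.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(shout("hello")); // HELLO

        // ...but null also inhabits String, so this compiles
        // and fails only when executed.
        try {
            shout(null);
        } catch (NullPointerException e) {
            System.out.println("NPE at runtime, despite passing the type check");
        }
    }
}
```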

2. NULL is sloppy

There are many times when it doesn’t make sense to have a null. Unfortunately, if the language permits anything to be null, well, anything can be null.

Java programmers risk carpal tunnel from writing

if (str == null || str.equals("")) {
}

It’s such a common idiom that C# added string.IsNullOrEmpty:

if (string.IsNullOrEmpty(str)) {
}

Abhorrent.

Every time you write code that conflates null strings and empty strings, the Guava team weeps.
– Google Guava

Well said. But when your type system (e.g. Java’s or C#’s) allows NULL everywhere, you cannot reliably exclude the possibility of NULL, and it’s nearly inevitable that it will wind up conflated somewhere.

The ubiquitous possibility of null posed such a problem that Java 8 added support for type annotations like @NonNull (via JSR 308) to try to retroactively patch this flaw in its type system.
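A common defense in null-everywhere languages is to reject null at API boundaries so the conflation cannot spread inward. A minimal sketch using the JDK’s Objects.requireNonNull (the normalize method here is invented for illustration):

```java
import java.util.Objects;

public class FailFast {
    // Reject null at the boundary instead of letting it leak inward,
    // where the eventual NPE would be far from the actual bug.
    static String normalize(String name) {
        Objects.requireNonNull(name, "name must not be null");
        return name.trim();
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Bob ")); // Bob
    }
}
```

This fails fast with a descriptive message at the call site, rather than with a mystery NPE several frames later.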

3. NULL is a special case

Given that NULL functions as a value that is not a value, NULL naturally becomes the subject of various forms of special treatment.

Pointers

For example, consider this C++:

char c = 'A';
char *myChar = &c;
std::cout << *myChar << std::endl;

myChar is a char *, meaning that it is a pointer—i.e. the memory address—to a char. The compiler verifies this. Therefore, the following is invalid:

char *myChar = 123; // compile error
std::cout << *myChar << std::endl;

Since 123 is not guaranteed to be the address of a char, compilation fails. However, if we change the number to 0 (which is NULL in C++), the compiler accepts it:

char *myChar = 0;
std::cout << *myChar << std::endl; // runtime error

As with 123, NULL is not actually the address of a char. Yet this time the compiler permits it, because 0 (NULL) is a special case.

Strings

Yet another special case appears in C’s null-terminated strings. This is a bit different from the other examples, as there are no pointers or references involved. But the idea of a value that is not a value is still present, in the form of a char that is not a char.

A C-string is a sequence of bytes, whose end is marked by the NUL (0) byte.

 76 117  99 105 100  32  83 111 102 116 119  97 114 101  0
 L   u   c   i   d       S   o   f   t   w   a   r   e  NUL

Thus, each character of a C-string can be any of the 256 possible bytes, except 0 (the NUL character). Not only does this make computing string length a linear-time operation; even worse, it means that C-strings cannot be used for ASCII or extended ASCII. Instead, they can only be used for the unusual ASCIIZ.

This exception for a singular NUL character has caused innumerable errors: API weirdness, security vulnerabilities, and buffer overflows.

NULL is the worst mistake in computer science; more specifically, NUL-terminated strings are the most expensive one-byte mistake.

4. NULL makes poor APIs

For the next example, we will journey to the land of dynamically-typed languages, where NULL will again prove to be a terrible mistake.

Key-value store

Suppose we create a Ruby class that acts as a key-value store. This may be a cache, an interface for a key-value database, etc. We’ll make the general-purpose API simple:

class Store
    ##
    # associate key with value
    # 
    def set(key, value)
        ...
    end

    ##
    # get value associated with key, or return nil if there is no such key
    #
    def get(key)
        ...
    end
end

We can imagine an analog in many languages (Python, JavaScript, Java, C#, etc.).

Now suppose our program has a slow or resource-intensive way of finding out someone’s phone number—perhaps by contacting a web service.

To improve performance, we’ll use a local Store as a cache, mapping a person’s name to his phone number.

store = Store.new()
store.set('Bob', '801-555-5555')
store.get('Bob') # returns '801-555-5555', which is Bob’s number
store.get('Alice') # returns nil, since it does not have Alice

However, some people won’t have phone numbers (i.e. their phone number is nil). We’ll still cache that information, so we don’t have to repopulate it later.

store = Store.new()
store.set('Ted', nil) # Ted has no phone number
store.get('Ted') # returns nil, since Ted does not have a phone number

But now the meaning of our result is ambiguous! It could mean:

  1. the person does not exist in the cache (Alice)
  2. the person exists in the cache but does not have a phone number (Ted)

One circumstance requires an expensive recomputation; the other, an instantaneous answer. But our code cannot distinguish between the two.

In real code, situations like this come up frequently, in complex and subtle ways. Thus, simple, generic APIs can suddenly become special-cased, confusing sources of sloppy nullish behavior.

Patching the Store class with a contains() method might help, but this introduces redundant lookups, reducing performance and inviting race conditions.
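Java’s built-in Map API exhibits exactly the same ambiguity. The sketch below mirrors the Ruby Store above: get returns null both for a missing key and for a key stored with a null value.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheAmbiguity {
    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("Bob", "801-555-5555");
        cache.put("Ted", null); // Ted is cached, with no phone number

        // Both lookups return null; the two cases are indistinguishable.
        System.out.println(cache.get("Ted"));   // null (cached, no number)
        System.out.println(cache.get("Alice")); // null (not cached at all)

        // containsKey disambiguates, but requires a second lookup,
        // which costs performance and invites race conditions.
        System.out.println(cache.containsKey("Ted"));   // true
        System.out.println(cache.containsKey("Alice")); // false
    }
}
```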

Double trouble

JavaScript has this same issue, but with every single object.
If a property of an object doesn’t exist, JS returns a value to indicate the absence. The designers of JavaScript could have chosen this value to be null.

But instead they worried about cases where the property exists and is set to the value null. In a stroke of ungenius, JavaScript added undefined to distinguish a null property from a non-existent one.

But what if the property exists, and is set to the value undefined? Oddly, JavaScript stops here, and there is no uberundefined.

Thus JavaScript wound up with not one, but two forms of NULL.

5. NULL exacerbates poor language decisions

Java silently converts between primitive types and their boxed reference types (autoboxing and unboxing). Add in null, and things get even weirder.

For example, this does not compile:

int x = null; // compile error

This does compile:

Integer i = null;
int x = i; // runtime error

though it throws a NullPointerException when run.

It’s bad enough that member methods can be called on null; it’s even worse when you never even see the method being called.
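Put together as a runnable sketch: the invisible call is i.intValue(), which the compiler inserts during unboxing.

```java
public class UnboxingNull {
    public static void main(String[] args) {
        Integer i = null; // compiles: null inhabits every reference type

        try {
            // Compiles to i.intValue(), a hidden method call on null.
            int x = i;
            System.out.println(x);
        } catch (NullPointerException e) {
            System.out.println("NPE from a method call we never wrote");
        }
    }
}
```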

6. NULL is difficult to debug

C++ is a great example of how troublesome NULL can be. Calling member functions on a NULL pointer won’t necessarily crash the program. It’s much worse: it might crash the program.

#include <iostream>
struct Foo {
    int x;
    void bar() {
        std::cout << "La la la" << std::endl;
    }
    void baz() {
        std::cout << x << std::endl;
    }
};
int main() {
    Foo *foo = NULL;
    foo->bar(); // okay
    foo->baz(); // crash
}

When I compile this with gcc, the first call succeeds; the second call fails.

Why? foo->bar() is resolved at compile time, so the compiler skips any runtime vtable lookup and transforms the call into something like the static call Foo_bar(foo), with this as the first argument. Since bar never dereferences that NULL this pointer, the call succeeds. But baz dereferences it to read x, which causes a segmentation fault.

But suppose instead we had made bar virtual. This means that its implementation may be overridden by a subclass.

    ...
    virtual void bar() {
    ...

As a virtual function, foo->bar() does a vtable lookup for the runtime type of foo, in case bar() has been overridden. Since foo is NULL, the program now crashes at foo->bar() instead, all because we made a function virtual.

int main() {
    Foo *foo = NULL;
    foo->bar(); // crash
    foo->baz();
}

NULL has made debugging this code extraordinarily difficult and unintuitive for the programmer of main.

Granted, dereferencing NULL is undefined by the C++ standard, so technically we shouldn’t be surprised by whatever happened. Still, this is a non-pathological, common, very simple, real-world example of one of the many ways NULL can be capricious in practice.

7. NULL is non-composable

Programming languages are built around composability: the ability to apply one abstraction to another abstraction. This is perhaps the single most important feature of any language, library, framework, paradigm, API, or design pattern: the ability to be used orthogonally with other features.

In fact, composability is really the fundamental issue behind many of these problems. For example, the Store API returning nil for non-existent keys was not composable with storing nil for non-existent phone numbers.

C# addresses some problems of NULL with Nullable<T>. You can include the optionality (nullability) in the type.

int a = 1;     // integer
int? b = 2;    // optional integer that exists
int? c = null; // optional integer that does not exist

But it suffers from a critical flaw: Nullable<T> cannot be applied to just any T. It can only be applied to non-nullable value types. For example, it doesn’t make the Store problem any better:

  1. string is nullable to begin with; you cannot make a non-nullable string
  2. Even if string were non-nullable, thus making string? possible, you still wouldn’t be able to disambiguate the situation. There isn’t a string??
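By contrast, a wrapper that can contain any type, including itself, does compose. A sketch in Java using its Optional (one of the Maybe/Option types discussed in the next section) of the two-level distinction that string? cannot express:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NestedOptions {
    public static void main(String[] args) {
        // The inner Optional: does this person have a phone number?
        Map<String, Optional<String>> cache = new HashMap<>();
        cache.put("Bob", Optional.of("801-555-5555"));
        cache.put("Ted", Optional.empty()); // cached: no phone number

        // The outer Optional: is this person in the cache at all?
        Optional<Optional<String>> ted = Optional.ofNullable(cache.get("Ted"));
        Optional<Optional<String>> alice = Optional.ofNullable(cache.get("Alice"));

        System.out.println(ted.isPresent());       // true:  Ted is cached
        System.out.println(ted.get().isPresent()); // false: no number
        System.out.println(alice.isPresent());     // false: Alice is not cached
    }
}
```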

The solution

NULL has become so pervasive that many just assume that it’s necessary. We’ve had it for so long in so many low- and high-level languages, it seems essential, like integer arithmetic or I/O.

Not so! You can have an entire programming language without NULL. The problem with NULL is that it is a non-value value, a sentinel, a special case that was lumped in with everything else.

Instead, we need an entity that contains information about (1) whether it contains a value and (2) the contained value, if it exists. And it should be able to “contain” any type. This is the idea of Haskell’s Maybe, Java’s Optional, Swift’s Optional, etc.

For example, in Scala, Some[T] holds a value of type T. None holds no value. These are the two subtypes of Option[T], which may or may not hold a value.

Option[T], Some[T], None

The reader unfamiliar with Maybes/Options may think we have substituted one form of absence (NULL) for another form of absence (None). But there is a difference — subtle, but crucially important.

In a statically typed language, you cannot bypass the type system by substituting a None for any value. A None can only be used where we expected an Option. Optionality is explicitly represented in the type.

And in dynamically typed languages, you cannot confuse the usage of Maybes/Options and the contained values.

Let’s revisit the earlier Store, but this time using ruby-possibly. The Store class returns Some with the value if it exists, and a None if it does not. And for phone numbers, Some is for a phone number, and None is for no phone number. Thus there are two levels of existence/non-existence: the outer Maybe indicates presence in the Store; the inner Maybe indicates the presence of the phone number for that name. We have successfully composed the Maybes, something we could not do with nil.

cache = Store.new()
cache.set('Bob', Some('801-555-5555'))
cache.set('Tom', None())

bob_phone = cache.get('Bob')
bob_phone.is_some # true, Bob is in cache
bob_phone.get.is_some # true, Bob has a phone number
bob_phone.get.get # '801-555-5555'

alice_phone = cache.get('Alice')
alice_phone.is_some # false, Alice is not in cache

tom_phone = cache.get('Tom')
tom_phone.is_some # true, Tom is in cache
tom_phone.get.is_some # false, Tom does not have a phone number

The essential difference is that there is no more union, statically typed or dynamically assumed, between NULL and every other type; no more nonsensical union between a present value and an absence.

Manipulating Maybes/Options

Let’s continue with more examples of non-NULL code. Suppose in Java 8+, we have an integer that may or may not exist, and if it does exist, we print it.

Optional<Integer> option = ...
if (option.isPresent()) {
   System.out.println(option.get());
}

This is good. But most Maybe/Optional implementations, including Java’s, support an even better functional approach:

option.ifPresent(x -> System.out.println(x));
// or option.ifPresent(System.out::println)

Not only is this functional style more succinct, but it is also a little safer. Remember that option.get() will produce an error if the value is not present. In the earlier example, the get() was guarded by an if; here, ifPresent() obviates the need for get() at all. It makes there obviously be no bug, rather than no obvious bugs.

Options can be thought of as a collection with a maximum size of 1. For example, we can double the value if it exists, or leave it empty otherwise.

option.map(x -> 2 * x)

We can optionally perform an operation that returns an optional value, and “flatten” the result.

option.flatMap(x -> methodReturningOptional(x))

We can provide a default value if none exists:

option.orElse(5)

In summary, the real value of Maybe/Option is

  1. reducing unsafe assumptions about what values “exist” and which do not
  2. making it easy to safely operate on optional data
  3. explicitly declaring any unsafe existence assumptions (e.g. with a .get() method)
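These operations chain naturally. A sketch in Java (findPhone is an invented lookup that may or may not find a number):

```java
import java.util.Optional;

public class OptionPipeline {
    // An invented lookup that may or may not find a number.
    static Optional<String> findPhone(String name) {
        return "Bob".equals(name)
            ? Optional.of("801-555-5555")
            : Optional.empty();
    }

    public static void main(String[] args) {
        // map, flatMap, and orElse compose without a single null check.
        String bob = Optional.of("Bob")
            .flatMap(OptionPipeline::findPhone)
            .map(s -> "tel:" + s)
            .orElse("no number");
        String eve = Optional.of("Eve")
            .flatMap(OptionPipeline::findPhone)
            .map(s -> "tel:" + s)
            .orElse("no number");

        System.out.println(bob); // tel:801-555-5555
        System.out.println(eve); // no number
    }
}
```

The absent case flows through the whole pipeline automatically; only the final orElse decides what absence means.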

Down with NULL!

NULL is a terrible design flaw, one that continues to cause constant, immeasurable pain. Only a few languages have managed to avoid its terror.

If you do choose a language with NULL, at least possess the wisdom to avoid such awfulness in your own code and use the equivalent Maybe/Option.

NULL in common languages:

Language | NULL | Maybe
C | NULL |
C++ | NULL | boost::optional, from Boost.Optional
C# | null |
Clojure | nil | java.util.Optional
Common Lisp | nil | maybe, from cl-monad-macros
F# | null | Core.Option
Go | nil |
Groovy | null | java.util.Optional
Haskell | | Maybe
Java | null | java.util.Optional
JavaScript (ECMAScript) | null, undefined | Maybe, from npm maybe
Objective-C | nil, Nil, NULL, NSNull | Maybe, from SVMaybe
OCaml | | option
Perl | undef |
PHP | NULL | Maybe, from monad-php
Python | None | Maybe, from PyMonad
Ruby | nil | Maybe, from ruby-possibly
Rust | | Option
Scala | null | scala.Option
Standard ML | | option
Swift | | Optional
Visual Basic | Nothing |

“Scores” are according to:

★★★★★ Does not have NULL.
★★★★ Has NULL. Has an alternative in the language or standard libraries.
★★★ Has NULL. Has an alternative in a community library.
★★ Has NULL.
★ Programmer’s worst nightmare: multiple NULLs.

Edits

Ratings

Don’t take the “ratings” too seriously. The real point is to summarize the state of NULL in various languages and show alternatives to NULL, not to rank languages generally.

The info for a few languages has been corrected. Some languages have some sort of null pointer for compatibility reasons with runtimes, but they aren’t really usable in the language itself.

  • Example: Haskell’s Foreign.Ptr.nullPtr is used for FFI (Foreign Function Interface), for marshalling values to and from Haskell.
  • Example: Swift’s UnsafePointer must be used with unsafeUnwrap or !.
  • Counter-example: Scala, while idiomatically avoiding null, still treats null the same as Java, for increased interop. val x: String = null

When is NULL okay?

It deserves mention that a special sentinel value of the same size, like 0 or NULL, can be useful for cutting CPU cycles, trading code quality for performance. This is handy in low-level languages like C, when it really matters, but it should be left there.

The REAL problem

The more general issue with NULL is that of sentinel values: values that are treated the same as others, but which have entirely different semantics. Returning either an integer index or the integer -1 from indexOf is a good example. NUL-terminated strings are another. This post focuses mostly on NULL, given its ubiquity and real-world effects, but just as Sauron is a mere servant of Morgoth, so too is NULL a mere manifestation of the underlying problem of sentinels.


95 Comments

  1. It sounds like the problem is not with NULL But with the C/C++/Java-style implementation of it.

    For example, Common Lisp has NIL (not “nil”), which you call CL’s NULL value, but it’s very different from the similarly named C/C++/Java concept. #3 and #5 don’t even make sense in CL. I don’t think #6 or #7 apply, either. #1 and #2 don’t really mean much, either, since CL doesn’t make any promises about types, unless you explicitly assert them (in which case you can easily assert non-NIL-ness, too).

    What’s left? “NULL makes poor APIs.” Hmm, maybe. As a Lisp programmer, though, it doesn’t seem so bad, since none of those other problems really exist for us. Besides, we have multiple return values, which helps avoid many of the API problems you cite. Just as NULL “exacerbates poor language decisions”, its trouble can be muted by *good* language design decisions.

    In philosophical terms, NULL is a weird case, but in practical terms, it’s not really a problem for me (as long as I stay outside of the C/C++/Java world). On a daily basis, I’m much more troubled that, say, exception objects aren’t class instances, or that defstruct pollutes my namespace.

  2. In the Java section,

    if (str == null || str == "") {
    }

    doesn’t do what you think it does. In fact, I don’t think that second clause can ever be true b/c it’s testing if it’s the same object as the empty string object you’re just now instantiating inline. Needs to be:

    if (str == null || str.equals("")) {

    }

    But the “str == null” does what you want! Which I guess proves the main point about null muddying the waters yet again. (In Java, null works for physical equality in places where you’re trying to use equality by value.)

  3. Brendan Zabarauskas, August 31, 2015 at 3:31 pm

    Great article, but by your standards, Haskell should also be four stars, because it has Foreign.Ptr.nullPtr, which is basically like Rust’s std::ptr::null, and basically just used for FFI bindings. So either Rust should be 5 stars, or Haskell should be 4.

  4. I can’t say for certain about the other languages, but the Python “None” value is NOT the same as null. None has a value (None), it has a type (NoneType), it can be introspected, it can be composed, and so on.

    Of course, it doesn’t do very much, but it’s semantically similar to lots of other small classes that don’t do much.

  5. Justin Heyes-Jones, August 31, 2015 at 3:53 pm

    Nice exposition.

    BTW there’s a typo here:

    “Options can be regardless as a collection”

    Should be “regarded” I think.

    J

  6. Rust should get 5 stars since you have to use `unsafe` in order to use `std::ptr::null`, and in general it is discouraged to use `unsafe` if you’re not interacting with other C APIs.

  7. Paul Draper, August 31, 2015 at 4:21 pm

    Peter, Brendan, thanks for the corrections.
    It’s been fixed 🙂

  8. I think you’re being too harsh on Objective-C there. nil, Nil, and NULL are all the same null (differentiated solely for contextual readability), and NSNull isn’t a null at all. Messages to nil/Nil/NULL are no-ops if they don’t return anything, and return nil/Nil/NULL, 0, 0.0(f) or zeroed structs if they do.

    Behaviorally, NSNull is more similar to classical trapping nulls (you get a runtime exception trying to message it anything it doesn’t respond to, which is better defined than in C), but it’s used where a nil/Nil/NULL can’t be used. In the Store example above, Bob’s phone number could be set to [NSNull null] rather than nil. And it is composable in the other direction that you can have an NSNull property whose value is nil. Given that Objective-C is a dynamic language that barely pretends to be static (replace all your object pointer types by id, and everything will still compile and run fine, with maybe a few more warnings), that’s pretty good.

  9. Kotlin and Ceylon are some neat looking up and coming languages that handle empty values in a nice way – possibly worth adding to your list at the end there to highlight good alternate approaches.

  10. Go’s nil is somewhat different to C-style NULL: it is the zero value for certain types — it is to these types as 0 is to int and “” is to string.

    As such, it has none of the issues described in the article and is a different beast entirely.

  11. Joe Swinbank, August 31, 2015 at 6:15 pm

    “Optionals” are already built into C:
    void f( int required, int* optional ).

  12. > Rust should get 5 stars since you have to use `unsafe` in order to use `std::ptr::null`, and in general it is discouraged to use `unsafe` if you’re not interacting with other C APIs.

    Swift should get 5 stars too, for the same reasons. UnsafePointer is only for C interop, and native Swift types aren’t nullable.

  13. The worst mistake in computer science has been the use of “Foo” and “Bar” as example classes/methods instead of a more natural pattern.

  14. Great post — just a small nitpick. The more idiomatic way to write C# is:


    if (String.IsNullOrEmpty(str)) {
    }

  15. This article and its list of languages is woefully incomplete without a discussion or sub-discussion of SQL and its use of NULLs and the intersection of that with the languages discussed.

  16. I think, to be fair, that C# should get a 3 stars, considering there is librairies out there, that allow you to emulate an equivalent of “Maybe” to avoid the null value in your programs 🙂

    Regardless, great article.

  17. Thanks for the interesting article! You caused me to add two new words to 8th (http://8th-dev.com) to check for existence of array or object keys…

  18. Frode Nilsen, September 1, 2015 at 3:29 am

    C# should have 4 stars with the Nullable operator https://msdn.microsoft.com/en-us/library/2cf62fcy.aspx

    Also, in C# 6 you can avoid NullReferenceExceptions elegantly with the null propagation operator http://davefancher.com/2014/08/14/c-6-0-null-propagation-operator/

    This makes C# one of the better languages to work with NULLs in

  19. Peter: you’re wrong about that string comparison. String literals in java are automatically interned, so eg “”==”” will work – you don’t get two separate instances. str==”” will work if str is an interned string (ie it was explicitly interned, or was assigned from a literal). JLS 3.10.5:

    Moreover, a string literal always refers to the same instance of class String. This is because string literals – or, more generally, strings that are the values of constant expressions (§15.28) – are “interned” so as to share unique instances, using the method String.intern.

  20. C# should have 4 stars as it has support for your proposed solution (since .NET version 2.0… which came out in 2005) via the Nullable struct.

  21. Mohit Keswani, September 1, 2015 at 12:45 pm

    Great article and how Option or Guava library can solve it.

  22. The title is misleading to attract traffic. NULL has nothing to do with computer science.

  23. That intro would have killed my ability to read the whole article except each pain point cut deeper (and was more obscure) than that last. How long did you sort those until you got the perfect order?

    If anyone is up to the challenge of rewriting the fundamentals AND having them adopted sans NULL, it would have to be Paul. 🙂

    Just the kind of read I would expect coming out of Lucid’s dev team.

  24. “confusing sources of sloppy nullish behavior” – my favorite

  25. The problem is that null will never go away as an implementation detail, at least in my world of high performance coding. In C/C++, there’s no good way to implement optional types without RTTI (which is a terrible idea) or carrying around extra state in a tagged union or special type. With null pointers, an address is the same size as a register and can be tested with simple instructions. The need for simplicity only gets worse as you move down to assembly.

    Although that being said, we still use things that look awfully like optional types in high level APIs. In a project I’m working on, expected errors (we dont use exceptions) get returned in a struct that looks like this:

    struct RetVal
    {
    enum Error { None, ResourceNotFound, etc };
    Ret* val;
    }

    It just happens to be in a form more suited to actually execute on computers.

  26. Omar De La Rosa, September 2, 2015 at 7:33 am

    Excelent Post.

  27. Philip Oakley, September 2, 2015 at 7:54 am

    Null in this context has the same problem as infinity does for mathematics (e.g. the set of integers (0, 1, …, N, N+1, inf).

    It’s a special case that shouldn’t exist based on one set of rules, but is felt essential based on a wider rules.

    Null is just the limit of 1/N as N tends to infinity; it’s sort of zero, but not actually zero. (see http://web.maths.unsw.edu.au/~norman/papers/SetTheory.pdf for the mathematicians side of the coin)

  28. FYI NULL in “C” is not a value but a character it’s from the ASCII character set. When you hit the “NULL” key the terminal sends:
    START BIT
    THE VALUE 0
    STOP BIT

    In comparison if you hit the “ZERO” key the terminal sends:
    START BIT
    THE VALUE 48
    STOP BIT

    Because the NULL character has a Boolean value of false it is very easy to detect the end of the string:
    if(!string(x)) // indicate end of string

    also CR(\r) and LF(\n) are often used as string terminator they are not. They can be used anywhere withing a string they are simply control for the display or printer to move the cursor. { note: the \n should read “next line” and not “new line”}

    In “C” the compiler use the “;” as the end of string , this allows multiple string: per physical line:
    ex- a=1; b=3; c=a+b; if(c==3) c=0;

    or you can have a string spanning many physical line:
    ex- typedef structure NewType
    {
    int alpha, beta,
    long foo,
    floating bar
    };

    bottom line is it’s up to the developer to decide what he will use as an end of line/string character.

  29. Hello Paul for reasons obvious to me,
    I don’t agree with your NULL point of view

    The text found on this web page saves me a lot of typing why NULL is essential.
    http://everything2.com/title/null

    void <- Your examples with C++ and C, because these are unstructured languages, use only when there is no other option e.g. low level development and then even reconsider.

    void <- Your examples in Javascript. Javascript is a stinking unstructured pest, making web programming a real nightmare, please replace e.g. with Dart or Python. Your surely can't be using Javascript to demonstrate proper NULL handling?

    Algol W -anyone still using this language?- has NULL and appropriately so.
    Carpal tunnes syndrome? Write a macro to combine NULL/Empty conditions.

    How would you like a (relational) database system without NULLs? For example: is a zero value in a column AMOUNT == 0 or has it not (yet) been entered?

    btw: null (optionals) treatment in iOS/OSX Swift is excellent.

    Kind Regards
    Ted

  30. I wonder if behind the scenes the option concept is implemented in is_some (or whatever) by testing the this pointer against null?

    Two points of substance:

    1. The repetitive stress disorder problem is not solved. We still have to test against is_some to decide how to process an instance.

    2. The phone book illustration suffers from another implementation error (of which null is often used in special-case work-arounds) and that is to INTERPRET AN ASSIGNED VALUE AS A RESULT CODE. This is a far more frequent abuse of type safety than Hoare’s NULL, and in any non-trivial environment makes exception handling a nightmare. In most of the languages listed, proper practice is to pass the target by reference (i.e. as an OUT parameter) and return a specific result code. A failed method should never modify the caller’s context.

  31. Yves:
    Not exactly. “NUL” (one L) is the ASCII character, while “NULL” (two Ls) isn’t actually defined IN C (at least not in C89 — I haven’t tracked it lately), but is general defined by implementations as a pointer (16- or 32-bits) with the binary value of zero.

  32. Ted, that link says

    A vitally important part of nearly any programming language. While computer programs normally think in terms of particular values, sometimes you need to express the lack of a value.
    This sort of effect can be kludged via magic numbers like -1 or other arbitary values, but it’s better if your language of choice does it internally, eliminating any possible ambiguity.

    I agree completely with that. My point is that NULL is also such a “kludgy” value, that causes semantics to contradict types (whether dynamic or static). Maybes/Options are the “non-kludgy” solution to optionality.

  33. Mike McInnis, September 2, 2015 at 1:35 pm

    I agree with the comment referring to SQL. Checking for null is inherent to looking for populated fields coming from databases.

  34. Fred Daggwood, September 2, 2015 at 2:36 pm

    The issue seems to me not the use of of NULL but the absence of nullable and non nullable types.

    My argument is something like:

    At times nullable types are needed (for speed mostly but not only)

    For type safety non nullable types are needed.

    Then we can impose compile time rules

    NullableThingy = NULL; //ok
    NonNullableThingy = NULL; // Not ok!
    NullableThingy = NullableThingy; // safe
    NonNullableThingy = NonNullableThingy // safe
    NullableThingy = NonNullibleThingy; // safe
    NonNullibleThingy; = NullibleThingy; // Not safe!!

    The last is still needed, but needs a different syntax that forces the programmer to write something for “If it is NULL then …”
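    Under strict null checking, TypeScript enforces precisely these rules; a minimal sketch (TypeScript is used here only for illustration, with strictNullChecks assumed on, and `Thingy` is a hypothetical value type):

```typescript
// TypeScript with strictNullChecks models the commenter's rules directly.
type Thingy = { id: number };

let nullableThingy: Thingy | null = null;    // ok
// let bad: Thingy = null;                   // Not ok! (compile error)
let nonNullableThingy: Thingy = { id: 1 };

nullableThingy = nullableThingy;             // safe
nonNullableThingy = nonNullableThingy;       // safe
nullableThingy = nonNullableThingy;          // safe (widening)
// nonNullableThingy = nullableThingy;       // Not safe!! (compile error)

// The "different syntax" the commenter asks for is just a null check,
// after which the compiler narrows the type and permits the assignment:
if (nullableThingy !== null) {
  nonNullableThingy = nullableThingy;
}
```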

    The better solution(s) are

    1) To have non-nullable types as primitives and add the ability to overload as you suggest

    2) To have nullable types and add the ability to overload for a non-nullable

    Sadly in most languages the primitive types do not have a NULL, for a very good reason, and do not allow the needed overloading of primitive types (for no good reason that I can see).

    I could also get into the argument of whether type safety is always a productivity gain or not. My opinion/experience is that type safety is useful sometimes and not useful sometimes. I get tired of type safety being intrinsic to a language/compiler and assumed by people to always be a win.

  35. My biggest C# pain is when you use Entity Framework to retrieve a single object from the data store. Consider this

    var x = myentity.FirstOrDefault(e => e.key == someval);

    Now, after this is called you have to do a
    if (x != null) {
    }
    every single time;
    otherwise any call to check a property causes null exceptions. If you accidentally miss this, boom, your program crashes. While the C# 6 ?. notation will help, it’s still not a great way to deal with it. Call me naive, but life would be easier if there was a fallback where, if the lambda doesn’t match any records, an object is returned that gets its values from the default constructor. If a string is null, then default the string to string.Empty to avoid a null exception when you try to trim it.
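    For comparison, TypeScript’s Array.find has the same miss-returns-nothing shape, but the compiler forces the check, and ?? supplies the wished-for fallback. A sketch with invented `Entity` data, not Entity Framework:

```typescript
interface Entity { key: number; name: string | null }

const entities: Entity[] = [{ key: 1, name: null }];
const someval = 2;

// find() returns Entity | undefined; under strictNullChecks the compiler
// refuses `x.name` until the undefined case is handled.
const x = entities.find(e => e.key === someval);

// The commenter's wished-for fallback: a default-constructed object...
const safe: Entity = x ?? { key: 0, name: "" };

// ...and defaulting a null string to "" before trimming it:
const trimmed = (safe.name ?? "").trim();
```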

  36. Interesting article, with a good overview of a lot of popular languages…

    You didn’t stress that most modern languages, like Rust, Ceylon, and Kotlin, are aware of the issue and try, in various ways, to address it.

    Rust eliminated null entirely, although I suppose it has to deal with it when interoperating with C; I guess it is restricted in an unsafe layer.
    JVM or JavaScript based languages have to deal with it, for interoperability reasons.
    But null isn’t so bad if it is tamed, it can still be used to signal absence of value.
    Ceylon solves the issue quite elegantly, putting it in the Null type, and forcing you to declare explicitly if a type is nullable, by using a union type:
    Integer cannot be null, and the compiler will complain if you try to put a null there; and variables must be explicitly initialized.
    Now, you can declare a return type, a parameter type or a variable as Integer | Null (abbreviated as Integer? since it is a common case, and it shows clearly the optional nature of the value), allowing you to set it to null to mark absence of a value (eg. when parsing an incorrect string value; better than throwing an exception, which is too verbose and a performance hit).
    The nice touch is that the compiler is aware of such types, and forces you to test that a value is not null before allowing you to use it. So, no NPE for Ceylon!
    Even nicer, only one test is needed in the code path following the test: the compiler knows the value is not null and therefore won’t force you to do the check on each usage.
    I find this solution very elegant, and since it is built into the language from the start, there is no risk of forgetting to wrap a value in an Option type…
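    The Ceylon scheme described here happens to match how TypeScript’s strictNullChecks behave; a TypeScript sketch (not Ceylon) of the union type and the single-check narrowing:

```typescript
// Analogue of Ceylon's Integer | Null (written Integer?): a union type.
function parseAge(s: string): number | null {
  const n = Number.parseInt(s, 10);
  return Number.isNaN(n) ? null : n;   // null marks absence of a value
}

const age = parseAge("42");
// const bad = age * 2;   // compile error: age might be null
let doubled = 0;
if (age !== null) {
  // One test is enough: the compiler narrows `age` to number here,
  // so later uses in this block need no further check.
  doubled = age * 2;
}
```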

  37. In addition to seconding Dan D’s comment about Objective-C not being as bad as it looks, I’d like to add that the coverage of Swift is still a little bit confused.

    The ! (unwrapping) operator is basically equivalent to the .get() method of Java 8’s Optional, only more compact and usable as an lvalue. unsafeUnwrap() is equivalent to ! except that it bypasses the actual nil check and simply assumes it has been passed an Optional.Some; you would only use it in extremely performance-sensitive code. (Hence the “unsafe” in the name.) Both of these are used with the Optional type (and its cousin, the slightly more dangerous ImplicitlyUnwrappedOptional).

    The various species of UnsafePointer can also be nil, but this has nothing to do with unwrapping, and UnsafePointer is a low-level feature that shouldn’t be used very often, only a step or two above the FFI stuff you excuse in other languages. They also allow pointer arithmetic, accessing unallocated memory, leaking allocated memory, and other dangerous shenanigans. Like unsafeUnwrap(), they’re explicitly marked “unsafe” for a good reason.

  38. Your comment about the nul character in C is out of place here.

    Null, as you describe it, is a value that is not a value. So the C pointer value NULL[1] is not a pointer, and that’s a good example of what you’re on about.

    But a char is not a pointer. A char is basically an integer type[2] of a particular size. For char values from 0 to 31, the values have names in the ASCII set. The value 0 is called nul – and it should be spelt so, with one ‘l’. But it’s just one normal value out of the set of values that a char can have. The originators of the C runtime library chose the character value 0 as the sentinel value for strings. They could have chosen any other value (but 0 was the best choice for many reasons). But the nul char is definitely not “a value that is not a value”. It’s just a value.

    Flick

    [1] For pointers: NULL and 0 may be interchangeable as pointer values in the C source code, but the compiler will have to do some fancy footwork if the address 0 could actually be a valid pointer value in a normal program… fortunately, in most architectures, it’s used by the hardware for vectors or whatever, and cannot be a valid address for data. So, yes, the value 0 is a null value for a pointer to char. But for the char itself, 0 is just another value. Pointers are not chars, and chars are not pointers.

    [2] and yes, I know that people sometimes have to put the (non-zero) integer value of an address into a pointer. But integer types are not pointers. it’s just that, on most architectures, such casting is possible.

  39. I enjoyed reading this, perhaps because I agree with the position. I would love for him to give UTF-16 the same send-up as NULL. I’m skeptical that I would agree on that one, but it would be a good read nonetheless.

  40. I agree with Ted.

    We could write examples in any language that would fail in some way, and testing a pointer before using it is C/C++ 101 – no self-respecting developer I know would admit to writing such poor code. Further – I would actually want an exception to be thrown in that case, so that I could identify the root cause of misbehaving software, which is clearly defective code.

    C++ is a powerful language because of its versatility, which means it’s not for everyone. For everyone else there’s VBA.

  41. For the sake of clarity: I see in your table that you present C++’s NULL as…. NULL, but NULL has not been recommended practice in C++ for a _long_ time (close to two decades now).

    NULL is a macro which does behave correctly in C but does not work nearly as well in C++. The best that could be done with a C++ NULL is to map it to literal 0.

    As such, in C++, we’ve used 0 for a while, and have been using nullptr since C++11 (four years now), as 0 worked OK but led to some annoying ambiguities.

    Cheers!

  42. Thanks for this in-depth NULL handling.

    As a Ruby dev I cannot count the times I wrote nil checks. Still, I would only add NullObjects / typed Nones in specific cases, as they seem to be a tad too much work in most cases.

    I would like to see Erlang and Elixir in the table. And to read your opinion about Erlang not having one real NULL while Elixir adds the :nil atom to the language.

  43. […] Translated by SamLin of 伯乐在线, proofread by 黄利民. Edited 2015-11-11. Reproduction without permission is prohibited! English source: Paul Draper. Welcome to join the translation team. […]

  44. […] Fitting for today’s Friday the 13th, I found a very good article on the creepiest mistake in computer science: The worst mistake of computer science. […]

  45. […] The worst mistake of computer science […]

  46. Really liked this article, but your rating chart is broken! When I first read it a few weeks ago, I could see the star ratings in the table, but they’re gone now.

  47. The BIGGEST mistake in CS is the modern movement of assuming it’s a good idea to automate memory management to accommodate mediocre computer science graduates who have never acquired the discipline to manually manage memory but instead rely on expensive and problem-prone devices like automatic garbage collection and automatic reference counting. The current spate of high-level languages that attempt to hide how the von Neumann architecture actually works might make it easier for novices to enter, but they end up making everyone jump through ridiculously unnecessary hoops when trying to leverage more efficient binaries written in “legacy” C or C++ simply because NO ONE EVEN KNOWS WHAT A POINTER IS ANYMORE. A zero is a zero, whether you call it a NULL, a nil, or a nullptr. If you can’t keep track of a pointer, you have no business calling yourself a programmer.

  48. […] The worst mistake of computer science at http://www.lucidchart.com Uglier than a Windows backslash, odder than ===, more common than PHP, more unfortunate than CORS, more disappointing than Java generics, more inconsistent than XMLHttpRequest, more confusing than a C preprocessor, flakier than MongoDB, and more regrettable than UTF-16, the worst mistake in computer science was introduced in 1965. […]

  49. […] was a blog post a while ago that declared that NULL was the worst ever mistake in history of computer science. The fact that nulls existed meant that code always has to be written to handle them, otherwise […]

  50. You should learn to read more carefully. Tony Hoare did not call “null” a mistake, he called *null references* a mistake. Null as a concept predates ALGOL W and is quite useful when called for, that is, when you need a value that is not a value. This much is clearly demonstrated by type systems that provide for optional values. Speaking of which, you’ve left out C#’s Nullable on your chart.

  51. […] The worst mistake in the field of computer science. Learn more about why not to use […]

  52. I can’t see any notice about Smalltalk, which has no null as a primitive.
    I can’t see any notice about treating the null problem with the Null Object pattern.

  53. Reinier Post on September 7, 2016 at 5:54 am

    The NULL reference problem is an instance of a more general problem: static types cannot adequately capture all types we really want to use.

    Our computations can be regarded as functions, taking values and producing values.
    Static typing is supposed to guarantee that functions aren’t called on invalid values.
    Ideally, for every function we ever use, we’d have a type for both its domain and its range.

    But functions aren’t always surjective or injective. The NULL problem is the case where a function produces an object of a certain type – but not always, hence it can produce NULL – and another function takes the result as input. Option/Maybe elegantly addresses that. But there are other cases. Many operations on numbers, for instance, are undefined in some cases (e.g. division by zero). So in programs that divide, we should really have a “nonzero integer” or “nonzero floating point number” type. But overflow cannot be dealt with in that way. So it is not possible to completely eliminate the problem in general.

  54. Does Ada have null?

  55. Scott Jones on October 28, 2016 at 2:31 am

    Julia is another new language which does not have an overarching NULL.
    It does have a type, `Void`, whose only possible value is `Void()` (also called `nothing`).
    You can also create things of the `Nullable{T}` type, which can hold a value of type T, or nothing, in a type-stable fashion.

  56. Writer is ignorant.

  57. Dmitry Pashkevich on January 9, 2017 at 3:56 pm

    Care to elaborate?

  58. JavaScript has a copy of the Java Optional type:

    https://github.com/JasonStorey/Optional.js

    Plus there are multiple other implementations in community libraries.

    Therefore, it seems like it should get 3 stars for “Has NULL. Has an alternative in a community library.”

  59. Dmitry Pashkevich on January 14, 2017 at 6:26 pm

    Hi Ben! While there are community libraries for the Optional pattern in JavaScript, it has multiple “NULL”s, hence it gets one star per the author’s rating system.

  60. Hi Paul!

    What do you think of the Crystal programming language?

    Crystal tries to check nil references at compile time.

    https://crystal-lang.org/2013/07/13/null-pointer-exception.html

    Thank you for the great post!

  61. […] Paul Draper reflects on NULL as the worst mistake of computer science […]

  62. catbreath on May 2, 2017 at 4:12 am

    C# having Nullable is kind of irrelevant to its score, because it does nothing to save you from the worst thing about C# – the NullReferenceException – because the type system has no way of expressing (or proving) that a reference cannot be NULL.

    You do get NULL propagation, but again the compiler can’t enforce that you use it, or that you use it correctly, or whether or not an expression using it will evaluate to a non-NULL value.

    This is why C# gets a low score.

  63. […] perhaps most importantly, it is immune to the billion-dollar mistake, the famous Null Pointer […]

  64. For JavaScript there are several options directly from npm: maybe, optionals-js, optional, optionals, …

  65. […] Option/Maybe type, for example, is the alternative to nullable references: something that’s been heavily criticized for making programs prone to unexpected NullReferenceExceptions. The idea behind both of those […]

  66. […] Now let’s get back to those error messages you are probably still seeing. When you call readline() there is a chance that it will return null as the value. Essentially in programming null means there is nothing there, and so if you attempt to do an operation on a null value your program will crash. The idea of null has been referenced as the billion dollar mistake (because null pointer exceptions cost companies a lot of money of money), you can read more about why null was a bad idea here. […]

  67. An Old Hacker on June 26, 2017 at 7:07 pm

    Ironically, while the documentation for Go decries the use of -1 as an error indicator in C, the recommended use of panic() / recover() is to check whether the panic value is nil. So, if an error that leads to a panic also fails to set the panic value…
    To fix:
    func f() (result int) {
        var nopanic = false
        defer func() {
            if !nopanic {
                panic_value = recover() // ...
            }
        }()
        // do stuff that might panic
        nopanic = true
        return return_value
    }

    Just be certain that return_value is not a computation that can panic!

    The intersection of database NULL and language nil is a rich source of surprising behavior. Just don’t.

    The example on Ruby is unfair. Certainly, bad programmers will do as indicated, but Hash#has_key? is the way to check that a key exists. Also be aware that Hash.new permits the creation of default values, and also the use of a block to compute the value of a missing key. Ruby does not just give you enough rope to hang yourself – it sets you down by yourself in the middle of a rope warehouse.
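    The has-key-versus-get distinction the commenter describes isn’t Ruby-specific; a TypeScript sketch of the same discipline with a Map (names invented):

```typescript
// The missing-key vs stored-null distinction, sketched with a Map.
const h = new Map<string, number | null>();
h.set("time", null);                 // a key whose stored value really is null

const a = h.get("time");             // null (the stored value)
const b = h.get("nothing");          // undefined (no such key)

// Like Ruby's Hash#has_key?, Map.has() is the way to check existence:
const hasTime = h.has("time");       // true
const hasNothing = h.has("nothing"); // false
```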

  68. Hilariously missing the point. You recommend using Optional while completely missing that the Option class stores a null value internally. Optional wouldn’t be possible without nulls, unless you hackily exploit an empty list (which is still, very deep down, a null-terminated array). Safely wrapping nulls doesn’t remove them from existence; it just makes them not your problem.

    I’m all for better null safety, much in the way Kotlin and Swift provide it, but they still HAVE nulls. Without nulls, optional fields are impossible to represent via object-oriented programming. Even wrapping your objects in an Optional class still relies on the existence of the null value internally…

    I really don’t get why people hate null so much. Just get better at writing unit tests, or use one of the many languages that provide better null safety.

  69. Djeefther Souza on September 8, 2017 at 10:58 am

    In well-written, idiomatic C++, NULL is really rare – needed only when we use dynamic allocation – because objects are by default stack-allocated, and references cannot refer to null. And of course, you could and should use Optional, or smart pointers with good exceptions for dereferencing nulls (not the standard ones).

  70. Nulls are perfectly fine, and asking for them to be removed from a language, or for a language to be designed without the concept of null, is both ridiculous and impossible. This is because all languages that “don’t” have a null have the Option or Maybe type, and they are just another incarnation of NULL. To put it another way, there is no difference between the Option/Maybe monad and NULL!

    Additionally, Maybe is a subclass of Either, specifically Either. A Maybe instance can Either be a value OR it can be null. All variables in the “{}” type languages (C, Java, etc.) don’t have the concept of a pure reference; instead every variable is a Maybe. The reference can either be an object/value OR it can be null.

    So please, don’t hate null. Null is extremely important. Hate the people that don’t understand that when you declare an int*, you are really declaring Either. If your compiler understood this distinction then you would never get a NullPointerException.

  71. Sir C.A.R. Hoare, a pioneer in Computer Science and co-researcher of Edsger Dijkstra (ALGOL) and Ole-Johan Dahl (Simula), had the courage to admit it – a billion-dollar mistake. Though the null reference was first proposed for object-oriented language implementations, it first became more visible in data processing when relational databases supported the concept of NULL. Until then, database implementations (IMS/IDMS etc.) were happy with 0 or spaces. Though philosophically intuitive and elegant, NULL in database implementations did not serve any real practical purpose at all. In fact, just like object-oriented language implementations, it gave rise to a variety of anomalies in SQL. The purpose of databases is to keep concrete information and not to play around with philosophy! In SQL, 1+0 = 1 but 1+NULL = NULL, and AVG(1,0,NULL) is 0.5! MIN(1,0,NULL) is 0, and so on. And the NULLs in databases, propagated to Java, gave rise to further troubles in programming.

    A seldom discussed issue is how the databases implement NULL behind the scenes! Many think that NULL does not take any space. In fact, each nullable field has an extra hidden field in the database system which keeps the information on null! When you use embedded SQL, one needs an extra null-indicator byte for every nullable field, and it needs programmatic checking! In most relational database implementations, NULL is the default, and one needs to specify NOT NULL in the definition to override that. I have seen many databases where nullable fields contain NULL without the application people really being aware of it, and the dire consequences!

    So 100% agree with Prof Hoare and the Young Author here – Paul Draper – that NULL indeed was a mistake. Huge mistake.

  72. Multiple return values from a subroutine, as in Go and some others, get around these problems nicely. One return value is the error-return, the other is the desired value, if any.

  73. Great post. I was checking constantly this weblog and I’m inspired!
    Extremely useful info specially the final phase 🙂
    I care for such information a lot. I was seeking this particular info for a long time.
    Thanks and best of luck.

  74. Reinier Post has the correct answer; however, few people know those mathematical terms. This blog post is promoting a so-called solution which actually is not the correct solution, and means lots of extra unnecessary work. The underlying problem is that most languages follow the computer hardware’s limitations too closely. An integer can only hold a numerical value, when in practice this is insufficient. There are simply too many instances where you want to store a numeric value and some other answers. Like when asking for someone’s age, you could have a number, or you could have “decline to state” as a valid answer. Most languages have null or undefined and a value as the only two kinds of numeric values. This is often not enough, and things get funky trying to reserve values or use -1 as meaning something special. Once you do some arithmetic on that -1 you suddenly change “decline to state” into a 3-year-old… Beads is the only language I know of besides Mathematica that can extend arithmetic, so that division by zero, infinity divided by infinity, etc. are all defined, not to mention all the special cases which are so nasty to program. None of the languages mentioned above – Kotlin, Go, Swift, Ruby, etc. – do it correctly, and it is the slavish following of tradition that is holding us back. Time for the liberation of arithmetic from the ancient hardware limitations.

  75. Arguably, Go should have one star since the typed nil interface nonsense generates infinitely many nils*. Typed nils go so far as to break transitivity of equality: https://play.golang.org/p/nRH6yJV0d6e

    * An example: https://golang.org/doc/faq#nil_error

  76. […] First, what is nil? It’s just null as in other programming languages. But most other languages treat nulls as special values rather than an absence of the them. There’s a great talk about this, check it out here: NULL: The worst mistake of computer science? (2015). […]

  77. […] First, what is nil? It’s just null as in other programming languages. But most other languages treat nulls as special values rather than an absence of the them. There’s a great talk about this, check it out here: NULL: The worst mistake of computer science? (2015). […]

  78. We replace it with Option in Rust… now instead of people having crashes everywhere due to null pointers, they have crashes everywhere due to people writing Option unwrap() everywhere. It has basically gone from having optional ‘if null’ guards at the beginning of every C++ function that takes a pointer, to “assert(null)” enforced by the compiler. That’s basically what unwrap() is: assert on null.
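    TypeScript offers the same assert-on-null escape hatch as unwrap(): the postfix ! non-null assertion. A sketch with an invented lookup function:

```typescript
// `!` is the same gamble as Rust's unwrap(): "trust me, it's there".
function lookup(id: number): string | undefined {
  return id === 1 ? "found" : undefined;
}

// Disciplined path: handle the absent case explicitly.
const a = lookup(1);
const lenA = a !== undefined ? a.length : 0;

// unwrap()-style path: `!` silences the compiler, so when the promise
// is false the crash just moves to the use site.
const b = lookup(2)!;
// b.length   // would throw TypeError at runtime
```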

    In the end the invention of null was as important as the invention of 0. It is different, but it fits into our existing system of data, and if we got rid of it completely we couldn’t do anything. It is how we deal with it that is important.

    And as with 0, it will take us a while as a civilization to “get there”. We still have disagreements over implementations of “division by 0” even thousands of years after it was invented. Some people say it should be NaN, some Infinity, some think you should crash instantly, some just say convert it to 0. Then we have geometric interpretations, like mapping infinity to a finite point in another dimension, through projection.

    As soon as mathematics invented 0, people had to deal with it. As soon as mathematics invented the null set, people had to deal with it. As soon as people invented addresses for things, there were invalid addresses. Even the post office has to deal with NULL, in the dead letter office.

    Inside a computer it’s just a bunch of circuits holding a voltage, after all. There isn’t some system where every single input and output can be thought through entirely, otherwise nothing would ever get done.

  79. […] According to many, null pointers are one of the worst things to happen to computer science. […]

  80. It’s not java.lang.Optional, it’s java.util.Optional.

  81. Interesting article, even though parts seem subjective and exaggerated, there are a few good points.

    I’d advise having a look at Kotlin and C#; both handle that pretty well. Kotlin probably better, even though, for all the care it brought on the compiler side, it is still based on Java bytecode. Thus it suffers from the type erasure issue and could have weak points at runtime that bypass the nullable layer.

    java.util.Optional came too late, and it’s too dependent on adoption, so unfortunately, unless those features are embedded in the language from day 1, there is little that can be done.

    And no matter what care you bring, there will be cases in which NULL is necessary and must be handled manually (see the typical cases of UI callbacks and the necessity of the !! operator in Kotlin, for instance). So ultimately the user must pay attention – it’s their job after all – and NULL will remain a necessary evil.

  82. The key-value store example is ridiculous.
    The problem is not nil. The problem is that your model does not fit the return type of the store API.
    You want a type that represents not-cached, no-number, number, and misuse the store return type that represents not-in-store, valid-value.
    The Some solution just adds another state to the return type to fit your 3-state model.
    What if you need a 4-state return type for your model? Do you use some(some(‘000’))?
    This problem has nothing to do with nil.

  83. This is a great article, thanx for publishing it. But I am not sure we should throw “NULL” under the bus just yet. It seems pretty canonical, and as I grab my copy of the “PDP-11 Peripherals Handbook”, on page B4 at the back is the 7-bit octal representation of the ASCII code, from 000 (NUL) and 001 (SOH) up to octal 177 (which of course is DEL).

    Check my “Gilman and Rose”, “APL – An Interactive Approach”, and page 304 explains how to use NULL to set up data-tables with embedded nulls, which print and it “takes NO time, just as if it isn’t there at all!” (used instead of the “idle” character.) Grab my APL Plus-III for Windows (from Manugistics), and []AV (Quad-AV), the APL “Atomic Vector” (256 chars long now), starts with NUL, which is also []TCNUL (a terminal-control character). Trying to get rid of the NULL character seems like trying to get rid of the “U” in English. (Like the original Latin. Don’t need it, just use “V” instead, right?)

    See, you left out my fav. language, which is APL. A sensible language like APL, which allows one to operate at a higher level of abstraction, avoids most of the issues you describe that can create problems by allowing NULL. A string in APL is a string, and can have any characters (including NULLs). An operator, called “rho” allows one to determine if the string is zero-length. APL data variables can be numeric or characters, and can be extended to any level of dimensionality that one wishes to use.

    Really, you should no more be using “pointers” and mucking around with machine memory addresses than you should be concerned about the voltage levels your CPUs are using. The great mistake in computing was “C”, a weird, low-level retrograde step that is still causing grief.

    APL examples require a special character set. I put up a simple example of using a data-table that contains NULL characters on my little website. It’s on the first page, and shows how the NULL character, since it does not create any output when displayed, can be useful in the construction of simple English sentences which have correct syntax. http://www.gemesyscanada.com

    The APL used is Windows APL from APL-2000 (formerly Manugistics APL, and before that, it was STSC APL.)

    Really, used safely, NULL can be kind of cool. It’s there – but it’s not there. It’s an abstraction, yet it is real. Sorta like love and justice and freedom. 🙂

  84. Good write-up. Though I actually think NULL makes sense in C when developing OS code or writing code for embedded systems. :p

    Also I would point out that x86 hardware, for example, has been able to trap null pointer dereferences for a long time.

    the “worst” mistake? i dunno about that
    Every other language I

  85. Note that C# 8.0 has a nice (optional, due to compatibility issues) feature of making nullability explicit: https://docs.microsoft.com/en-us/dotnet/csharp/nullable-references

  86. I think the article mixed up two different concepts: one is in-band signaling, where NULL or 0 is just the most commonly used signal, and -1 comes second. The second is invalid pointers, for which C-like languages use the value 0.

    In-band signaling is when we overload a value with two meanings: the result of a calculation in normal situations, and a signal of a special situation (usually an error). Item 4 in the article clearly illustrates this phenomenon and the harm it causes, but it doesn’t happen only in programming. Imagine filling out a paper form where you have two spaces for the age. 00 might equally be used for a newborn or when the age isn’t known. Even worse, if you write “unknown” as place of birth, is it because you don’t have this piece of information, or is there a city called Unknown somewhere in the world? If you have structured your data in this way, you can’t blame the tool (be it a programming language or the paper the form is written on) for bad design!

    In-band signaling is everywhere and has nothing to do with coding or languages or even with the NULL value and I think it shouldn’t have been included here.

    Invalid pointers, on the other hand, are specific to languages and their libraries and do depend on choices made early on.

    We often declare functions as returning a pointer type (let’s say a FILE*), meaning by that that they return a memory address. But it’s not true. They return a value that can be an address or zero, which is not a valid address. We then pass this value to other functions that expect the same pointer as input, meaning by that that they expect a valid address. Do you see where the problem is? Same type declaration, different meanings. C calls them in the same way, FILE*, but the two ranges don’t match.

    You wouldn’t make the same mistake in math. For example, you would think twice before using the result of a sin as the argument of a log, because sin ranges from -1 to +1 while log is only defined for positive numbers. Of course you CAN do it, if the boundary conditions are well defined.

    Coming back to programming, if we define A as the set of all valid addresses, and A’ as the union of A and zero, troubles happen when we take a value that belongs in A’ and pass it to a function that expects a value in A. Yet C has no syntax to express this difference.

    As many have commented, overlooking such details is surely a user’s mistake, but C is to blame for allowing A and A’ to be expressed by the same type, and the standard C library for abusing this.

    Now imagine a dialect of C where a pointer can’t be zero. The standard library’s fopen() function can’t be declared as returning a FILE*, because that would exclude zero, while fopen() indeed can return zero in case of error. In this hypothetical C dialect, fopen() would return a new type, the union of FILE* and zero, and this new type wouldn’t be accepted as input for functions that expect a FILE*. You would have to explicitly convert it to FILE*, but the compiler would allow you to do so only after checking that the value effectively is not zero.

    This is exactly what other languages do with the so-called “optional” or “maybe” types. The fopen()-equivalent function returns a maybe type that can’t be passed as-is to other functions; instead it must be checked and converted to a proper pointer before passing it around, an operation often called unwrapping.

    Please note that, even though I singled out the value zero, there’s nothing intrinsically wrong with it. For example, many functions that calculate a positive integer as a result return a negative value in case of error. Again, we are tempted to pass it as-is to other functions that accept an integer as input, and the C language allows it because the type is simply int, not unsigned-int-or-a-negative-signal that can’t be used where a true int is expected.

    I think this is the message Paul intended. Not that NULL is evil, but that some languages make it easy to abuse (and some libraries definitely have abused it), while others have a richer type system that can tell whether a returned value is the expected one or an error signal.

  87. In modern C++, nullptr is a nice thing because it’s opt-in most of the time: objects that can never be in a non-existent state are stored by value and passed by reference, while objects that can be absent are stored as std::optional and passed by pointer.
    Even if a library’s developers decide that all values should be passed to and from the library by pointer, it can easily be wrapped to use references instead.

    So nullptr is a good thing in C++; the bad thing is having invalid states for classes that should not have them (init, isInited, isValid, hasValidObject, you name it).
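    A short sketch of that opt-in style, with invented names: the struct itself has no null state, and only the lookup’s return type admits absence:

```cpp
#include <optional>
#include <string>

struct Connection {
    std::string host;  // always present: no null state to worry about
    int port;
};

// Absence is opt-in: only this lookup's return type admits "no result".
std::optional<Connection> find_connection(const std::string& name) {
    if (name == "primary") {
        return Connection{"localhost", 5432};
    }
    return std::nullopt;
}

// Takes the object by reference: a caller can only reach this function
// with a real Connection in hand, so no null check is needed (or possible).
int effective_port(const Connection& conn) {
    return conn.port;
}
```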

  88. I completely disagree with the notion that nulls are bad in computer science. What is bad is languages’ implementations of them, and developers not accounting for them correctly.

    For my first argument, let’s apply the logic of nulls to a real-life scenario. Imagine you have 3 balls: 1 red, 1 white, and 1 blue. If someone asks you to hand them the green ball, you would not create a new ball with no properties and hand that to them. You would tell them it does not exist, which is essentially what a null represents.

    My second argument debates the idea that nulls create a cluster of time bombs in your application, due to potential null references. To that, my response is: so what? If a null is used where a value is expected, I want an exception thrown at that location. What I don’t want is the application continuing its calculations and returning nonsensical results, leaving me to hunt down the exact reason, which may not be obvious. The null reference exception gives me the exact location of the issue and why it is occurring. If you find yourself with a ton of null references, be a better programmer and null-check. If the language makes that difficult to do, that is an issue with the language, not with the existence of nulls.

    My third argument debates the idea of having to null-check. Again, so what? The checking doesn’t go away without null: the programmer still has to validate an object before executing tasks against it. Imagine a database object that contains an int primary key called “Id”. I query the db for an object. If the db returns null, I check for null and handle that situation. If the db instead returns an empty object, I still need to validate that the id does not equal 0; otherwise I run into the same issue of potentially nonsensical data in the following tasks.

    Basically, nothing is a perfectly valid value. We should not create an empty object out of thin air to represent nothing; logically, that does not make sense. Be better programmers and account for these possibilities, or use languages that handle them better.

  89. I think you have rather missed the point of most of the replacements we suggest. The entire point of getting rid of null, as it is implemented in most programming languages, is to move catching the errors earlier in the programming process, rather than later. The suggested replacements uniformly turn possible null pointer exceptions into compile-time errors rather than run-time errors. Note that compile-time errors also have line numbers attached to them, can be auto-highlighted in the IDE, and are generally easier to reason about.
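    As a sketch of how those compile-time checks work in a language with optional types (C++ here; lookup_name and the other names are invented): passing the unchecked optional simply doesn’t type-check, so the mistake is caught before the program ever runs:

```cpp
#include <cstddef>
#include <optional>
#include <string>

// May or may not produce a name; the type says so explicitly.
std::optional<std::string> lookup_name(int id) {
    if (id == 42) {
        return std::string("Alice");
    }
    return std::nullopt;
}

std::size_t name_length(const std::string& name) {
    return name.size();
}

// name_length(lookup_name(42));  // compile-time error, with a line
//                                // number: optional<string> is not string

// The check the compiler forces on us before unwrapping:
std::size_t safe_length(int id) {
    if (auto name = lookup_name(id)) {
        return name_length(*name);
    }
    return 0;  // the "absent" branch must be handled somehow
}
```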

    As for the “the programmers just need to be good enough” argument: why? Why should we have to be good enough? Why not pawn this kind of nonsense off onto the compiler rather than forcing us to take extra time to think about it every time? Computers are much more reliable about it than people ever will be. That’s rather what computers (and compilers) are for. We could, in principle, still be working in assembly, but that’s a waste of everybody’s time, so we don’t. Make the computer accommodate humans, rather than the other way around.



  92. In this example:

    cache = Store.new()
    cache.set('Bob', Some('801-555-5555'))
    cache.set('Tom', None())

    alice_phone = cache.get('Alice')
    alice_phone.is_some # false, Alice is not in cache
    alice_phone.get…

    Shouldn’t that last line cause a runtime error? I don’t understand how this un-unions the weird static/dynamic error you said NULL causes.

