In a previous post I introduced the libpq PostgreSQL C library and used it to create a database, a few tables and a view. In this post I will demonstrate inserting, updating, deleting and selecting data using the database created in the previous post.
For this post I will write a simple implementation of a 1-dimensional cellular automaton in C. The concept of cellular automata has existed since the middle of the 20th century and has grown into a vast field with many practical and theoretical applications.
A cellular automaton consists of any number of "cells" arranged in 1, 2, 3s or more dimensions. Each has a state associated with it (in the simplest case just on or off) and each cell and therefore the entire automaton transitions from one state to the next over time according to a rule or set of rules.
The number of dimensions, the number of possible cell states, and the rules can become arbitrarily large and complex but for this project I will implement the simplest type of one-dimensional cellular automaton, known as an elementary cellular automaton.
I recently wrote a post on Pascal's Triangle and in this post I will write a program in C to implement a Galton Board simulator, a Galton Board being an actual physical gadget following the Pascal Triangle's probabilistic characteristics.
A while ago I wrote an implementation of the Soundex Algorithm which attempts to assign the same encoding to words which are pronounced the same but spelled differently. In this post I'll cover the Levenshtein Word Distance algorithm which is a related concept measuring the "cost" of transforming one word into another by totalling the number of letters which need to be inserted, deleted or substituted.
The Levenshtein Word Distance has a fairly obvious use in helping spell checkers decided which words to suggest as alternatives to mis-spelled words: if the distance is low between a mis-spelled word and an actual word then it is likely that word is what the user intended to type. However, it can be used in any situation where strings of characters need to be compared, such as DNA matching.
I started off calling this project "Calculating Pi" but soon realised that I needed to rename it "Estimating Pi". Pi is an irrational number starting off 3.14159 and then carrying on for an infinite number of digits with no pattern which anybody has ever discovered. Therefore it cannot be calculated as such, just estimated to (in principle) any number of digits.
The C99 standard introduced the _Bool type as well as stdbool.h which allows you to use bool, true and false. _Bool uses a byte to store true/false, yes/no, on/off or whatever the semantics of your program might be, but of course you only really need 1 bit so 7 bits are wasted. Most of the time this isn't worth worrying about but in some rare cases where you need a lot of Booleans it might be worthwhile looking into ways of being a bit more efficient with memory usage. This article presents my particular approach to doing this.
This post will demonstrate programs which each perform a single very specific task but which can be chained together in such a way that the output of one forms the input of another. Connecting programs like this, or piping to use the correct terminology, enables more complex workflows or processes to be run.
This post includes just three short programs to carry out the following tasks:
The numbers in the graphic below form the first five rows of Pascal's Triangle. The first row consists of a single number 1. In subsequent rows, each of which is has one more number than the previous, values are calculated by adding the two numbers above left and above right. For the first and last values in each row we just take the single value above, therefore these are always 1.
Pascal's Triangle in its conventional centred layout
(If you do an image search for Pascal's Triangle you will find plenty more, most frequently in a honeycomb grid, sometimes animated, but always nicer than mine. I'm a software engineer, not a graphic designer!)
C is not, of course, an object oriented language and does not even have any discernible features of one. The three core characteristics of object oriented programming are frequently stated to be encapsulation, inheritance and polymorphism, but a more fundamental characteristic is the combining of data (or properties) and the functions (or methods) which work on that data into a single entity which can be used to access both.
This can be simulated in C by adding function pointers to a struct. The nuts and bolts of doing so are not as slick as in a real OOP language such as C++ and frankly look a bit clunky, but in this post I will write a short demonstration of the principle. Whether it is actually worthwhile is a moot point!
Soundex is one of a number of phonetic algorithms, assigning values to words or names so that they can be compared for similarity of pronounciation. It is probably the best know such algorithm as it is built in to most major RDBMSs, as well as PHP and other languages.
It doesn't take much thought to realise that the whole area of phonetic algorithms is a minefield, and Soundex itself is rather restricted in its usefulness. In fact, after writing this implementation I came to the conclusion that it is rather mediocre but at least coding it up does give a better understanding of how it works and therefore its usefulness and limitations.
Wikipedia has a surprisingly brief article on the topic Soundex on Wikipedia which you might like to read.