Artem's blog

Mainly .NET (C#, ASP.NET) and my projects

The Power of Finite Differences in Real World Problems

Introduction

Most of the people are certainly familiar with Calculus already from high school and the beginning of university. It is usually associated with concepts like infinitely smallbehaviour at infinity, finding the area under the graph, and finding the rate of change. It seems to be so much about infinity (which is difficult to imagine) but not that much about finite values. So why not look at finite differences instead? In this article, we are going to see how powerful the concept difference is, and how we can apply it to real world problems. First, we are going to look at how the difference can be found in a sequence. Secondly, the way we can find the next term and later on how we can get a formula to generate the nth term. Finally, we are going to use the idea of finding the nth term to find an expression of the sum.

The difference operation

It is quite intuitive, once we see a sequence, to try to find the difference between adjacent terms. If we are not satisfied, we look at the difference of the differences, and so on. Our aim now is to construct a method that will behave like we normally would in this situation. First, however, let’s look at what happens when we take the difference of each term.

\(a_0\)
\(a_1-a_0\)
\(a_1\) \(a_2-2a_1+a_0\)
\( a_2-a_1\) \(a_3-3a_2+3a_1-a_0\)
\(a_2\) \(a_3-2a_2+a_1\)
\(a_3-a_2\)
\(a_3\)

We can take the difference as many times as we like, but given that a sequence was constructed with only addition, subtraction, multiplication and division i.e. it was a polynomial, there is going to be a point when we can stop taking the difference. The important thing now to realize from this figure is that the difference expression contains the coefficients from the Pascal’s triangle. This is good if we want to take the difference many number of times. The code below utilizes this observation.

Finding the next term

Before we proceed in finding the next term and ultimately the polynomial describing the nth term, let’s focus our attention on a real world example. Say we have the sequence \(\{n^2\}\) i.e. \(1, 4, 9, 16 \dots\). We assume that we don’t know the nth term, i.e. we only know some values of the sequence. The common procedure in this case is to look at the differences, similar to the previous section. For example,

\(1\)
\(3\)
\(4\) \(2\)
\( 5\) \(0\)
\(9\) \(2\)
\(7\)
\(16\)

To be sure that we’ve found a pattern, we should end up at zero (assuming a polynomial generated the sequence). The interesting thing about this figure is that we know that each difference in the table can be expressed in terms of the difference expressions in the previous section. So, \(3=(4)-(1)\) and \(2 = (9)-2\times (4)+(1)\). This idea hints us to a possible way of finding the next term in the sequence. Since we know that the second difference \(2\) is constant, we can find an expression for the next term by some simple arithmetic. That is, \(2=a_n -2\times a_{n-1} + a_{n-2}\).  Thus we can find the nth term given the two terms before it. This is done by the following code:

Finding the expression for the nth term

This is a bit more tricky to find in contrast to what we’ve done so far. The nth term is very dependent on the number of times the difference operation has to be taken before we end up at zero. In the previous example, we had to take it three times to get to zero. The polynomial we got had the second degree. It turns out that if we had to take the difference \(n\) times to get to zero, the polynomial is of the degree \(n-1\). This is quite useful, because if we know how the polynomial looks like, the only thing we need to find are the coefficients (more in-depth tutorial).

In the sequence \(1,4,9,16\dots\), we had to take the difference three times, so the polynomial has the second degree, i.e. \(ax^2+bx+c\). We have three unknowns, so we need three data points to solve the system of equations. That is,

\(\left[ {\begin{array}{*{20}{c}} 1 & 1 & 1 & 1 \\ 4 & 2 & 1 & 4 \\ 9 & 3 & 1 & 9 \end{array}} \right]\)

When we reduce this matrix to echelon form, we get that the coefficient for \(n^2 \) is \(1\), and zero for the remaining terms. The code below does this task.

The coefficients we get correspond to a term in the polynomial. For example, if we get \(3, 2, 1\), the polynomial is \(3x^2+2x+1\).

Finding the closed form for the sum

When we first think about how we can find the closed form for a sum, we might conclude that it is difficult. In Concrete Mathematics – A Foundation for Computer Science, an entire chapter (chp. 2) and a great part of other chapters is dedicated to sums, and the way we can find the anti-difference. At least, that was my thought when I got this idea. But later, an interesting thought came up to my mind, and that is to treat the sum as a sequence and the sequence as the difference of the sums’ terms. Let’s clarify this. If we have the sequence \( 1,2,3,4,\dots\), we can construct the partial sums, i.e. \(1, 1+2, 1+3+4, 1+2+3+4\), thus \(1, 3, 6, 10\). But the partial sums form a new sequence which we can analyse in a similar way. This means that can we can reuse quite a lot of code. Now, since in the original sequence \( 1,2,3,4,\dots\), we have to take the difference twice to get zero, in the new sequence \(1, 3, 6, 10\), we have to take the difference three times. The code for doing this is shorter, in contrast to the previous ones.

Limitations

One of the limitations of this code/algorithm is that we can only use sequences that have been generated by a polynomial. Exponents are not allowed, although \(n^2= n\times n\), so that works. The reason is quite simple. Formally, we define difference as an operator on a function (similar to derivatives) as

\(\Delta f(x) = f(x+1) – f(x)\). Let’s say we have \(f(x)=2^x\). If we take \(\Delta(2^x)=2^{x+1}-2^x=2^x(2-1)=2^x\). Not good, no matter how many times we take the difference, we end up with the same thing we had in the beginning. This is why this particular algorithm does not work in this case. It is possible to add an extra step that checks against known sequences each time a difference is taken. This, I will do a bit later! 🙂

Source code

You can find all additional methods in the FiniteCalculus class in Mathos.Calculus (Mathos Core Library). Also, you can see this code in action here!

Application of knowledge about number bases in problems

During the 2 hours journey to Stockholm from Bjursås, I found an interesting way of illustrating solutions to problems with a certain pattern. In this case, it was a recursion problem. The problem is as following (From Concrete Mathematics – A Foundation for Computer Science 2nd edition):

Problem

Solve the recurrence:

Eqn1

Assume that

Eqn2

Solution

When we look at several cases a pattern is going to be observed quite quickly.

Eqn3

As you might see, the fifth and sixth terms turn out to be the initial conditions that were set in the beginning (see the first and second terms). Therefore, this recurrence will produce a repeating pattern.

The question we face at this stage is how the following can be described as a single expression.

My solution consists of five main steps:

  1. Identify the common expression, i.e. the expression that contains all terms. In this case it’s the third expression. The expression should allow you to use it to express other cases by removing some terms. For example, the second term can be expressed in terms of the third if alpha in the numerator and beta in the denominator are not present. This is achieved by taking the terms that should not be present to the power of zero or multiply by zero, i.e,
    Eqn4
  2. Note down when each term in the common expression is present. If the term is present in specific case, write one, otherwise zero. For example, alpha in the numerator is present in cases 0, 3, 4. It’s also present in case 5, but we don’t take that into account because the pattern continues from the beginning. Now, do the same for all terms in the common expression so that you get an array of 1’s and 0’s for each term:Eqn5.
    You might wonder why there is a set of numbers in front of each term in the numerator and in the denominator, and that each term is raised to a set of 1’s and 0’s. Well, since we are generalizing each case in terms of the common expression, we can now say that if the case is zero, the set should be replaced by the first number in that set; when we deal with case one, we look at the second item, and so on. The reason why the terms in the numerator are multiplied by the set, while the ones in the denominator are raised to it, is because of the different operations. In the numerator, we are dealing with addition, and the only way to get rid of say alpha is by multiplication with zero. In the denominator, multiplying by zero will result an undefined expression. We don’t want undefined expressions, so we raise the terms in the denominator by zero or one. The effect is similar, we get 1 each time any term is raised to it, and it will therefore not affect the value of the fraction.
  3. Observe the pattern in each set: By looking at this as a future computer scientist, a clear pattern was observed – the digits in each set represent coefficients of a number in radix 2. Interesting, maybe we can replace each set by a number instead. {1,0,0,1,1} represents 19, while {0,1,1,1,0} represents 14. This turns out to be a good idea, so we need to find a method that will return each digit in the binary number separately. My version of such function contains subtraction, floor, exponent only and can be found here in pdf.
  4. Understand the new concept: The function I refereed to in step 3 looks as following:

    It contains three parameters and some interesting properties.  Basically, the n is the number in base 10 that we want to convert to base 2, b is the base (in our case it’s 2) and k is the index of the digit in base 2 (so, since 14=1110, then when k=0 this function will return 0). There are both good news and bad news. Let’s start with the bad news. If we want to get a sequence like 0,1,1,1,0, it would imply the number 14. However, when we convert the number 14 back to this sequence, we get 1,1,1,0. In other words, no matter how many zeros we have on the left hand side, this will not affect the number in base 10. That is, 00011 has the value 11, but given 11 we do not automatically get the number of zeros that we started out with. Well, what’s the good news? It turns out that we can exploit a limitation of this function that will allow us to store these missing zeros. The limitation is that it does not support decimal numbers, since when the ratio between n/b^k gets small enough, the floor of it will always be equal to zero. This is a good thing as it allows us to generate additional zeros in the beginning.
  5. Manipulate the expression to fit our needs: In step 4 we got some grasp of the function. Now, let’s put it into a context. Say we want to express {0,0,1,0,1,0,0,1}  using this function (let’s call it delta). If we would use this function, starting at k=0 to k=7 and setting n=41 and b=2, we would get {1,0,0,1,0,1,0,0} – the same thing but in reversed order. So, simply start the loop from k=7 to k=0 and the problem is solved, or start with k=0 to k=7 and change k=7-i, where i=0 to i=7.
  6. Express the problem with the newly introduced concept: The “sets” in the expression in step 2 can be expressed with our new function. In the statements below, it is assumed that the loop starts at i=0 and ends at i=4.
    • {1,0,0,1,1} = δ(19, 2, 4-i)
    • {0,1,1,1,0} = δ(14, 2, 4-i)
    • {0,0,1,1,1} = δ(7, 2, 4-i)
    • {0,0,1,1,0} =δ(6, 2, 4-i)
    • {0,0,0,1,1} =δ(3, 2, 4-i)

    Note, δ(3, 2, 4-i) can be written as δ(3, 2, i) if the loop starts at i=4 and ends at i=0. Therefore,
    Eqn7
    or if we reverse the loop
    Eqn8

Conclusion

Initially, this problem was all about to find a pattern in a recurrence. However, by trying to generalize the pattern in another way, new perspectives of looking at the pattern were found. Now, instead of stating the pattern with words, we can express it in a very condensed form. Moreover, the solution can be said to have several levels of abstractions. We have, in other words, modularized the problem into 1) a function that finds digit in base 2 – “the delta function” and 2) a common expression that uses the function to express the pattern. The group of 1 and 2 can be another abstraction level, i.e. we can now use this to solve other recurrences without going in to details of each individual part.

I think and hope that this way of looking at the problem will allow us to analyse different patterns more in details and find even more patterns that we have not thought about!

Additional links/resources

Ruby code

Mathos Parser is now an expression compiler

The requirement of performing several thousands of calculations (for example, during integration) led to the optimization with pre-scanned expressions. But the aim to make Mathos Parser even faster still remained. Now, I would like to introduce you to a new project of Mathos Project – Mathos Expression Compiler, based on the math tokenizer and parser logic of Mathos Parser.

This API allows you to parse an expression and “store” it as CIL code. Then, using ILAsm, you can generate a new assembly that will contain your expression entirely parsed. The only operation that occurs at runtime is the insertion of values into the variables that you declared and the operation on that variable value. To make it even faster, most of the calculations are already done before the runtime, for example, x+3+5 would be stored as x+8.

In this scenario, the time consideration can be important. Even if it is faster to execute CIL instructions, the compilation process can take some time. This has not been tested yet. The process is illustrated below:

a1

This is a very simplified way of looking at it but the point is to always test Mathos Parser together with Mathos Expression Compiler and make an estimate of how much time would be saved in different scenarios. It all depends on the context. But note, that

A performance hit is incurred only the first time the method is called. All subsequent calls to the method execute at the full speed of the native code because verification and compilation to native code don’t need to be performed again. //p. 12. Jeffrey Richter. CLR via C# 3rd Edition

So, if you plan to execute the function trillion times, it might be a good a idea to keep in mind the Expression Compiler as a better alternative, because at that point, the compilation time will be negligible.

Now, back to the features. It was mentioned in the beginning that it is based on the Mathos Parser. Note, in terms of features, Mathos Expression Compiler has at this point only a subset of those of Mathos Parser. At this point, you cannot have functions as sincos, and your own custom functions, but this will be changed in the nearest future.

There are also some changes to the way operators work. To declare an operator properly, you should use the variables OperatorListOperatorAction, and OperatorActionIL. Below, a simple example of addition operator:

Comparison operators are a bit more tricky. An example of greater than operator is shown below:

For those who understand IL, the @ sign might seem a bit strange. It is simply a way to tell the stack count to the if statement (in the compilation process).

When you actually generate an expression, this is how your IL code for that expression might look:

Will return (only IL for the expression):

When you have the IL code, you should name the output executable(or dll) as MathosILParser, otherwise, you might get an error. In future, an option will be added to customize the name of the output file, but at this point, please use RegEx (or replace option) to change the MathosILParser.exe to something else (in the IL code).

Your function can later by called as shown below. Note, I am not using reflection at all.

I hope this short guide was helpful, even if it might be a bit messy. If you have performed any testing related to speed or if you want to show some examples, please feel free to comment or contact me.

If statement in CIL

The article, Introduction to IL Assembly Languageis a great resource for those of you who would like to understand IL code (kind of like asm but in .net). IL really gives you a different perspective on computer programming, and you must consider more things in comparison to high level languages like C#.

Currently, I am studying conditional statements. Below, an example of a statement that checks whether the two values in the evaluation stack are equal to each other. 

By executing this, the result below would be shown:

ev

The advantages of pre-scanned expressions

RubyInt project, or Iron Ruby Based Mathos Core Library Interpreter was one of the projects to be tested with the new parser feature, that is, pre-scanned expressions.

In the Integral approximation module, it is quite good to use pre-scanned expressions, since the expression stays the same through the entire process. The execution time improved in this case.

The integral approximation website in Mathos Laboratory (see it here) has also shown good results thanks to this optimization. In Mathos Laboratory in particular, it is important that methods are quick and use as small amount of resources as possible, because it is hosted in the Cloud.

I have performed some quick tests using an i3 processor (the methods are in changeset 37964), although I would not recommend them to draw specific conclusions (there is a random error involved so more repetitions of each test are needed). Also, the difficulty level of an expression can affect the tokenization time. The graphs are located below:

mp2

mp_separatly

One conclusion that can be drawn from this is that the optimization only gets useful when the same expression is evaluated more than 100-500 times.

Faster evaluation using a pre-scanned expression

[This article was updated to included changes in v. 1.0.7.2. Originally written for v. 1.0.7.1]

Today, I published a newer version of Mathos Parser on both CodePlex and NuGet. The release notes:

from v.1.0.7.2:

  • Force user to use ReadOnlyCollection when using a token list as a parameter.

from v. 1.0.7.1

  • Added method GetTokens
  • Changed the structure a bit. Now, the Scanner returns tokens instead of continuing with MathParserLogic. This is good if the same kind of expression is to be used many times.

But what does this really mean? Well, previous versions of the parser worked more locally, that is, once one function was executed, it triggered another function, and so on. Now, however, the Scanner (private method) returns a list of tokens instead of launching MathParserLogic directly. If you simply want to get tokens, you would have to execute the GetTokens function, as shown below:

You can then send the tokens list into either Parse or ProgrammaticallyParse and it will parse it in the same way as if you would enter a string expression, that is:

Both statements above will return 32, but there is one main difference; the execution time will be so much smaller for the second statement, where a “pre-compiled” list is being used. If it would take 100 milliseconds (ms) to find the value of the first statement, it would take approximately 9-10 ms for the second one. The ratio is 10:1.

This is great if you want to perform an operation where the same function is involved several thousands times, for example when approximating an integral.

I am currently thinking about converting math expressions into IL code, but this will happen sometime in future!

Recorded a new video about Mathos Core Library Interpreter

The new generation of app protection

In order to emphasize the ability to control an application when online key validation is being used, I have created a poster. Serial Key Manager be found here.

skm_system_all_skgl

Keep everything under control

A new logo

It is important to have a logo to connect different things together (websites, blogs, programs).  After some hours of work I managed to create one:

artemlogo2

Although I had some other drafts, I think this is the one I am going to use!

Creating a programming language with Mathos Parser (2)

In this article, I would like to find out whether it is possible to construct a programming language with Mathos Parser. In general, the question is if a simple expression parser has the necessary things to easily convert it into a language.

The answer is yes, but the language will be very limited if additional features are not added. The reason for this is because what most languages have in common is some sort of  user/machine interaction, while an expression parser only allows you to parse mathematical things and is not really designed to interpret how to write to a file or allocate memory.

For example, imagine you want to create something similar to if statements  and you decide to treat an if as a function. You would use following code:

So, if whatever we pass into the first parameter is true, the value in the second parameter will be returned. This is quite cool, but it does not allow us to return a series of statements. Instead, we could design a new kind of if statement that would, depending on the result from this if function, execute the right method. Here is how I solved it:

As you can see, the if statement is still there, but there is now a feature that will interpret the result and make something happen. You can also use \n to separate between different statements and also name a collection of statements, as shown below:

Note that both examples use a function get(). It can also be implemented as a function:

You might have noticed the use of statements (see notTrue statement). Their role is to simulate a method. Everything you enter into a statement will not be executed until you call it, so you can insert undefined variables into a statement when you define it and later assign them with a value. I have also added commands like write, title, pause, clear and end. These are extensions to the language and are interpreted in the method below:

Some programmers will certainly argue that the code above is not that optimized and is quite primitive. Well, it serves its purpose and makes Mathos Parser look more like a language.

In the next article I want to implement a while feature into this awesome programming language.

Conclusion: Just because something can parse expressions does not mean it can parse a series of statement. In order to fix this we can either choose to add some features externally, or, go into the parser itself and make it more adjusted to this particular task. This language does not alter Mathos Parser in any way.

Downloads: