Artem's blog

Mainly .NET (C#, ASP.NET) and my projects

Archives for University

General thoughts about time management

[Originally published at https://www.kth.se/profile/204929/page/general-thoughts-about-time-management/]

Being able to plan your studies and your time in general is important for staying productive. But why do people really do it? Why do we want to stay productive? One possible answer is the realization of the tasks that have to be done and the time that is at our disposal. Once the number of tasks increases, we soon realize that time is a scarce resource and has to be used wisely. In this post, we are going to look at a purely tactical approach, how it can be affected by time pressure, and finally how university has contributed to a combination of both tactical and strategic thinking and how this relates to knowledge from computer science.

Throughout high school and my first year at university, I've been really into time management. Back then, I think I viewed planning from a tactical perspective, i.e. during the day I had a set of tasks (such as homework) that were split into 45-minute sessions, followed by a 15-minute break, later a 30-minute break, and later a switch of subject, even if I had things left. This was suggested by my ToK teacher Ric Sims. I must admit that I did not follow it all the time, but at least I had my time structured into intervals with breaks. At that time, my approach to homework was pragmatic (as far as I can recall), i.e. I prioritized the tasks and executed them as some sort of TODO list.

Later, once the finals were approaching, I started to do other tasks such as working through past papers, rewriting the syllabus (not a good idea I think, but it allowed me to memorize key concepts), and so on. Suddenly, as the exams got closer, I had detailed plans of what had to be done during a certain month, a certain week, and so forth; strategic thinking was emerging. For the finals, I came up with a detailed schedule to utilize as much time as possible (the Excel file: Finals approaching). To sum up, as time gets scarce, you tend to plan more to ensure that you use it effectively.

Now, at university, planning gets even more important. I tend to plan several months and even years ahead (by specifying goals, a strategy). For instance, last year, my plan was to focus on Multivariable Calculus, a subject that I was not supposed to study in year 1. Now, my plan (this year) is to focus on my project – Serial Key Manager – and apply to Student Inc. Once the direction (strategy) is specified, it is easy to move forward by breaking the goals into smaller ones, ultimately arriving at a concise list of tasks; solving these is a question of a tactical nature. As an example, this week, I tried to write down all the tasks in an Excel sheet, prioritize them, and complete them one by one (see fig 1).

Excel as a way to prioritize tasks

This gave me a clearer picture and allowed me to organize my time more effectively. This weekly plan, however, can be generalized to a monthly plan, a yearly plan and so forth, as well as broken down into smaller components that take time into consideration.

Tactics

Some computer scientists will point out that this kind of reasoning is related to modularization and abstraction. That is, the yearly plan can be thought of like Python, the monthly plan as Java, the weekly plan as C++ and the daily one as Assembler. It's interesting that we can combine our knowledge from different disciplines to achieve great results!

Surely, planning is crucial to a successful use of our time. But it's also important to understand that different planning approaches have to be applied depending on the type of planning (such as strategic or tactical thinking). All of this helps us to cope with tasks considering that time is a scarce resource. However, we should always remember that for a good execution of tasks, we need to be balanced (don't overestimate yourself) and reflective (try to find a strategy that works best for you)!

Plan for a future article: Next time, my intention is to look at ways to manage time while you are executing a task. What other alternatives are there (such as the Pomodoro technique)? How do we reduce procrastination? But for now, please let me know your thoughts about the current post! 🙂

The Beauty of Linear Algebra – Summary

For almost a whole period I've had intense Linear Algebra (some days 4-6 hours), a subject that does not require much prior knowledge to understand. You need to know addition and multiplication, that's it. Yet, the power of linear algebra in real-world problems (and highly theoretical problems) is great. The aim of this post is to give a short overview of the subject, summarizing the basic concepts.

Introduction

Translating problems into matrix equations can be quite useful. It allows us to perform simple tasks such as balancing chemical equations (using Ax=b), taking derivatives of polynomials, approximating solutions to overdetermined systems, rotating objects in any space, finding volumes, interpreting shapes, and so much more. If we think about the very basic concept – matrix multiplication – we simply follow certain rules (a definition) to find the answer. It is quite interesting that we can use this fact to “store” information in matrices, that is, a certain set of actions that have to be performed given a rule. An example of this is differentiation of polynomials (see the ToK marked with blue). Below, I’ve included some examples I constructed using MATLAB.
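
As a quick illustration of “storing” an operation in a matrix, here is a minimal sketch (in Python/NumPy rather than MATLAB) of differentiation of polynomials of degree at most 3 written as matrix multiplication; the matrix and the example polynomial are chosen purely for illustration.

```python
import numpy as np

# Differentiation of polynomials of degree <= 3 as a matrix.
# A polynomial p(x) = a0 + a1*x + a2*x^2 + a3*x^3 is stored as the
# coefficient vector [a0, a1, a2, a3] in the basis {1, x, x^2, x^3}.
D = np.array([
    [0, 1, 0, 0],   # derivative of a1*x     contributes a1   to the constant term
    [0, 0, 2, 0],   # derivative of a2*x^2   contributes 2*a2 to the x term
    [0, 0, 0, 3],   # derivative of a3*x^3   contributes 3*a3 to the x^2 term
    [0, 0, 0, 0],   # the x^3 coefficient of the derivative is always 0
])

p = np.array([5, 2, -1, 4])   # p(x) = 5 + 2x - x^2 + 4x^3
print(D @ p)                  # [ 2 -2 12  0], i.e. p'(x) = 2 - 2x + 12x^2
```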

Euclidean Vector Spaces

Since this is simply a subset of General Vector Space and Inner Product Spaces, the interested reader is advised to read those sections instead.

General Vector Spaces

In this section we were introduced to the concept of vector spaces, basically a generalization of Euclidean vector spaces (e.g. 3D space). Below are some of the questions and focus points:

  • How do we find the shortest distance between two lines in n-space? Answer: the shortest distance is along the perpendicular (by the Pythagorean theorem). When I deal with such problems, I simply pick a general point on each line, form the vector between these points (this expression will contain both \(s\) and \(t\)), and use the power of calculus to find the shortest distance. Note, the calculus approach requires knowledge of multivariable calculus. Otherwise, remember that when the dot product is zero, the vectors are perpendicular (by definition).
  • What is a vector space? It might come as a surprise, but vector spaces are not only about vectors. We can have a polynomial vector space, a matrix vector space, etc. The good thing here is that if we can prove that “something” is a vector space, then we are able to apply theorems for general vector spaces on these “something” vector spaces. We should prove that a) it is closed under addition (\(\vec{u},\vec{v} \in V \implies \vec{u}+\vec{v}\in V\)), i.e. by adding vectors we never leave the vector space V, and b) that it is closed under scalar multiplication (\(\vec{v}\in V \implies k\vec{v}\in V, k \in \mathbb{R}\)). A trick to show that something isn’t a vector space is to set the scalar to zero. Lastly, the beautiful thing about mathematics is that many times it’s all about definitions. If I want, I can come up with an addition that behaves like multiplication, etc. It’s up to you to decide (we will later see in Inner Product Spaces that we can define the “dot” product as well).
  • A subspace? As the name suggests, it’s a vector space that is inside another vector space. Same applies to these also; they should be closed under addition and scalar multiplication.
  • Linearly dependent/independent? This is all about being able to express a vector as a linear combination of other vectors. If no vector in a vector space \(V\) is expressible as a linear combination of the other vectors in \(V\), then these vectors are linearly independent. Ex. If a vector space is spanned by the vectors \(\{v_1,v_2,v_3\}\), then if we are unable to express these vectors in terms of each other, they are said to be linearly independent. Otherwise, they are linearly dependent. Recall the cross product. We used it in “normal 3d space” (or more academically, an orthonormal system with 3 basis vectors) to find a vector that is perpendicular to two given vectors. NOTE: Linearly independent vectors don’t need to be perpendicular, which is the case with the cross product.
  • Basis? Above, we saw the use of a basis without a proper definition. Not good. Basically, a basis is what makes up a vector space. It’s a set of vectors that are required to a) be linearly independent and b) span V (all vectors in V should be expressible as a linear combination of the basis vectors). To test whether a set of candidate basis vectors is linearly independent, simply put them into a matrix (as rows or columns) and evaluate the determinant. If it isn’t equal to zero, the vectors are linearly independent.
  • Column space/row space/null space? Ok, we are getting onto highly theoretical ground (at least at this stage, it might seem so). Given an \(m\times n\) matrix \(A\): the column space is the subspace of \(\mathbb{R^m}\) that is spanned by the column vectors of \(A\). The row space, as the name suggests, is the subspace of \(\mathbb{R^n}\) that is spanned by the row vectors of \(A\). The null space is the solution space (recurring definition) of \(Ax=0\). A great theorem states that a system of linear equations \(Ax=b\) is consistent if and only if \(b\) is in the column space of \(A\) (\(b\) is expressible as a linear combination of the column vectors of \(A\)). “The proof is similar to [..] and is left as an exercise to the reader“(book).
  • Theorems? Yes, it turns out that there is a formula relating the dimension of the column space (\(rank(A)\)) and the dimension of the null space (\(nullity(A)\)). For a matrix with \(n\) columns, we have \( rank(A)+nullity(A)=n\). I would also like to point out that the dimension of the column space is equal to the dimension of the row space (a small numerical sketch of this, together with the determinant test above, follows this list).
  • Orthogonal complements? A good relationship exists between the row space and the null space, and that is that they are orthogonal complements of each other. Orthogonal is a fancy way of saying perpendicular.
  • Is it possible to change basis vectors? Yes, this is perfectly fine. We use a matrix \(T\) that functions as a converter (ToK: Discuss the importance of representing a set of instructions in a more abstract form). Here’s the relationship: $$(\vec{v})_{B’}= {}_{B’}{T_B}\,(\vec{v})_{B}$$If we know the basis vectors of \(B’\) and \(B\), we can get the transition matrix by putting the basis vectors as columns in \([B’|B]\) and performing elementary row operations until we get to \([I|{}_{B’}{T_B}]\). In general, $$[new|old]\sim \text{el. row op.}\sim [I|old\to new]$$
  • How does the notion of functions apply to vector spaces? From high school, many of us are familiar with the fact that a function maps one value to another value. This can be applied to vector spaces. Again, it’s always good to have a solid definition. We say that \(T:V\to W\) is a function from vector space V to W; then T is a linear transformation if the following criteria are fulfilled: a) \(T(k\vec{v})=kT(\vec{v})\) and b) \(T(\vec{u}+\vec{v}) = T(\vec{u})+T(\vec{v})\). The good thing about being able to transform one vector into another is that when this is put into a computer, we can do all sorts of cool things. For example, we can reflect, project, and rotate objects. We can also contract and dilate vectors, and this can be expressed in matrix form. Sometimes, we might want to perform two things at once, say reflect and rotate; then we simply multiply the transformation matrices together in this order: rotate*reflect. Always think from right to left.
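
To make the determinant test and the rank-nullity relation above a bit more concrete, here is a minimal NumPy sketch; the matrices are made up purely for illustration.

```python
import numpy as np

# A made-up 3x4 matrix to illustrate rank(A) + nullity(A) = n.
A = np.array([
    [1, 2, 0, 1],
    [0, 1, 1, 2],
    [1, 3, 1, 3],   # this row is the sum of the first two, so the rank drops to 2
])

rank = np.linalg.matrix_rank(A)
n = A.shape[1]
print(rank, n - rank)        # rank(A) = 2, so nullity(A) = 4 - 2 = 2

# The determinant test for a basis: put the candidate vectors into a matrix
# (here as columns) and check that the determinant is non-zero.
B = np.array([[1, 0, 1],
              [2, 1, 3],
              [0, 2, 3]])
print(np.linalg.det(B))      # ~1 (non-zero), so the three vectors form a basis of R^3
```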

Inner Product Spaces

The fact that it is possible to generalize the notion of vector spaces suggests that the same can be done for operations that are performed inside a vector space.

  • What is an Inner product space (real vector space)? By definition, this is a vector space where we have an inner product with the following properties.
    1. \(<\vec{u}|\vec{v}>=<\vec{v}|\vec{u}>\)
    2. \(<\vec{u}|\lambda\vec{v}>=\lambda <\vec{u}|\vec{v}>\)
    3. \(<\vec{u}|\vec{v} +\vec{w}>= <\vec{u}|\vec{v}> + <\vec{u}|\vec{w}>\)
    4. \(<\vec{u}|\vec{u}> \ge 0\) and \(<\vec{u}|\vec{u}> = 0 \iff \vec{u}=0\)

    This is essentially the dot product in \(\mathbb{R^n}\). The good thing about generalizing it is that “something” does not necessarily need to be in \(\mathbb{R^n}\) to be an inner product space. For example, we can have an inner product space of all continuous functions on \([a,b]\) (denoted by \(C[a,b]\)). Then, the inner product is defined as: $$<f(t)|g(t)> = \int_a^b f(t)g(t)dt$$Since polynomials are continuous functions, this definition applies to polynomial inner product spaces.

  • What is perpendicularity (orthogonality)? In a vector space with an inner product \( <\vec{u}|\vec{v}>\), the angle between \( \vec{u}\) and \( \vec{v}\) is defined by $$\cos \theta = \frac{{ < \vec u|\vec v > }}{{||\vec u||\,||\vec v||}}$$The Cauchy–Schwarz inequality is quite useful and is given by: \(<\vec{u}|\vec{v}>^2\le ||\vec{u}||^2||\vec{v}||^2\)
  • How to project a vector on a subspace of an inner product space? Since we have a generalized version of the dot product, it’s possible to generalize the projection of a vector onto a subspace. Here’s how: given an orthogonal (this is crucial) basis \(\{ {{\vec v}_1} \ldots {{\vec v}_n}\} \) for a subspace \(W\) of \(V\), and \(\vec{u} \in V \), then: $${{\mathop{\rm Proj}\nolimits} _W}\vec u = \frac{{ < \vec u|{{\vec v}_1} > }}{{||{{\vec v}_1}|{|^2}}}{{\vec v}_1}  +  \ldots  + \frac{{ < \vec u|{{\vec v}_n} > }}{{||{{\vec v}_n}|{|^2}}}{{\vec v}_n}$$
  • How to find an orthogonal basis? Note that the projection formula above only works if the basis is orthogonal. In order to find an orthogonal basis, we can use the Gram-Schmidt process.$$\begin{array}{l}
    \text{Step 1: } {{\vec v}_1} = {{\vec u}_1}\\
    \text{Step 2: } {{\vec v}_2} = {{\vec u}_2} - \frac{{ < {{\vec u}_2}|{{\vec v}_1} > }}{{||{{\vec v}_1}|{|^2}}}{{\vec v}_1}\\
    \text{Step 3: } {{\vec v}_3} = {{\vec u}_3} - \frac{{ < {{\vec u}_3}|{{\vec v}_1} > }}{{||{{\vec v}_1}|{|^2}}}{{\vec v}_1} - \frac{{ < {{\vec u}_3}|{{\vec v}_2} > }}{{||{{\vec v}_2}|{|^2}}}{{\vec v}_2}\\
    \vdots \\
    \text{(repeat in the same way for all } r \text{ vectors)}
    \end{array}$$ Remember that $$ \vec u = {{\mathop{\rm Proj}\nolimits} _W}\vec u + {{\mathop{\rm Proj}\nolimits} _{{W^{\bot}}}}\vec{u}$$
  • How to find the “best” solution in an over-determined system? A good approach is to apply the least squares method. Over-determined systems occur in scientific measurements, and so if the error is assumed to be normally distributed, we can use the method of least squares. The idea is to think about the system’s column space and then project the measurements onto the plane that the column space spans. It turns out that the best solution is $$x = {({A^T}A)^{-1}}{A^T}b$$ Keep in mind that it’s not required to have an orthogonal basis (i.e. the column space does not have to be orthogonal). Using this, we can express the projection formula as a simple matrix multiplication $${{\mathop{\rm Proj}\nolimits} _L}\vec u = A{({A^T}A)^{-1}}{A^T}b$$An interesting property that can be obtained is that \(A^TA\) is invertible if and only if the column vectors of A are linearly independent (a small sketch of Gram-Schmidt and least squares follows this list).
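
To make the last two items concrete, here is a minimal NumPy sketch of Gram-Schmidt and of least squares via the normal equations \(x = (A^TA)^{-1}A^Tb\); the vectors and the measurement data are made up for illustration.

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn linearly independent vectors into an orthogonal basis
    using the standard inner product (the dot product)."""
    basis = []
    for u in vectors:
        v = u.astype(float)
        for b in basis:
            v = v - (u @ b) / (b @ b) * b    # subtract the projection onto b
        basis.append(v)
    return basis

v1, v2 = gram_schmidt([np.array([1, 1, 0]), np.array([1, 0, 1])])
print(v1 @ v2)                               # ~0: the new basis is orthogonal

# Least squares for an overdetermined system Ax = b.
A = np.array([[1, 1],
              [1, 2],
              [1, 3]], dtype=float)          # made-up "measurement" matrix
b = np.array([1.0, 2.1, 2.9])                # made-up measurements
x = np.linalg.solve(A.T @ A, A.T @ b)        # solve the normal equations A^T A x = A^T b
print(x)                                     # intercept and slope of the best-fitting line
```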

Linear Transformations

I think the Swedish term linjär avbildning conveys more information than just stating linear transformation. The term basically means linear depiction, which is what this topic is about. At first sight, this might seem like converting from one basis to another. This isn’t entirely true, I’ve realized. It’s more appropriate to view it as a function that maps one value to another. Since functions can be injective, surjective, or bijective, it is not implied that we should always be able to map a transformed point back to the original point. In contrast, when changing bases, it’s always possible to get from one basis to another; you never really introduce or remove information.

  • How to define Linear Transformation? A linear transformation \(A: V\to W\) should have the following properties:
    1. \( A(\vec {u} + \vec {v}) = A(\vec {u}) + A(\vec {v}) \)
    2. \(A(\lambda \vec u) = \lambda A(\vec u)\)
  • What is Kernel and Image space(range)? If you’ve read what I wrote in General Vector Spaces, the Kernel = Null space and Image space = Column space.
  • Example? Let’s define the following linear transformation $$\begin{array}{l}
    A(1,2,0) = (1,1,1)\\
    A(0,1,2) = (0,1,1)\\
    A(1,3,3) = (0,0,1)
    \end{array}$$Unfortunately, this is not so useful if we want to transform a vector with the matrix A. It would be easier if we had the standard basis vectors on the left-hand side. So, similar to the change-of-basis example, we put this into a matrix and row-reduce until we get an identity matrix on the left-hand side. This method (Martin’s method) was discovered by Martin Wennerstein in 2003. See the formal explanation of the method. (A small numerical sketch follows this list.)
  • Relation between the column spaces of composite linear transformations? Given that \(B\) has full rank, i.e. \(rank(B)=n\), then \(rank(BA)=rank(A)\). Motivation: this becomes clear if we consider what the transformation \(BA\) means. First, we perform transformation \(A\), which leads us to the image of A (\(range(A)\)). That is, all vectors will be depicted on the image space of \(A\). When transformation \(B\) is performed, all vectors in the image space of \(A\) will be transformed using \(B\). Note that \(B\)’s kernel is trivial, i.e. only the zero vector transforms to the zero vector.
  • What’s special about the kernel? It can be useful to think of kernel as “the information that gets lost during a transformation”(KTH student). For example, if the kernel contains more than just a zero vector (i.e. it’s non-trivial), when we perform a transformation, some vectors will be depicted on the zero vector; thus “information” disappears.
  • What are injective, surjective, and bijective? Let \(A:V \to W\). Injective (one-to-one) means \(\vec x \ne \vec x’ \implies A(\vec x) \ne A(\vec x’)\), i.e. each vector in the image comes from a unique vector (so we can map back to the original vector after a transformation). Surjective (onto) means \(\forall \vec y \in W\exists \vec x \in V:\vec y = A\vec x\), i.e. every vector in \(W\) is the image of some vector in \(V\). Bijective is when it’s both injective and surjective.
  • Properties? For a transformation to be injective, the dimension of the kernel has to be zero. For \(A: V\to W\) to be surjective, we need \(\dim \operatorname{im}(A)=\dim(W)\), which by the dimension theorem means \(\dim\ker(A)=\dim(V)-\dim(W)\); this follows from the dimension theorem and the definition of surjectivity.
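
Here is a minimal NumPy sketch of the example above. The row reduction in Martin's method amounts to the same thing as computing the standard matrix \(A = WV^{-1}\), where \(V\) holds the given input vectors as columns and \(W\) the corresponding outputs; this formulation is my own shortcut, not the formal write-up of the method.

```python
import numpy as np

# The transformation is only known on a non-standard basis:
# A(1,2,0) = (1,1,1),  A(0,1,2) = (0,1,1),  A(1,3,3) = (0,0,1).
V = np.array([[1, 0, 1],
              [2, 1, 3],
              [0, 2, 3]], dtype=float)   # input vectors as columns
W = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)   # corresponding outputs as columns

A = W @ np.linalg.inv(V)                 # since A V = W, the standard matrix is W V^{-1}
print(A @ np.array([1, 2, 0]))           # -> [1. 1. 1.], as expected
print(A @ np.array([1, 3, 3]))           # -> [0. 0. 1.]
```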

Eigenvalues

An eigenvalue is the value \(\lambda\) in \(A\vec x = \lambda \vec x\). It can, for instance, be applied in problems such as a) finding $$ {A^{1000}}\left( \begin{array}{l} 2\\ 1 \end{array} \right) $$given that we know A, or b) expressing \(2x_1^2+ 2x_2^2+2x_3^2+4x_1x_2\) (see this) without the cross product terms. The latter is particularly useful when identifying shapes that have been rotated/transformed.

  • How to find eigenvalues? Simply solve \( \det (A – \lambda I) = 0\), the characteristic equation.
  • How to find the eigenvectors corresponding to eigenvalues? Solve \((A – \lambda I)\vec x = 0\).
  • Express a matrix using eigenvalues and eigenvectors? If \(A\) is diagonalizable, we can express it as \(A = PD{P^{-1}}\), where the columns of \(P\) are eigenvectors and \(D\) is a diagonal matrix of the eigenvalues. If P happens to be orthogonal (i.e. its columns form an orthonormal basis), then we can express it as \( A = PD{P^T}\) because of a property of orthogonal matrices that states that \(A{A^T} = I\), or equivalently that \( {A^{-1}} = {A^T}\).
  • Raising matrices to a certain power? It can be shown that \({A^k} = P{D^k}{P^{-1}}\) (see the sketch after this list).
  • Applying orthogonal diagonalization to a quadratic form? $$a x_1^2 + b x_2^2 + c x_3^2 + d{x_1}{x_2} + e{x_1}{x_3} + f{x_2}{x_3} = ({x_1},{x_2},{x_3})\left( {\begin{array}{*{20}{c}} a&{d/2}&{e/2}\\ {d/2}&b&{f/2}\\ {e/2}&{f/2}&c \end{array}} \right)\left( \begin{array}{l} {x_1}\\ {x_2}\\ {x_3} \end{array} \right)$$Now, this can be seen as \(\vec x^T A \vec x\). If we make the substitution \(\vec x = P\vec y\) such that \(P\) orthogonally diagonalizes \(A\), then $$ {\vec x^T}A\vec x = {(P\vec y)^T}A(P\vec y) = {\vec y^T}({P^T}AP)\vec y$$
  • Orthogonal matrices? In an orthogonal matrix, the row vectors form an orthonormal basis (the same holds for the column vectors). Interesting properties that hold when multiplying a vector by an orthogonal matrix \(A\) are $$||A\vec x||=||\vec x||$$ and $$A\vec x \cdot A \vec y=\vec x \cdot \vec y$$Then, as mentioned previously, the inverse of an orthogonal matrix is simply its transpose (please see my answer for a possible application).
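
As a small sketch of the diagonalization items above, here is a NumPy example with a made-up symmetric matrix (so that \(P\) is orthogonal and \(P^{-1}=P^T\)):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # symmetric, hence orthogonally diagonalizable

eigvals, P = np.linalg.eigh(A)             # eigenvalues and orthonormal eigenvectors (as columns)
D = np.diag(eigvals)
print(np.allclose(P @ D @ P.T, A))         # True: A = P D P^T

# Raising A to a power via A^k = P D^k P^T: only the diagonal needs to be powered.
A10 = P @ np.diag(eigvals ** 10) @ P.T
print(np.allclose(A10, np.linalg.matrix_power(A, 10)))   # True
```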

A book for young computer scientists

Today, I published one of my summer projects – a book (a chapter) – as open source on GitHub. The reason why it’s open source, the purpose and goal, and finally how I got started are described below.

Why open source?

The initial aim of the book was to make computer science mathematics more accessible, which shaped it quite a lot. First of all, the programming language used is JavaScript, which I believe is the one most people have access to. Not all languages are like that, and some you have to pay for. JavaScript, on the other hand, can be accessed on all machines (computers, phones) that have access to a web browser. Secondly, the book can, from now on, be accessed free of charge. The code (written in TeX) is also available. I hope that this will allow an even greater audience to access it. Thirdly, each chapter (right now there is only one) describes its subject from scratch. It’s not assumed that you have any prior knowledge to be able to understand it.

Purpose and goal

I’ve partly described it above, but there is more to it. There are two groups the book targets: young programmers and non-programmers. Regarding the first group, my experience tells me that many young programmers tend to skip the underlying mathematical concepts behind some computer operations and instead focus on the code (in some cases, you don’t even need to code that much). The latter group will still find many concepts interesting, and my goal is to show that computer science is fun and not as scary as they might think.

How it started

The first time I came across this idea was when I was contacted by an editor at Apress in the beginning of November 2013. Before that, I had recently published a short course reference for the Algorithms via C# course, which I haven’t had the opportunity to start properly (see KSDN). After a consultation with one of my teachers, I decided to stay focused on the diploma to get good grades and work on the book later. So, sometime in June I started writing the chapter about modular arithmetic, a topic I thought I knew a lot about. But when I started writing, I had to keep doing research and learning the concepts (from several sources) before writing them down. Since I want to write it at my own pace (when I have time) and want people to have access to it, it’s not published as a traditional book but rather as a digital one.

More information

“Trits” instead of “bits” – a short introduction to balanced ternary

It’s usually said that students who would like to study computer science should familiarize themselves with binary representation of numbers and thus get used to very common powers of 2, i.e. 2¹=2, 2²=4, 2³=8, etc. That’s definitely true; the digital world we live in uses bits and almost all programming languages are based on Boolean algebra. But, just because something is popular does not mean it’s good. In this short article, my aim is to give the reader a general understanding of the concept of balanced ternary, the benefits of using it, and finally some history of earlier attempts to create ternary computers.

Intro

The concept of balanced ternary builds on several underlying ideas, so let’s try to break it down into things we can relate to. The word ternary, in this context, refers to the ternary numeral system. A numeral system is a way for people to express numbers: the ancient Greeks used letters with a superscript mark to denote numbers, e.g. α’, β’, γ’. In the Roman Empire, they used I, II, III, IV (we still use these numerals in names, e.g. Charles XII). Many people nowadays use the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 (originating from the Hindu-Arabic numerals). This way of writing numbers is referred to as base 10 or simply decimal. A ternary numeral system is similar to decimal, but allows only the digits 0, 1, 2. In binary (base 2), we only allow 0, 1.

Decimal (Base 10):  0  1  2   3   4   5   6   7   8    9   10
Ternary (Base 3):   0  1  2  10  11  12  20  21  22  100  101

A clear observation from this table is that once we run out of digits (e.g. going from 9 to 10), we put a zero and carry a 1 forward. The same thing happens in base 3; however, since we only have 0, 1, 2, we run out of digits faster than in base 10.

There exists an algorithm to convert between arbitrary bases, but that is better suited for another article, so it’s not going to be discussed here (convert numbers). However, it’s quite simple to convert from base 3 to base 10 by an interesting observation. For example, 123 is equivalent to saying that 123 = 1*10²+2*10¹+3*10⁰ = 1*100+2*10+3. In base 3, we have a similar approach, but instead of 10 we use 3. Say we want to convert 22 (in base 3) to base 10. Then 2*3¹+2*3⁰ = 2*3+2 = 8 (in base 10). This agrees with the table above.
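
As a quick sketch of this observation (Python, just for illustration):

```python
def ternary_to_decimal(digits):
    """Convert a base-3 number given as a string, e.g. "22", to base 10."""
    value = 0
    for d in digits:
        value = value * 3 + int(d)   # shift everything one position and add the next digit
    return value

print(ternary_to_decimal("22"))      # 8, as in the example above
print(ternary_to_decimal("101"))     # 10, matching the table
```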

Now, we should have some grasp of ternary numeral systems. Let’s move on to the term balanced. An intuitive way of thinking about it is that something should be balanced, like masses on opposite sides of a scale that cancel out. So, what would such a system look like? It turns out that instead of using the digits 0, 1, 2 we introduce -1, so that the digits in the balanced ternary system are -1, 0, 1 (looks balanced, doesn’t it?). Let’s refer to -1 as n and 1 as p. The table below links the numbers in base 10 with the ones in balanced ternary.

Decimal (Base 10):           0  1   2   3   4    5    6    7    8    9   10
Balanced Ternary (Base 3):   0  p  pn  p0  pp  pnn  pn0  pnp  p0n  p00  p0p

The pattern here is similar to the one for unbalanced ternary. The interested reader should consider experimenting with this concept by implementing it in their favourite programming language (see all implementations); a minimal sketch follows below.
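
Here is one minimal way to do it in Python (the digit letters n/0/p follow the notation above; the encoding loop is a standard division-with-remainder approach):

```python
def to_balanced_ternary(x):
    """Encode an integer as a balanced ternary string with digits n (-1), 0, p (+1)."""
    if x == 0:
        return "0"
    digits = ""
    while x != 0:
        r = x % 3                    # remainder in {0, 1, 2}
        if r == 0:
            digits = "0" + digits
        elif r == 1:
            digits = "p" + digits    # digit +1
            x -= 1
        else:
            digits = "n" + digits    # remainder 2 means digit -1 and a carry
            x += 1
        x //= 3
    return digits

print(to_balanced_ternary(7))        # pnp, matching the table above
print(to_balanced_ternary(-7))       # npn: the same string with p and n swapped
```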

Benefits

Here are some of the benefits of the balanced ternary:

  • The sign (+/-) is stored in the number, and can be deduced by the leading trit (similar to bit). So, if a number starts with -1, or using our notation, n, we know it’s negative.
  • In order to get the number with the opposite sign, simply replace all n‘s with p‘s and all p‘s with n‘s. Eg. Since 7 is pnp, -7 is npn.
  • In order to round to the nearest integer, the fractional part should be removed.
  • Things don’t have to be either true or false. There exists an unknown case also.

History

It might seem that this idea works in theory but not in real computers. However, this is not true: at Moscow State University, a series of Setun computers was developed, the first of which was built in 1958. An interesting thing to note is that it used ternary logic.

Further reading

For those who are interested, please take a look at the links below:

Edits:

  • Corrected the base10-base3 table.
  • First published 20.12.2014 (dd/mm/yyyy)