On strings, methods, return variables and IL code

A few days ago, while reviewing some old code, I found a method that performed an operation on a string passed as an argument, stored the result in the same variable, and returned it at the end.

It looked weird, so I wanted to know what was really happening, and to check whether there is any real difference between overwriting the variable passed as an argument, creating a new variable, or directly returning the result of the call.

For that reason, I created a small sample project and then used ILDasm to see what was really going on under the hood. ILDasm is a disassembler for the Intermediate Language (IL) that the C# compiler emits and the CLR later executes.

Before we start, some quick notes:

  • IL resembles an assembly language, but it is stack based: operands are pushed onto an evaluation stack, and the result of a method call is left on that stack before returning.
  • Arguments and local variables are index based, so ldarg.0 loads the argument at index 0 and stloc.0 stores into the local variable at index 0.
  • The result of calls to external methods is also left on the stack.
  • IL is not the code that finally executes: the .NET JIT compiler translates it to native code at runtime, so the final machine code may be slightly different.

And here is the code!

using System;

class Program
{
    public static string MyFirstCustomFunction(string a)
    {
        a = a.Substring(4);
        return a;
    }

    public static string MySecondCustomFunction(string b)
    {
        return b.Substring(4);
    }

    public static string MyThirdCustomFunction(string c)
    {
        var result = c.Substring(4);
        return result;
    }

    static void Main(string[] args)
    {
        Console.WriteLine(MyFirstCustomFunction("Lorem ipsum dolor sit amet"));
        Console.WriteLine(MySecondCustomFunction("Lorem ipsum dolor sit amet"));
        Console.WriteLine(MyThirdCustomFunction("Lorem ipsum dolor sit amet"));
    }
}

Let’s start with the first method. To inspect it, launch ILDasm from the Visual Studio command prompt and load the generated executable (located in bin/Debug inside the project folder).

Here we can see the IL for the first method:

.method public hidebysig static string  MyFirstCustomFunction(string a) cil managed
{
// Code size       16 (0x10)
.maxstack  2
.locals init ([0] string CS$1$0000)
IL_0000:  nop
IL_0001:  ldarg.0
IL_0002:  ldc.i4.4
IL_0003:  callvirt   instance string [mscorlib]System.String::Substring(int32)
IL_0008:  starg.s    a
IL_000a:  ldarg.0
IL_000b:  stloc.0
IL_000c:  br.s       IL_000e
IL_000e:  ldloc.0
IL_000f:  ret
} // end of method Program::MyFirstCustomFunction

What we are seeing can be summarized in the following points:

  • At the beginning, a local variable is declared whose type matches the return type in the method header. This variable, at index 0, will hold the return value of the method.
  • Next, the argument is loaded onto the stack (ldarg.0); in this case there is a single argument.
  • Before calling Substring we must also push its parameter, a 4-byte integer with value 4 (ldc.i4.4).
  • Then Substring is called (callvirt), specifying both the assembly and the fully qualified name of the String class. The call consumes both values and pushes its result onto the stack.
  • After the call, the value is popped from the stack and stored back into the argument (starg.s), replacing the reference to the original string.
  • The argument is loaded onto the stack again (ldarg.0) and stored in local variable 0 (stloc.0), the return variable.
  • Finally, before returning, the return variable is pushed onto the stack (ldloc.0) so the caller can access it.
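
To make the steps above more concrete, here is a minimal sketch that replays them in plain C#. It is only an illustration: the class name EvaluationStackSketch, the Stack<object> standing in for the evaluation stack, and the local0 variable standing in for local slot 0 are all assumptions of this sketch, not how the CLR is actually implemented.

```csharp
using System;
using System.Collections.Generic;

static class EvaluationStackSketch
{
    // Replays, instruction by instruction, the IL of MyFirstCustomFunction.
    public static string Run(string a)
    {
        var stack = new Stack<object>(); // stands in for the evaluation stack
        string local0;                   // stands in for local slot [0]

        stack.Push(a);                        // ldarg.0
        stack.Push(4);                        // ldc.i4.4
        int start = (int)stack.Pop();         // callvirt pops its operands...
        string target = (string)stack.Pop();
        stack.Push(target.Substring(start));  // ...and pushes the result
        a = (string)stack.Pop();              // starg.s a
        stack.Push(a);                        // ldarg.0
        local0 = (string)stack.Pop();         // stloc.0
        stack.Push(local0);                   // ldloc.0
        return (string)stack.Pop();           // ret
    }

    static void Main()
    {
        Console.WriteLine(Run("Lorem ipsum dolor sit amet")); // m ipsum dolor sit amet
    }
}
```

The nop and br.s instructions are left out because they do not move any data.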

Some instructions, like br.s and nop, are there because in Debug mode the compiler adds extra instructions to make step-by-step debugging easier; there is a discussion on Stack Overflow about the subject.

As we can see, the same value is loaded and stored repeatedly, which may not be necessary at all.

Let’s jump into the second method:

.method public hidebysig static string  MySecondCustomFunction(string b) cil managed
{
// Code size       13 (0xd)
.maxstack  2
.locals init ([0] string CS$1$0000)
IL_0000:  nop
IL_0001:  ldarg.0
IL_0002:  ldc.i4.4
IL_0003:  callvirt   instance string [mscorlib]System.String::Substring(int32)
IL_0008:  stloc.0
IL_0009:  br.s       IL_000b
IL_000b:  ldloc.0
IL_000c:  ret
} // end of method Program::MySecondCustomFunction

As we can see, it begins in the same way, but after the call to Substring the result is stored straight from the stack into the return variable, with no extra copying and without overwriting the argument.

This looks like a more efficient way of working, because we save an extra read/write operation.

Let’s see what happens in the last case, which uses an extra variable defined inside the scope of the method:

.method public hidebysig static string  MyThirdCustomFunction(string c) cil managed
{
// Code size       15 (0xf)
.maxstack  2
.locals init ([0] string result,
[1] string CS$1$0000)
IL_0000:  nop
IL_0001:  ldarg.0
IL_0002:  ldc.i4.4
IL_0003:  callvirt   instance string [mscorlib]System.String::Substring(int32)
IL_0008:  stloc.0
IL_0009:  ldloc.0
IL_000a:  stloc.1
IL_000b:  br.s       IL_000d
IL_000d:  ldloc.1
IL_000e:  ret
} // end of method Program::MyThirdCustomFunction

The first notable difference is in the local variable declaration, which now defines a second string variable to hold our intermediate value.

The main difference from the first method is that the argument is not written or read again, but, as we are saving the result in a variable before returning it, we have the same double read/write problem as in the first case.

To sum up, if we directly return the result of a call instead of assigning it to a variable, we avoid the double read/write, at least in the IL of a Debug build; the JIT compiler may still optimize the final machine code. The third option, while it looks interesting, declares another local variable and therefore needs an extra slot in the method's stack frame.
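
The same comparison can also be made without ILDasm, by asking reflection for the size of each method body. The following is a minimal sketch, not part of the original sample project: the IlInspector class and its First, Second and Third methods are hypothetical stand-ins for the three methods above. The numbers it prints depend on the compiler version and on whether the project is built in Debug or Release, so no expected output is given.

```csharp
using System;
using System.Reflection;

static class IlInspector
{
    // Hypothetical copies of the three variants discussed in the article.
    public static string First(string a) { a = a.Substring(4); return a; }
    public static string Second(string b) { return b.Substring(4); }
    public static string Third(string c) { var result = c.Substring(4); return result; }

    // Returns the IL size in bytes and the number of local variable slots
    // of a public static method declared on this class.
    public static (int IlSize, int LocalCount) Inspect(string methodName)
    {
        MethodInfo method = typeof(IlInspector).GetMethod(
            methodName, BindingFlags.Public | BindingFlags.Static);
        MethodBody body = method.GetMethodBody();
        return (body.GetILAsByteArray().Length, body.LocalVariables.Count);
    }

    static void Main()
    {
        foreach (var name in new[] { "First", "Second", "Third" })
        {
            var (size, locals) = Inspect(name);
            Console.WriteLine($"{name}: {size} bytes of IL, {locals} local(s)");
        }
    }
}
```

Running it once in Debug and once in Release is a quick way to see how many of the extra loads and stores come from the debugging support rather than from the source code itself.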

TinyBmp: A small program for generating images

We recently bought a new LED display, like those you can find in train stations, to keep track of the current release number (important when you are developing super-secret features). The screen contains a 21×7 matrix, so there is not much room to work with.

The good news is that the screen supports rendering bmp images, so I figured that creating a program to generate this kind of image could be fun, and that’s what you can find in TinyBmp.

It contains three different projects in C#:

  • A class library that does the magic.
  • A Windows desktop app.
  • A console app, for automating the image generation every time we ship a release.

The output is a 21×7 image in bitmap format (bmp).

I know that C# can natively generate images from text, but it is not the best choice at such low resolutions because the numbers get distorted. In this approach I’ve drawn the numbers manually, pixel by pixel, using a 2×4 matrix for each number.

This has some drawbacks, for example, 0 and 8 are represented by the same character, but I’m working on it.
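
As a rough illustration of the idea, here is a minimal sketch that draws digits into a 21×7 grid from hand-made glyphs. It is not the TinyBmp code: the TinyGlyphSketch class, the glyph shapes, the reading of 2×4 as 2 pixels wide by 4 pixels tall, the one-pixel top margin and the one-column gap between digits are all assumptions made for this example, and it prints the grid as text instead of writing a bmp file.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class TinyGlyphSketch
{
    const int Width = 21, Height = 7;          // size of the LED matrix
    const int GlyphWidth = 2, GlyphHeight = 4; // assumed glyph size

    // Hypothetical glyphs, one string per row; '#' marks a lit pixel.
    static readonly Dictionary<char, string[]> Glyphs = new Dictionary<char, string[]>
    {
        ['0'] = new[] { "##", "##", "##", "##" },
        ['1'] = new[] { ".#", ".#", ".#", ".#" },
        ['7'] = new[] { "##", ".#", ".#", ".#" },
        ['8'] = new[] { "##", "##", "##", "##" }, // same shape as '0'
    };

    // Draws the text left to right; characters without a glyph leave a blank cell.
    public static bool[,] Render(string text)
    {
        var pixels = new bool[Height, Width];
        int x = 0;
        foreach (char ch in text)
        {
            if (x + GlyphWidth > Width) break; // no room left on the display
            if (Glyphs.TryGetValue(ch, out string[] rows))
            {
                for (int row = 0; row < GlyphHeight; row++)
                    for (int col = 0; col < GlyphWidth; col++)
                        pixels[row + 1, x + col] = rows[row][col] == '#';
            }
            x += GlyphWidth + 1; // one blank column between characters
        }
        return pixels;
    }

    // Turns the grid into lines of '#' and '.' for a quick visual check.
    public static string ToText(bool[,] pixels)
    {
        var sb = new StringBuilder();
        for (int y = 0; y < Height; y++)
        {
            for (int x = 0; x < Width; x++)
                sb.Append(pixels[y, x] ? '#' : '.');
            sb.AppendLine();
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.Write(ToText(Render("170")));
    }
}
```

With only two columns per glyph there is no pixel left to separate a 0 from an 8, which is where the drawback mentioned above comes from in this sketch.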

You can find the code at https://github.com/rlbisbe/tinybmp under the MIT license.

If you run into any issue or have an idea for an improvement, please let me know, or if you hacked the code and made it better, send me a pull request!