Tuesday, March 25, 2008

Object Initializers and LINQ


I first saw C# 3.0's "object initializers" in March 2007, at the Microsoft MVP Summit in Redmond, WA. The feature didn't strike me as a big deal. OK, you can save lots of lines of code
and make your code more readable, but I thought that was about it.
After some time learning about LINQ, lambda expressions, expression trees, etc., I now realize that object initializers are a fundamental and important part of C# 3.0.
They aren't just syntactic sugar. Take a look at an example:

Class Person is a plain old class:

public class Person
{
    public string FName { get; set; }
    public string Country { get; set; }
}



Before C# 3.0, in order to create a list of Person objects and populate it with five new instances with their properties initialized, I would have to do the following:

List<Person> people = new List<Person>();
Person p1 = new Person();
p1.FName = "Mary";
p1.Country = "United States";
people.Add(p1);
Person p2 = new Person();
p2.FName = "Raul";
p2.Country = "Argentina";
people.Add(p2);
Person p3 = new Person();
p3.FName = "Sergio";
p3.Country = "Brazil";
people.Add(p3);
Person p4 = new Person();
p4.FName = "Giuseppe";
p4.Country = "Italy";
people.Add(p4);
Person p5 = new Person();
p5.FName = "Jean";
p5.Country = "France";
people.Add(p5);


With C# 3.0 we can initialize a collection of Person like so:

var people = new List<Person> {
    new Person { FName = "Mary", Country = "United States" },
    new Person { FName = "Raul", Country = "Argentina" },
    new Person { FName = "Sergio", Country = "Brazil" },
    new Person { FName = "Giuseppe", Country = "Italy" },
    new Person { FName = "Jean", Country = "France" }
};



As you can see, using object (and collection) initializers results in much more compact and readable code. Notice that each "new Person { ... }", initializer included, is a single expression. Therefore, we can say that object initializers give us the "ability to initialize an object in an expression context".
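To make the "expression context" point concrete, here is a small sketch. The Describe helper and the InitializerDemo class are made up for illustration; Person is the class defined above, repeated so the snippet compiles on its own.

```csharp
using System;

public class Person
{
    public string FName { get; set; }
    public string Country { get; set; }
}

public static class InitializerDemo
{
    // Hypothetical helper, not part of the original example.
    public static string Describe(Person p)
    {
        return p.FName + " (" + p.Country + ")";
    }

    public static void Main()
    {
        // The initializer is an expression, so it can be passed directly
        // as a method argument, with no temporary variable.
        Console.WriteLine(Describe(new Person { FName = "Jean", Country = "France" }));

        // The pre-3.0 equivalent needs several statements and a named variable.
        Person tmp = new Person();
        tmp.FName = "Jean";
        tmp.Country = "France";
        Console.WriteLine(Describe(tmp));
    }
}
```

Both calls print the same text. The difference is that the first one fits anywhere an expression is allowed, which is what matters for LINQ.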

Now, let's see how this feature relates to LINQ. I am assuming you know what LINQ is and have at least seen LINQ queries. The select clause of a LINQ query is a "projection": it describes the shape of each result, which is often a brand new object created on the fly. When that object is an anonymous type, its type has no name you can write in code, and that's the reason the "var" keyword is so important in LINQ (see Anonymous Types).

You can project the entire object as is:

var FrenchPeople = from p in people
                   where p.Country == "France"
                   select p;


Or project a completely different object. In the case below, I am projecting a string (Person's first name) preceded by "Mr. ":

var FrenchPeople = from p in people
                   where p.Country == "France"
                   select "Mr. " + p.FName;

When you do these projections, all you're really doing is using an "expression that creates a new object out of existing objects".

The point I am trying to make is that, when a projection needs to build a new object with several properties (a named type or an anonymous type), it has to do so in a single expression, and that is exactly what object initializers give us. Without them, projections in LINQ the way we know them would be impossible. The whole LINQ feature would be a lot clunkier and the code would look a lot messier.
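As a sketch of the kind of projection this argument is about, the queries below build a new object for each match: first a named type, then an anonymous type. PersonSummary, its properties, and the ProjectionDemo class are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string FName { get; set; }
    public string Country { get; set; }
}

// Hypothetical target type for the first projection.
public class PersonSummary
{
    public string Greeting { get; set; }
    public int NameLength { get; set; }
}

public static class ProjectionDemo
{
    public static List<PersonSummary> Summarize(IEnumerable<Person> people, string country)
    {
        // Object initializer inside select: one expression builds each result.
        var query = from p in people
                    where p.Country == country
                    select new PersonSummary
                    {
                        Greeting = "Mr. " + p.FName,
                        NameLength = p.FName.Length
                    };
        return query.ToList();
    }

    public static void Main()
    {
        var people = new List<Person> {
            new Person { FName = "Jean", Country = "France" },
            new Person { FName = "Sergio", Country = "Brazil" }
        };

        foreach (var s in Summarize(people, "France"))
            Console.WriteLine(s.Greeting + " / " + s.NameLength);

        // Anonymous type: same initializer syntax, but the type has no name,
        // so the result can only be held in a "var".
        var french = from p in people
                     where p.Country == "France"
                     select new { p.FName, Greeting = "Mr. " + p.FName };

        foreach (var f in french)
            Console.WriteLine(f.Greeting);
    }
}
```

Try writing either select clause without an initializer: you would need a constructor overload for every combination of properties, or a helper method per query.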

Wednesday, March 5, 2008

Implicitly-typed variables in Resharper 4.0

By now, most of us already know that LINQ is everywhere, and therefore so are anonymous types. To make anonymous types usable with LINQ, implicitly-typed variables are required.

The "var" keyword tells the compiler to infer the type from what's on the right-hand side of the assignment. C# is a statically-typed language, and the "var" keyword doesn't change this.
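Here is a minimal sketch of what "inferred but still static" means. The class and variable names are only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class VarDemo
{
    public static void Main()
    {
        var count = 5;                     // inferred as int at compile time
        var names = new List<string>();    // inferred as List<string>

        names.Add("Mary");
        // count = "five";                 // compile-time error: a string is not an int

        // An anonymous type has no name you could write, so "var" is the only option.
        var person = new { FName = "Mary", Country = "United States" };

        Console.WriteLine(count.GetType() == typeof(int));
        Console.WriteLine(names.GetType() == typeof(List<string>));
        Console.WriteLine(person.FName);
    }
}
```

The commented-out assignment is the key line: once the compiler has inferred int for count, the variable behaves exactly as if it had been declared with an explicit int.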

When you compile the code into IL, you'll see the type explicitly used there. So what's the harm of using "var" all over the place?

Before jumping to the answer right away, I would like to refer you to Steve McConnell's Code Complete, where he reminds us that one should strive to write code that's easy to read. It sure is nice when code is both easy to write and easy to read (like the newly added "automatic properties" feature in C# 3.0). But readability always prevails.

So, to answer my question about the harm of using "var" everywhere, I would say that it can be really harmful to readability.

Sure, if a variable declaration is like the following, there's no problem in using "var":


var baby = new Person();


It is obvious that the variable baby is of type Person. But imagine some method, badly named GetNewOne, returning a new Person object.


var baby = GetNewOne();


How in the world would you know that baby is of type Person? I know that if you're using Visual Studio, you can simply hover the mouse over it. But I don't think I should have to rely on a tool to provide me with code readability.

In the example above I definitely want my code to be like this:


Person baby = GetNewOne(); //TODO: Refactor this method name please!!!

I've been playing with ReSharper 4.0 nightly builds on Visual Studio 2008 targeting .NET 3.5, and interestingly enough, every time you write something like "Person baby = GetNewOne()", ReSharper 4.0 puts a small green squiggly line (a hint) under Person, suggesting that I should use the "var" keyword instead.



ReSharper 4.0 is not out yet, and maybe (hopefully) this code suggestion will not be in the final version, but I thought it was interesting and wanted to share.