Culture-aware business objects

In the past I have had to deal with globalization many times. Strangely I haven't been able to decide on a truly satisfying design pattern that works in all scenarios. In this case, I'm not referring to static labels and stuff like that, which you'd probably want to push into resource files, but rather dynamic text elements.

To clarify my intent, assume this simple scenario. We have a Product object, which has the following definition:

class Product
{
	public Guid Id { get; set; }
	public string Name { get; set; }
	public string Description { get; set; }
	public double Price { get; set; }
}

Extremely simple, right? Ok, so now suppose that we have some sort of administrative interface where a website's administrator can manage products (CRUD). All is still well ...


But then suppose that the company that owns the website decides to go global. They want to sell products worldwide and thus need to accommodate localized descriptions of their products. The issue we are then faced with, making the string values culture-aware, can be tackled in various ways. To name a few:

  • We could leave our object model as is, and just modify the datastore to persist the same Product object for each culture we want to support;
  • We could break our Product object into two separate objects (culture-aware and culture-unaware) and append a list of the aware objects to the unaware (base) object. Obviously, we should modify our data layer accordingly;
  • We could wrap the culture-specific fields in a newly defined type.
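
To make the second option a bit more concrete, here is a minimal sketch of what such a split could look like. The names ProductText, Texts and TextFor are hypothetical and only serve as illustration; they are not taken from any of the projects mentioned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Culture-aware part: one instance per supported culture (hypothetical type).
public class ProductText
{
    public string CultureName { get; set; }   // e.g. "nl-NL"
    public string Name { get; set; }
    public string Description { get; set; }
}

// Culture-unaware (base) part, carrying the list of culture-aware parts.
public class Product
{
    public Product()
    {
        Texts = new List<ProductText>();
    }

    public Guid Id { get; set; }
    public double Price { get; set; }
    public List<ProductText> Texts { get; set; }

    // Hypothetical helper: returns the texts for a culture, or null if there are none.
    public ProductText TextFor(string cultureName)
    {
        return Texts.FirstOrDefault(t =>
            string.Equals(t.CultureName, cultureName, StringComparison.OrdinalIgnoreCase));
    }
}
```

Every read of a name or description now has to go through the list first, which is the kind of extra step that adds up across an object model.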

In the past, I've had most success with the second option and was forced to use option 1 in some scenarios where I had to work with some legacy code. The reason I don't like option 1 is that it makes for odd and de-normalized persistence. Also, I don't really like option 2, as it causes complex(er) object models. You get a lot of 'noise'. Moreover, it tends to result in hard(er) to read code. So, during my last globalization endeavor, I decided to explore option 3. 

In my case, specifically, only the string values were to become culture aware and persistence was done using XML Serialization techniques (no database). I ended up creating the following simple wrapper class:
 
public class CultureNameIndexedString : SerializableDictionary<string, string>
{
    /// <summary>
    /// Sets the value of this string for the specified culture.
    /// </summary>
    /// <param name="culture">Culture of the value</param>
    /// <param name="value">Value to set</param>
    public void Set(CultureInfo culture, string value)
    {
        if (culture == null) throw new ArgumentNullException("culture");

        Set(culture.Name, value);
    }

    /// <summary>
    /// Sets the value of this string for the specified culture name.
    /// An empty or null value removes the entry for that culture.
    /// </summary>
    /// <param name="name">Name of the culture for the value</param>
    /// <param name="value">Value to set</param>
    public void Set(string name, string value)
    {
        if (string.IsNullOrEmpty(name)) throw new ArgumentNullException("name");

        // Keys are always stored lower-cased, so normalize before every lookup.
        var key = name.ToLowerInvariant();

        if (!string.IsNullOrEmpty(value))
        {
            // The indexer adds the key if it is missing and overwrites it otherwise.
            this[key] = value;
        }
        else
        {
            // Remove using the normalized key, not the raw name.
            Remove(key);
        }
    }

    /// <summary>
    /// Tries to get the value of this string for the current culture, returning
    /// the string in the default culture if this fails.
    /// </summary>
    /// <returns>Value</returns>
    public string GetCurrentOrDefault()
    {
        var value = GetSpecific(CultureHelper.CurrentCulture);

        if (string.IsNullOrEmpty(value))
        {
            value = GetSpecific(CultureHelper.DefaultCulture);
        }

        return value;
    }

    /// <summary>
    /// Tries to get the value of this string for the specified culture, returning
    /// the string in the default culture if this fails.
    /// </summary>
    /// <param name="culture">Culture of the value</param>
    /// <returns>Value</returns>
    public string GetSpecificOrDefault(CultureInfo culture)
    {
        var value = GetSpecific(culture);

        if (string.IsNullOrEmpty(value))
        {
            value = GetSpecific(CultureHelper.DefaultCulture);
        }

        return value;
    }

    /// <summary>
    /// Tries to get the value of this string for the specified culture name, returning
    /// the string in the default culture if this fails.
    /// </summary>
    /// <param name="name">Culture (name) of the value</param>
    /// <returns>Value</returns>
    public string GetSpecificOrDefault(string name)
    {
        var value = GetSpecific(name);

        if (string.IsNullOrEmpty(value))
        {
            value = GetSpecific(CultureHelper.DefaultCulture);
        }

        return value;
    }

    /// <summary>
    /// Tries to get the value of this string for the specified culture.
    /// </summary>
    /// <param name="culture">Culture of the value</param>
    /// <returns>Value or null</returns>
    public string GetSpecific(CultureInfo culture)
    {
        if (culture == null) throw new ArgumentNullException("culture");

        return GetSpecific(culture.Name);
    }

    /// <summary>
    /// Tries to get the value of this string for the specified culture name.
    /// </summary>
    /// <param name="name">Culture (name) of the value</param>
    /// <returns>Value or null</returns>
    public string GetSpecific(string name)
    {
        if (string.IsNullOrEmpty(name)) throw new ArgumentNullException("name");

        string value;

        return TryGetValue(name.ToLowerInvariant(), out value) ? value : null;
    }

    #region Overrides

    public override string ToString()
    {
        return GetCurrentOrDefault();
    }

    #endregion
}

Note1: The CultureHelper class just wraps configuration issues, and is irrelevant for this post.
Note2: I used an ActionFilter - yes, it was an MVC project - to set the current culture for each request.

As persistence of objects was achieved using XML serialization, I could not use the standard Dictionary<TKey, TValue>: out of the box, the XmlSerializer refuses to serialize types that implement IDictionary. As I strongly believe in reusing what's there, I googled around for a while and found the code here.
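
The linked code is not reproduced in this post. For readers who cannot follow the link, a common way to build such a dictionary is to derive from Dictionary<TKey, TValue> and implement IXmlSerializable by hand. The sketch below follows that widely used pattern; the element names ("dictionary", "item", "key", "value") are my own assumptions and may differ from the code I actually used.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

[XmlRoot("dictionary")]
public class SerializableDictionary<TKey, TValue> : Dictionary<TKey, TValue>, IXmlSerializable
{
    // Returning null is the documented convention for IXmlSerializable.
    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        var keySerializer = new XmlSerializer(typeof(TKey));
        var valueSerializer = new XmlSerializer(typeof(TValue));

        bool wasEmpty = reader.IsEmptyElement;
        reader.Read();                       // step past the wrapper start tag
        if (wasEmpty) return;

        while (reader.NodeType != XmlNodeType.EndElement)
        {
            reader.ReadStartElement("item");

            reader.ReadStartElement("key");
            var key = (TKey)keySerializer.Deserialize(reader);
            reader.ReadEndElement();

            reader.ReadStartElement("value");
            var value = (TValue)valueSerializer.Deserialize(reader);
            reader.ReadEndElement();

            Add(key, value);

            reader.ReadEndElement();         // </item>
            reader.MoveToContent();          // skip whitespace up to the next node
        }

        reader.ReadEndElement();             // wrapper end tag
    }

    public void WriteXml(XmlWriter writer)
    {
        var keySerializer = new XmlSerializer(typeof(TKey));
        var valueSerializer = new XmlSerializer(typeof(TValue));

        foreach (var pair in this)
        {
            writer.WriteStartElement("item");

            writer.WriteStartElement("key");
            keySerializer.Serialize(writer, pair.Key);
            writer.WriteEndElement();

            writer.WriteStartElement("value");
            valueSerializer.Serialize(writer, pair.Value);
            writer.WriteEndElement();

            writer.WriteEndElement();
        }
    }
}
```

Because the type takes over its own reading and writing, the XmlSerializer no longer objects to the IDictionary implementation it inherits.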

Then I modified my (fictive) Product class like so:
class Product
{
	public Product()
	{
		Name = new CultureNameIndexedString();
		Description = new CultureNameIndexedString();
	}

	public Guid Id { get; set; }
	public CultureNameIndexedString Name { get; set; }
	public CultureNameIndexedString Description { get; set; }
	public double Price { get; set; }
}
Note: Because of the need for XML Serialization, I used concrete type definitions rather than interfaces.

Obviously, I wanted to limit the impact of my changes - especially in the UI - so I created a simple override for the ToString() method. That way, read-only access to the strings remained unchanged. In some cases, typically in the CRUD screens, I needed to get the value for a specific culture. Setting values, sadly, has become quite different, like so:
var product = new Product { Id = id, Price = price };

product.Name.Set("nl-NL", "Schoen");
product.Name.Set("en-US", "Shoe");
product.Name.Set("it-IT", "Scarpa");
I must say I'm quite content with the way this worked out. Integrating this into a fairly simple object model took me only a few hours. Thanks to .NET's brilliant XML Serialization capabilities, I didn't even have to change the data access code! Obviously I broke compatibility with previously serialized data, but you can't have it all! ;-)

For other projects, what I'm concerned about is how to handle immutable objects. That would be slightly more complex to implement, as I would not be able to simply expose a dictionary. Another thing we could do is extend the above functionality to work with other values as well, perhaps even images ... (it happens that images also need to change based on the culture)
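
One possible direction for the immutable case, purely as a sketch: keep the dictionary private and let every change return a new instance. The type name ImmutableLocalizedString and the With method are hypothetical, and I have not used this in a real project. Note that a type like this would not work with the XmlSerializer as is, since it has no public parameterless constructor.

```csharp
using System;
using System.Collections.Generic;

public sealed class ImmutableLocalizedString
{
    private readonly Dictionary<string, string> _values;

    // Starting point for building up values.
    public static readonly ImmutableLocalizedString Empty =
        new ImmutableLocalizedString(new Dictionary<string, string>());

    private ImmutableLocalizedString(Dictionary<string, string> values)
    {
        _values = values;
    }

    // Returns a NEW instance with the value set; this instance is left untouched.
    // An empty or null value removes the entry for that culture in the copy.
    public ImmutableLocalizedString With(string cultureName, string value)
    {
        if (string.IsNullOrEmpty(cultureName)) throw new ArgumentNullException("cultureName");

        var copy = new Dictionary<string, string>(_values);
        var key = cultureName.ToLowerInvariant();

        if (string.IsNullOrEmpty(value))
        {
            copy.Remove(key);
        }
        else
        {
            copy[key] = value;
        }

        return new ImmutableLocalizedString(copy);
    }

    // Returns the value for the culture name, or null when there is none.
    public string GetSpecific(string cultureName)
    {
        if (string.IsNullOrEmpty(cultureName)) throw new ArgumentNullException("cultureName");

        string value;
        return _values.TryGetValue(cultureName.ToLowerInvariant(), out value) ? value : null;
    }
}
```

Usage would then chain, for example ImmutableLocalizedString.Empty.With("nl-NL", "Schoen").With("en-US", "Shoe"), at the cost of copying the dictionary on every change.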

What do you think? Are there better alternatives? What do you use?

Assessing code, challenge one

As an architect who actually models software, you tend to think that the way you, as the epicenter of the software's design, interpret the flood of information that is up for grabs is the only reality. Keeping an open mind, continuously feeding it with new information and challenging your own ideas can be a tough job on its own. That said, in my honest opinion, those loud people (we have all met them at some point) who learned something in the past and stick to it with all their might should not be working on the foundations of our software! Innovation is the key to improvement, self-innovation a logical consequence.

State the obvious

Let's be clear about one thing: Software is a product produced by 'normal' people. It is not some magical entity that at some point came to be out of thin air. It is the result of some intense thought-work that started a couple of decades ago and which still goes on! Regardless of the existence and usage of code generation, until someone invents an autonomous system (there are many movies about it, but it's still a mirage) writing and especially architecting software is going to be a creative process, done by clever people.

Motivation

Recently I have been granted the chance to look into the source code of two loosely related software products, produced by two completely separate development teams. I was asked to review and assess (one of my next blog posts will be about that process) what was there, as both products had to be moved over to a new (third) development team.

 

It was fascinating to see how both teams took a completely different approach and followed completely different architectural patterns to implement similar or even identical features. For example, studying their differing views on what belonged in which architectural tier, and how these tiers were then further segmented into components, revealed a lot more differences than you'd expect.

 

During my most recent assessment projects, but also when working in and with dispersed development teams in the past, I've re-learned that (and this is a Dutch expression) more than one road leads to Rome. Moreover, the various routes you might follow are not necessarily all good or all bad. Basically, it takes common sense and a willingness to learn from others, and the internet gives us that chance. Combining the knowledge you can gather with what you as the architect see fit and appropriate results in a unique application design that can be anywhere on a scale from 0 (bad) to 10 (good). It all comes down to your judgment, skills, insight and past experience.

The first challenge

When you assess other people's code (usually at the request of yet another party) and are asked to make judgment calls by non-technical or less technical people, they usually expect a good/bad verdict. The challenge is to explain to the client that what they ask for does not exist!

 

In the past I've had a hard time explaining the dynamics of software development and design, and stressing that the outcomes of assessments and reviews are subjective. Sure, there are things you can say about implementation flaws, error-prone code and (already less objective) good practices. But when it comes down to arbitration and making recommendations regarding things like layering, segmentation, applied design patterns and separation of concerns, it's basically my word against that of the initial architect.

 

In my opinion, it is crucial that all parties involved are informed about the assessment being done. Also, it is often refreshing to have all parties read the final report afterwards and to merge their responses into the ultimate assessment report. The biggest pitfall is to 'work for the management' and see all your hard work either go down the drain or be used as a whip to punish development teams and architects.

So, what can be done?

My attempted solution to the problem is to manage expectations well from the start. It needs to be absolutely clear that a second opinion is not to be seen as a final judgment. I do this by saying and writing things like (and this only works when stated with enough confidence and repeated more than once):

 

“I'm very honored that you trust my skills and experience enough to judge what is already there. Actually what that means is that you would have trusted me to architect the software from the beginning, if I had been in the picture at that point in time.
I will do some fact-finding based on your own and/or commonly used guidelines and best practices. If needed, I could do some analysis of performance and stability using a profiler.
Keep in mind that the recommendations section of the report will be no more than the name states. It will be a summary of how I would have done things if I had been the initial architect, and it will specify some possible actions to be taken which will improve what is there.”

 

I believe that this approach works, because it makes things explicit. Also, it sketches a clear outline of what the client may expect as an end result. Nothing is worse than a client that expects way more than you can or intend to deliver. Dutch people are known to be direct, and to Italians that sometimes comes across as rude. In this case, being explicit is the politest thing to do, because it will prevent needless and frustrating discussions in the end.

Unique Index on URLs in SQL

Recently, while working on Findsi, I created a database design that included a table that basically listed all URLs in the system. What seemed like a simple table turned out to be pretty hard to create. I specifically required my table to hold a unique list of URLs, but did not want to use the URL itself as the primary key. So I created something like this:

  • Id : guid : newid(); < PK
  • Url : varchar(max); < DESIRED UNIQUE INDEX 

Obviously setting the primary key to the Id column was not the problem here. Creating a unique index on the Url column caused an error stating that an index key may be at most 900 bytes, which meant either storing only URLs shorter than 900 bytes - not an option, as I do not control the input of the system - or finding another solution.

After some serious Googling I concluded two things. First, I was definitely not the only one having this issue, and secondly, people were not really willing to comment on the subject. All proposed solutions consisted of hashing the URL in some way and setting the constraint on a column containing the resulting hash. At first I believed this would solve my problem, but my understanding at the time was that hash algorithms typically use only the first n bytes of their input, so that two URLs longer than 128 characters (actually 126 I believe, but I'm not sure) sharing the same first 128 characters would end up with the same 128-bit MD5 hash. (That is not how MD5 works: it digests the entire input, whatever its length. Collisions between different URLs remain theoretically possible with any fixed-size hash, but not for this reason.)

Eventually I got some unexpected help from a university professor who specializes in Information Retrieval. He assured me I was not the only one facing this issue and suggested I create two hashes (in code) for each URL: one for the first n bytes, the other for the last n bytes. The risk of collisions using the combined hash result, so he predicted, was extremely low (though not impossible).

  • StartHash + EndHash: char(256); < PK
  • Url : varchar(max);  
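
A minimal sketch of how such a combined key could be computed in code, assuming MD5 for both halves and UTF-8 encoding of the URL. The class and method names (UrlKey, Compute) and the choice of n are hypothetical; this is an illustration, not the code running in Findsi.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class UrlKey
{
    // Hypothetical helper: MD5 of the first n bytes followed by MD5 of the last n bytes,
    // both hex-encoded, giving 64 hexadecimal characters in total.
    public static string Compute(string url, int n)
    {
        if (url == null) throw new ArgumentNullException("url");
        if (n <= 0) throw new ArgumentOutOfRangeException("n");

        byte[] bytes = Encoding.UTF8.GetBytes(url);
        int count = Math.Min(n, bytes.Length);   // short URLs: both halves cover everything

        using (var md5 = MD5.Create())
        {
            byte[] start = md5.ComputeHash(bytes, 0, count);
            byte[] end = md5.ComputeHash(bytes, bytes.Length - count, count);

            return ToHex(start) + ToHex(end);
        }
    }

    private static string ToHex(byte[] hash)
    {
        var sb = new StringBuilder(hash.Length * 2);

        foreach (byte b in hash)
        {
            sb.Append(b.ToString("x2"));
        }

        return sb.ToString();
    }
}
```

One thing to keep in mind with this scheme: two URLs longer than 2n bytes that differ only somewhere in the middle get the same key by construction, so n should be chosen generously, or the hash taken over the full URL.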

So far, it has worked like a charm for me!

Google cruel? Me stupid? Both?

Ever since I put my website and this blog online I have been trying to figure out why my website does not get any ranking of significance within Google's search results. The blog did get indexed fine according to Google Alerts. True, I was perhaps too fast throwing the website online while I was still working on it, but what's going on? I need people to find my website in order to find projects!

First try

At first I thought that the problems were caused by the fact that I initially denied all indexing by means of a robots.txt file. I did that so I could work on the content, show it to people and modify it until I was satisfied. I assumed Google had visited my website and had cached the indexing rules, and was therefore not reindexing my finished content. Consequently I tried removing the site from Google's index and resubmitting it. That was about a month ago, but no luck ...

Then I started looking in the webmaster tools, but the only thing that taught me was that Google assumed the site to have no real relevance and that indexing had occurred more than once without errors. So WTF is wrong with my website and why doesn't Google index it?

Then it hit me

I have loads of duplicate content! How dumb! As the website was a work in progress and I ambitiously wanted to have the site online in 3 languages (Dutch, English, Italian), I started off with only English content. I thought that way I would target the bigger audience. The framework I built my website on (built by myself) sports a CMS that is language-aware. Culture is passed through the URL of a resource. The sore spot seems to be that, if content is not available in the requested culture, the content is returned in the default culture. Hence ... duplicate content!

The plan

I'll stop postponing the chore of translating my website into Dutch and do it now. Then I'll remove the Italian selection for a while. Wait and see.

I'll inform you of the results in about a month, fingers crossed!

 

Update (20090202)

I finally figured out the cause. It was NOT Google, but me ... obviously. In Google's terminology it was 'a human error'. ;) The CMS I developed and used for my website has page-level as well as site-level settings regarding caching and indexing. Even though I set indexing to true for all relevant pages, I didn't at the site level. Consequently: true && false = false. Thus: no indexing. I fixed this issue yesterday and I'm quite sure my site will soon appear in Google's indexes.
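
In code, the rule that bit me boils down to something like the sketch below. The names are hypothetical, and emitting the result as a robots meta tag value is an assumption made for the example, not a description of how my CMS actually exposes the setting.

```csharp
public static class IndexingRule
{
    // A page is indexable only when BOTH the site level and the page level allow it.
    public static bool IsIndexable(bool siteAllowsIndexing, bool pageAllowsIndexing)
    {
        return siteAllowsIndexing && pageAllowsIndexing;
    }

    // Hypothetical helper: the content value for a robots meta tag.
    public static string RobotsMetaContent(bool siteAllowsIndexing, bool pageAllowsIndexing)
    {
        return IsIndexable(siteAllowsIndexing, pageAllowsIndexing)
            ? "index, follow"
            : "noindex, nofollow";
    }
}
```

With the site-level flag left at false, every page ends up as "noindex, nofollow", no matter what the page-level setting says.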

.NET Development on a Mac

I have always been attracted to Apple's designs. With the previous models of the MacBook Pro, Apple has proven to me that it has some first-class minimalistic designers at its disposal. About a year ago I could finally shove my HP NC6220 (with great 1440x900 display) aside and bought myself a 17” MacBook Pro.

At first my plan was to use the Mac with Windows Vista 64-bit installed on it. But when I first started up the machine I quickly changed my mind. Not that OSX is that much more beautiful than Windows; it was just refreshingly different and fast. I then changed my plans and decided to use BootCamp, having both an OSX and a Vista partition.

I don't know how you go about using a new gadget, but I become childishly enthusiastic and impatient. Who needs a manual if you can fiddle about? As I planned to repartition the disk anyway I began playing with the OS until I thought it would suit me best.

Hooked

It wasn't until I started to use my e-mail and address book that I really got hooked. Combined with the Google Apps setup and the Plaxo synchronization tool I was suddenly up-to-date with all my resources. Within a few months Nova Media came out with an iSync plugin for my dual SIM Samsung SGH-D880 and I have never, ever given any thought to synchronizing since.

As my Mac came with Leopard pre-installed, it also came with Time Machine. Even though I resent the so-called monkey-proof 3D interface, it is a means of backing up that never disturbs you while you work. Along with my Mac I purchased an external 1TB LaCie drive, and the Time Machine partition of 320GB is still 50% empty. On various occasions I have managed to rescue/restore files deleted in one of my feared and famous hard-drive clean-up sessions.

You might also be amazed about the quality of the free and relatively cheap software available for OSX. Adium for example, surpasses anything I have used in the past for my IM needs. OmniFocus and OmniPlan and OfficeTime are rock-solid and well thought through tools I use on a daily basis.

Integration with my Canon DSLR works without any configuration or installation, unlike on Windows. The iLife suite offers a great video editing tool, and it was installed out-of-the-box.

Have you ever worked on a Windows machine that was installed more than a year ago? I am doing just that on OSX. It is still just as fast and stable as when I booted it up the first time. I'm not 100% sure whether this is because of the OS, or because working with virtualization allows me to better separate environments and tools, but I think it's worth mentioning. Also, I tend to just hibernate my Mac instead of shutting it down every time. This immensely increases productivity: open it up and you're good to go.

So? .NET?

Ok, so I soon decided to use OSX as my base OS, but I still had the need for Windows, the .NET Framework, Visual Studio and IIS, my trusted domain. After reading various articles on both Parallels and VMWare Fusion I decided on the latter and haven't had any regrets.

The good thing about using VMWare Fusion as opposed to a BootCamp setup is that it allows me to run many different OSes when I need them. I can now develop within a Windows Vista/XP VM and have a Linux VM running in the background with MySQL or whatever exotic platform or application. In building projects that integrate into a bigger enterprise network consisting of more than one platform, this has proven very effective. I've come to learn that using Windows ports of Linux software in particular usually comes with quirks, and thus with hard-to-debug deployment issues.

To this day, I haven't come across any situation in which my Windows VMs suffered from being virtualized. In terms of performance, Vista Business 64-bit in a VM as my development environment is much faster than XP was on my old HP.

As many of the companies I've worked for also use VMWare for server virtualization, I have also had customers supplying me with a full-blown virtualized mirror of their production environment. This saved me from spending hours on location testing in a real-life environment.

In terms of testing, virtualization has of course already proven itself. I can now easily test sites on Linux, Windows and Mac.

Limitations

Running Windows directly would probably give slightly better results, but then I'd lose OSX and all the good that comes with it. Also, running more than 2 VMs on top of OSX is really pushing the limits. As I have 4 GB of memory and (obviously) just 1 physical hard drive, the VMs start to influence each other's performance negatively.

My MacBook Pro sports a 17” 1920x1200 display. This is great when developing, as the keyboard is of regular size and the high screen resolution is very handy in Visual Studio and SQL Enterprise Manager. The downside is that traveling with a laptop this big is plain impractical. Not only does the leg-room on Transavia flights seem to shrink by the month, but the size and weight also make traveling with only hand luggage hard. Are they squeezing in more rows of seats, or is it just my imagination? My debating skills have improved though, after convincing many, many, many flight attendants to allow me to bring my uber-gadget and my suitcase aboard. I'm not sure what I'd have left behind if they hadn't given in … ;)

To me, the new Apple laptops have lost their edge aesthetically (who the hell came up with the black keyboard?), but I've read that the internals only got better. I'm not sure I'd accept the glossy screen if I were buying right now. I see some pretty and powerful machines coming from Sony and Asus that have more appeal to them.

The MacBook Pro is no doubt a hot machine. But it is also literally hot! The aluminum of the casing reaches immensely high temperatures. I guess the pretty design didn't mix well with the practical cooling needs. Working with it on your lap is not always fun, so I bought myself an iRain stand and otherwise tend to use it on a desk.

Concluding

I'm extremely content with my setup and hope to use it for another couple of years before upgrading. If the above wasn't clear enough, I'll conclude with this concrete statement: after a year of intensive work on my Mac, I can recommend using a MacBook Pro for .NET development to anyone who wants the 'best of both worlds'.