Zamboch: C#

Showing posts with label C#. Show all posts

Wednesday, March 7, 2012

Unit of work using lambdas

Are you tired of writing into log file on every begin and end of the method ? Use AOP. Well, not yet ready for post-compiler magic or dynamic runtime proxies ? Let's try lambdas (again).

static void Main(string[] args)
{
    Scope<LoggingFrame>.Wrap(() =>
    {
        Console.WriteLine("Hello world");
    });
}

Would produce this console

2012-03-07 00:46:58,248 DEBUG - BEGIN Main
Hello world
2012-03-07 00:46:58,258 DEBUG - END   Main

How to implement that ?

[DebuggerStepThrough]
public class Scope<TAdvice> where TAdvice : IAdvice, new()
{
    public static void Wrap(Action body)
    {
        IAdvice advice = Activator.CreateInstance<TAdvice>();
        advice.OnEntry(body);
        try
        {
            body();
            advice.OnLeave(body);
        }
        catch (Exception ex)
        {
            advice.OnException(body, ex);
            throw;
        }
        finally
        {
            advice.OnFinally(body);
        }
    }
}

public interface IAdvice
{
    void OnEntry(Delegate body);
    void OnLeave(Delegate body);
    void OnException(Delegate body, Exception exception);
    void OnFinally(Delegate body);
}

Implementation of the LoggingFrame advice is trivial.

Note that you could get similar behavior with IDisposable and using keyword, but you would not be able to log pending exception.
You could also think about TransactionScope, which would call tx.Complete() automatically when no exception is thrown.
Further improvement is to use dependency injection to instantiate the advises.

Another use-case is to implement Unit of Work or session/call context, while using TSL to reach topmost frame. I used it for NHibernate Session and EF DbContext (unit of work) recently. Interesting related article here.

private void Main(string[] args)
{
    Scope<UnitOfWork>.Wrap(session =>
    {
        var people = session.Person
            .Where(person => person.FirstName == "Pavel")
            .ToList();

        NestedLogic(people);

        //DbContext.SaveChanges() will be called here
    });

}

private void NestedLogic(IList<Person> people)
{
    //this will lookup parent session in TSL and reuse it
    Scope<UnitOfWork>.Wrap(session =>
    {
        foreach (var person in people)
        {
            if (person.LastName == "Savara")
            {
                person.Coder = true;
            }
            else
            {
                session.Person.Remove(person);
            }
        }
    });
}

Composition of scopes could be beautified.

Scope<LoggingFrame, TransactionFrame>.Wrap(() =>
{
    throw new Exception("rollback please");
});


[DebuggerStepThrough]
public class Scope<TOuterAdvice, TInnerAdvice> 
  where TOuterAdvice : IAdvice, new() 
  where TInnerAdvice : IAdvice, new()
{
    public static void Wrap(Action body)
    {
        Scope<TOuterAdvice>.Wrap(()=> Scope<TInnerAdvice>.Wrap(body));
    }
}

All code in the article is simplified and real implementation is exercise for readers.

Enjoy :-)

Saturday, December 31, 2011

Fluentator - generate fluent API for your structures

When working with nested structures like configuration or XML it is bit of pain with syntax in C#. Consider this code below. The object initializes in C# 3.0 helped a lot, but it's still pretty far from ideal. Important point here is readability, which is achieved thru nesting the initializers.

var model = new Model();
var pavel = new Employee("Pavel");
model.Companies.Add(new Company("Boldbrick & co.")
{
    Departments = new List<Department>
    {
        new Department("Software & Visions", "swv")
        {
            Teams = new List<Team>
            {
                new Team("Visions")
                {
                    Employees = new List<Employee>
                    {
                        // I was forced to move 
                        // pavel variable declaration completely out of scope
                        pavel,
                        new Employee("Ondra"),
                    },
                    IsAwesome = true,
                },
                new Team("Developers")
                {
                    Employees = new List<Employee>
                    (
                        // I can't do any statements or declarations here
                        // to prepare my data in-place
                        devNames.Select(n=>new Employee(n))
                    )
                    {
                        // note I can't add pavel first
                        pavel,
                    }
                }
            }
        }
    }
});

But there are downsides with approach above. You can't easily add same instance into 2 nodes. It forces you to declare pavel variable completely out of scope. And you can't use statements to prepare your data in place either. Note how Pavel is inserted after other employees into Developers team. The LINQ Select() helped great deal here, but it's not always possible to use it. With more complex model and bigger tree to build, this will become unmanageable mess.

So, extension methods and lambdas to rescue. Do you like code below better ? I certainly do. I can use statements and variable declarations in inner scope. I get more dense and readable code.

var model = new Model();
model.AddCompany("Boldbrick & co.", bb =>
{
    bb.AddDepartment("Software & Visions", "swv", swv =>
    {
        // variable is still bit out of scope, but not in the root scope
        var pavel = new Employee("Pavel");
        swv.AddTeam("Visions", visions =>
        {
            visions.AddEmployee(pavel);
            visions.AddEmployee("Ondra");
            visions.IsAwesome = true;
        });
        swv.AddTeam("Developers", devs => 
        {
            devs.AddEmployee(pavel);
            // I can add more employees after Pavel
            devs.AddEmployees(devNames.Select(n => new Employee(n)), dev=>
            {
                dev.Age = 33;
            });
            // and also can use any complex statement in-place
            for (int i = 0; i < devNames.Count; i++)
            {
                int ix=i;
                devs.AddEmployee(devNames[i], dev =>
                {
                    dev.Age = ix;
                });
            }
        });
    });
);});

How the extension method looks like ?

Below is extension method over external structure Department, which accepts same parameters as Team constructor. Inside is instance creations and adding to the collection. Finally to allow the nesting of scopes, we pass the new instance to Action<> delegate.

static public Team AddTeam(this Department self, string name, Action<Team> result = null)
{
    var item = new Team(name);
    self.Teams.Add(item);
    if (result != null) result(item);
    return item;
}

Generate the extensions

The extension method above is nice and useful but it's quite boring to write it for each combination of container and child item. Multiplied by all constructor signatures. So I decided to create ReflectionFluentator which can generate the code for you. It reads your model via reflection and generates the C# extensions. See sample how to use the generator.

And the same thing for XML ? Sure. You provide the XSD to XsdFluentator. See sample how to use the generator.

var doc=new XDocument();
doc.AddLibrary("Prague",prague =>
{
    prague.AddBook("Saturnin", book =>
    {
        book.AddAuthor("Zdenek Jirotka");
    });
    prague.AddBook("Bylo Nas 5", book =>
    {
        book.AddAuthor("Karel Polacek");
    });
});

This code can generate this XML as expected.

<library id="Prague"
  xmlns="http://polyglottos.googlecode.com/svn/trunk/demomodel/library.xsd">
  <book name="Saturnin">
    <author name="Zdenek Jirotka" />
  </book>
  <book name="Bylo Nas 5">
    <author name="Karel Polacek" />
  </book>
</library>

Note that both generators are more prototypes than ready to ship. They don't handle any edge scenarios when reading the metadata or writing the code. Fluentator is part of Polyglottos project. If you like the idea and wish to contribute improvements, please talk back.

At this point some of you may wonder what else could be made fluent this way. In my case, I realized that I need to generate the code and the code is just nested structure. So I created CodeDom code generator Polyglottos with fluent API. That's for another article next year. Enjoy the party tonight!

Saturday, March 26, 2011

Raising Property Changed fast and safe - yet another solution

This Thursday we discussed with my colleagues how to raise property changed event. We learned that it could be done
a) via string literal,
b) via lambda expression tree,
c) via StackTrace or MethodBase.GetCurrentMethod().
Nice summary is here. or here

We did some naive performance test and learned that for 500k iterations on simple property, there is big difference in speed.
a) is really fast, 13ms
b) is slow, 1417ms, I guess because security checks apply for every call and every stack frame, and can't be fired from other properties.
c) is slow, 2518ms, very slow, because expression tree is constructed for every call

So we played with it for a while and I invented solution with anonymous type (at botom).

Update 27.3.2011: I was so focused on speed that I didn't realized that the solution yesterday didn't work nicely for rename. What a shame. I hope people give me another chance today. Here we have postponed creation of the expression tree, it takes about 75ms. And this time my ReSharper renames it correctly :-)
Update 5.12.2011: As Alex noted in comments, there was thread safety issue with Dictionary. The code is now fixed with Hashtable instead.

public abstract class ModelBase : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;
    private static readonly Hashtable Names = new Hashtable();

    protected void RaisePropertyChange(Func<string> key)
    {
        if (PropertyChanged != null)
        {
            var propertyName = (PropertyChangedEventArgs)Names[key];
            if (propertyName==null)
            {
                propertyName = new PropertyChangedEventArgs(key());
                lock (Names)
                {
                    Names[key] = propertyName;
                }
            }
            PropertyChanged(this, propertyName);
        }
    }

    protected static string Reg<T>(Expression<Func<T>> property)
    {
        return (property.Body as MemberExpression).Member.Name;
    }
}

public class Model : ModelBase
{
    private string firstName2;
    public string FirstName2
    {
        get { return firstName2; }
        set
        {
            firstName2 = value;
            RaisePropertyChange(() => Reg(() => FirstName2));
        }
    }
}

Doesn't work: The lambda just helps to define the type without creating the instance and calling the getter. It takes about 77ms for half million calls.

public abstract class ModelBase : INotifyPropertyChanged
    {
        public event PropertyChangedEventHandler PropertyChanged;
        private static readonly Dictionary<Type, PropertyChangedEventArgs> Names = 
          new Dictionary<Type, PropertyChangedEventArgs>();

        protected void RaisePropertyChange<T>(Func<T> f)
        {
            if (PropertyChanged!=null)
            {
                var key = typeof(T);
                PropertyChangedEventArgs evntArgs;
                if (!Names.TryGetValue(key, out evntArgs))
                {
                    var propertyName = key.GetProperties()[0].Name;
                    evntArgs = new PropertyChangedEventArgs(propertyName);
                    lock(Names)
                    {
                        Names[key] = evntArgs;
                    }
                }
                PropertyChanged(this, evntArgs);
            }
        }
    }

    public class Model : ModelBase
    {
        private string firstName;
        public string FirstName
        {
            get { return firstName; }
            set {
                if (firstName != value)
                {
                    firstName = value;
                    RaisePropertyChange(() => new {FirstName});
                }
            }
        }
    }

Tuesday, November 3, 2009

How calling from Java to .NET works in jni4net

Tonight I would like to explain how calling from Java to .NET works with jni4net. The other way around is described in previous article, it would be good if you read that first if you didn't yet.

So, in JNI there is method which allow you to register native implementation of Java method. Such method had "native" keyword in it's signature. I believe that it's actually how low level methods in Java runtime are implemented.

We use proxygen tool, which is part of jni4net, to reflect any public members of .NET type and then to generate Java proxies. Those proxy methods are marked native. The generator must do few translations to follow Java naming rules. Namespaces are always lower-cased to become packages. Properties are converted to bean-like get and set methods.
jni4net JVMProxy

For exceptions the situation is similar. system.Exception proxy is inherited from Java java.lang.RuntimeException to be throw-able. Therefore it isn't inherited from system.Object proxy. But there is system.IObject interface on both of them for your convenience.

The proxy implements Java finalizer, so when JVM GC cleans up the proxy instance, the real CLR instance is released as well. The reference to CLR is implemented as GCHandle converted to long and kept on JVM side inside the proxy.

system.Object proxy as well overrides toString() method and forwards it to CLR. It seems that in version 0.4 I forgot to override hashCode() and equals(), sorry about that ;-)

The implementation is is asymmetric to CLR->JVM proxies, because for both it's CLR which does the translation work. The wrapper have signature which is expected by JNI RegisterNatives. I apply Marshal.GetFunctionPointerForDelegate to wrapper methods to get endpoint callable by JVM. Side note is that JNI RegisterNatives does not make any difference between static and instance methods. They are identified by name and signature only, strange limitation.

Last trick I mention is about initial CLR bootstrap. During initial design I realized that there is way how to export native method from managed DLL. So I currently use that trick to create DLL with signature friendly to JVM. It seems to me now, that I will need to change my approach soon in order to support multiple versions of CLR and Mono on Linux. Probably I would need to use C++/COM. For now, we have 32bit and 64bit CLR 2.0 on Windows.

Next time maybe about interfaces and type casting.

Saturday, October 31, 2009

How calling from .NET to Java works in jni4net

In this article I would like to explain how calling from .NET to Java works with jni4net.

First of all you should know that Java Virtual Machine (JVM) is exposing Java Native Interface (JNI). It is native/binary interface which allows other programs to control JVM: load classes, create instances and run methods. It as well allows you to control object lifetime by holding handle to instance. Usual way how to consume JNI interface is to use %JDK%\include\jni.h in your C++ program and then load jvm.DLL. I wanted solution for .NET, so I use [DllImport] attribute and I translated the header file from C to C#. I also converted all pointers and handles to IntPtr. So now we have JNI interface accessible to any .NET code.

Now it's quite easy to use JNI methods to call Java objects, but it's far from convenient. So the next step was to use Java reflection and get signatures of core objects and generate proxies. The proxies look like Java classes, they have same name, namespace, same methods with same signatures. But it don't have the implementation in .NET but rather they call the real implementation in JVM using JNI. The reflection and proxy code generator is reusable idea, so I created 'proxygen', it's tool which is part of jni4net package. You could use it to wrap your own classes.
jni4net JVMProxy

Now you should know few tricks. JNI calls are done on handle of the method, which could be retrieved by method name and signature. See javap. To make it faster, I pre-bind the proxy-class to real-JVM-class during start-up.

To make the proxy good citizen in .NET world, java.lang.Object overrides Equals(), GetHashCode(), ToString() .NET methods and forward the calls to appropriate Java equivalent. You should be aware that there could be multiple proxies for same Java instance. java.lang.String proxy implements implicit conversion to CLR String.

Because we need garbage collection work, proxy implements finalizer. When CLR garbage collector finds lonely proxy, we release JNI handle of the real Java instance, so it could be collected on JVM side as well. Warning, it is possible to create cycle between objects across both heaps, which would prevent GC from collection. I have no good solution for that now, just be careful.

Another feature to mention is exceptions. To make Java exceptions useful in CLR, they should be inherited from System.Exception. Simply to be able to throw and catch them in .NET. But it means that they could now be inherited from java.lang.Object proxy. This is compromise, but it's worth of it, because now you could catch exceptions thrown by Java method in .NET code because jni4net does transparent translation for you. You will receive proxy of exception. To overcome the problem with common base class, I introduced java_.lang.IObject interface with all expected methods.

Last note would be about call marshalling. It's done with PInvoke on CLR side and with JNI on JVM side. The primitive types are very well compatible. Unsigned types from CLR are transmitted binary. Strings are translated as Unicode, it's bit slow. Any other non-primitive parameter is Java object. Which means you need to pass another proxy/Java object as parameter (for brevity we ignore interfaces now). When returning from the call, we could have return value. It is the way how to send Java instances back to .NET. jni4net/proxy wraps the JNI handle for you. It finds best-fit proxy-class and returns its instance. The proxy contains JNI handle to the real Java instance.

Next time about calling back from JVM to .NET and proxies of CLR objects in Java.

Saturday, October 10, 2009

jni4net - bridge JVM and CLR

I'm proud to present first public release of jni4net - Object oriented, fast, intraprocess bridge between JVM and CLR.

Hello World!

From C# to Java

using net.sf.jni4net;
public class Program
{
    private static void Main()
    {
        Bridge.CreateJVM(new BridgeSetup());
        java.lang.System.@out.println("Greetings from C# to Java world!");
    }
}

This is very basic demonstration of the principle, full version of this sample and another 3 sample applications could be found in binary distribution. The others are same Hello World but from Java to .NET, usage of Apache FOP from C# for xsl:fo and last is creation of WinForms dialog from Java.

Please download here and talk back.

Next time I will write about why I created it, how it works, what are the features and what are the next steps. Watch this space.

Tuesday, June 16, 2009

How to get non inherited interfaces

Is documented fact that GetInterfaces() method returns all interfaces implemented by given type including inherited ones. I wanted to get only interfaces defined by the class, and was searching for the solution. No luck. But it's really simple.
Just check if base class is implementing the interface, if yes filter it out.

Type type = typeof(Type);
Type[] nonInheritedInterfaces = type.GetInterfaces().Where((ifc)
     => !ifc.IsAssignableFrom(type.BaseType))
  .ToArray();

*Yes, if you are still on 2.0 you could rewrite it to loop.

Friday, February 22, 2008

Lisp lectures IIa - Higher-order functions

I'm done with lesson IIa. That one was pretty hard. In Lisp it would be probably quite easy, but not in C#.
First thing, you need to switch from delegates to expression trees. There is no major problem to wrap delegates from my last article using Expression<>. Only problem is funtion folding. Consider this:

Expression<Func<double, double, double>> average = (y, x) => (x + y) / 2;
Expression<Func<double, double, double>> newGuess = 
  (guess, x) => average(guess, x / guess);

It will not work, compiler will say: "'average' is a 'variable' but is used like a 'method'". I believe that C# is missing some sugar here, does anyone know why ? And there is quick and ugly solution to the problem:

Expression<Func<double, double, double>> average = (y, x) => (x + y) / 2;
Expression<Func<double, double, double>> newGuess =
  (guess, x) =>  average.Compile().Invoke(guess, x / guess);

I implemented smart trick for expression folding. Let's have static helper class ExComp and this folding method Fold() with this signature:

public static TResult Fold<T1, TResult>(Expression<Func<T1, TResult>> expr, T1 param1)

Of course it could be implemented with Compile().Invoke(), but instead I'm post-processing the expression tree to get rid of method call and folding inner tree in place using InvocationExpression. So whole usage looks like this:

Expression<Func<double, double>> abs = (a) => a < 0 ? -a : a;
Expression<Func<double, double, double>> distance = 
    (a, b) => ExComp.Fold(abs, a - b);
double dist = distance.Process().Compile().Invoke(9,1);

And the tree after processing looks like this:

(a, b) => Invoke(a => IIF((a < 0), -a, a),(a - b))

Next we will hit problem with recursion, which works perfectly for delegates. This is factorial as simpler example of it :

Func<int, int> fact = null;
fact = (y) => y > 1 ? (y) * fact(y - 1) : 1;

Now serious troubles starting with expression trees. Emphasis on TREES. So, there is no good way how to turn expression tree into oriented graph of expressions. Only thing you could do is to use some external variable which has pointer back to root of the tree. For this I'm using similar trick as with folding above, fake method Self() and post-processing.

Back to the Lisp lecture. Now there is introduced idea of higher-order function. The function which creates and returns function, or could compose given function and return composition. Now when we have friendly expression folding, it's a piece of cake.

Expression<Func<double, double>> abs = (a) => a < 0 ? -a : a;
//folding function
Expression<Func<
    Expression<Func<double, double>>, //absFn
    Expression<Func<double, double, double>> //resultFn
    >> foldFn = ( absFn ) => ( a, b ) => ExComp.Fold(absFn, a - b);
//we are passing abs to get abs(a-b)
Expression<Func<double, double, double>> 
  distance = foldFn.Process().Compile().Invoke(abs);
double dist = distance.Compile().Invoke(9, 25);

So now we have foundation for completing square root function using reusable functions.

Expression<Func<double, double, double>> average = (a, b) => (a + b) / 2;
Expression<Func<double, double>> abs = (a) => a < 0 ? -a : a;
Expression<Func<double, double, double>> distance = (a, b) => ExComp.Fold(abs, a - b);

Expression<Func<
    Expression<Func<double, double, double>>, //stepFn
    Expression<Func<double, double, double>> //resultFn
>> avgDamp = (stepFn) =>
    (old, data) =>
        ExComp.Fold(average, old, ExComp.Fold(stepFn, old, data));

Expression<Func<
    Expression<Func<double, double>>, //testFn
    Expression<Func<double, double, double, bool>> //resultFn
>> test = (testFn) =>
    (guess, data, tolerance) =>
        ExComp.Fold(distance, ExComp.Fold(testFn, guess), data) < tolerance;

Expression<Func<double, double, double, double>> sqrtFixedPoint = null;

Expression<Func<
    Expression<Func<double, double, double, bool>>, //goodEnough
    Expression<Func<double, double, double>>, //newGuess
    Expression<Func<double, double, double, double>> //resultFn
>> fixedPoint =
   (goodEnough, newGuess) =>
        (oldGuess, data, tolerance) =>
                (ExComp.Fold(goodEnough, oldGuess, data, tolerance)) ? oldGuess
                    : ExComp.Self(sqrtFixedPoint,
                          ExComp.Fold(newGuess, oldGuess, data), data, tolerance);


Expression<Func<double, double>> square = a => a * a;
Expression<Func<double, double, double>> step = (old, data) => data / old;

sqrtFixedPoint =
    fixedPoint.Process().Compile().Invoke(
        test.Process().Compile().Invoke(square),
        avgDamp.Process().Compile().Invoke(step)
    );


Expression<Func<double, double>> squareRoot = 
  (x) => ExComp.Fold(sqrtFixedPoint, 1, x, 0.000000000000001);

double root = squareRoot.Invoke(25);

Code as well as post-processing helper class could be downloaded here

Thursday, February 21, 2008

Lisp lectures

In the last months I'm thinking about dynamic languages and trying to understand what's the good part of it. Driven by this curiosity I've came to Lisp lectures by Abelson and Sussman. Yesterday I watched first lecture which is introduction to prefix notation, expressions and functional concept.
As a exercise they model and implement function for computation of square root. My attention was attracted by functional concept, so I tried to implement it. But I was too lazy to get and install any Lisp. Instead of it I implemented it in C# 3.0 using lambda expressions.
I'm curious how far I could get with this approach and whether it will be possible to convert most of the concepts of Lisp and do them in C#. I guess that I will hit the barrier as soon as more dynamic features of Lisp will be shown in next lectures. I will see.

Func<double, double, double> average = (y, x) => (x + y) / 2;
Func<double, double> square = x => x * x;
Func<double, double> abs = (x) => x < 0 ? -x : x;
Func<double, double, double> newGuess = 
  (guess, x) => average(guess, x / guess);
Func<double, double, double, bool> goodEnough = 
  (guess, x, tolerance) => (abs(square(guess) - x) < tolerance);
Func<double, double, double, double> tryGuess = null;
tryGuess = ( guess, x, tolerance ) => 
  goodEnough(guess, x, tolerance)
    ? guess
    : tryGuess(newGuess(guess, x), x, tolerance);
Func<double, double> squareRoot = ( x ) => tryGuess(1, x, 0.001);

double root = squareRoot(2);

Tuesday, November 13, 2007

Cube21 Silver edition - database of all solutions

Cube21 or Square One is puzzle similar to Rubik's Cube. I'm proud owner of one from Silver edition (which have only two colors) since year 2000. Regular Cube21 has 6 colors. This cube is different from Rubik's mainly by changing it's shape. There are 90 distinct shapes. Good analysis could be found here so I don't need to repeat that now, more Cube21 links below.

Cube21

Because I was always more interested in programming than solving the cube by hands, I started to solve the cube by program. Soon I realized that deep search with some heuristics is not that funny. Same time my friend Berka provoked me to create something more challenging, the database of all shortest solutions.

Now after years I returned to that idea to exercise in new technologies. I implemented analytic part of the work in C#, brute force part in C++/CLI. User interface is done in 3D WPF. Because you will probably not want generate whole database at home, I connected the clients to database on my machine using WCF web service. Finally some of you may not have windows platform at all, but you would like to use the database. So for you I created simple ASP+AJAX form. Now some video of the client GUI solving the cube.

Model

So let's describe the model: cube has 16 pieces (8 small + 8 big). Those small are only of 4 types as my cube has only 2 colors. Big ones are unique. On middle layer, there are 2 halves, connected with axis. When you turn around the axis you must finish 180 degrees, so it has only two states. *We don't need to care about middle layer.* <- Update at the bottom.

Cube21

Now let's assign pieces some (hexadecimal) numbers. It's anticlockwise, first on top then on bottom half. Numbers are assigned on solved cube, so that 01234567-89ABCDEF is sequence of solved cube. Detailed model and guide to your puzzled cube here.

Note that small pieces are there twice so we have permutation of multiset Small 004488CC and big 13579BDF, this will give us 2.2*10^13 permutations. Let's also say that we could consider same all cube configurations which could be transformed to each other without turning left and right half of the cube. So rotation of top, bottom and flipping whole cube is still the same cube. Last note that we will mark solved cube with level 1, there is only cube on that level. When performed any operations with top, bottom and finally make turn of right part, we reached level 2. We could found all cubes on level 2 by doing all possible operations on cube on level 1. Level 3 will be result of all operations on all cubes on level 2, which are not on level 1 or 2. Similarly for other levels.

Notation of moves is (top, bottom), '/' is turn right half, '!' is flip whole cube. For both top and bottom rotation, positive is anticlockwise.

-- Here was Ajax frame with webservice to database. Is gone now.

Algorithmization

This is the best part, rest of the work was just cheery on the cake. My first idea years ago was very naive: do operations on cube model, then search whether cube is in the list of already found cubes and if not add it into the list. This doesn't work long, because growing list of cubes will be slower and slower. So, sort it and binary search ? Not in memory, because count of cubes is cca 6.8*10^9, that could not fit into 32bit memory space. Ok, we could B-tree on disk, maybe even with some existing database engine.

But there is better solution. For each cube configuration = permutation we could assign index of permutation. On such index we will store just level of the cube. But 2.2*10^13 permutations will give us 20 terabytes of data, not good yet. We know that there is 90 shapes, each shape has exact order of small and big pieces. For example CeCeCeCe/CeCeCeCe is shape of solved cube where 'C' is big and 'e' is small piece. So now we have 90 pieces * 2520 permutations of small pieces * 40320 permutations of big pieces. That is 9144576000 permutation and bytes. I guessed that there is no more than 15 levels (actually only 12), so we could use half byte to store level of any cube. This is 4.2 GB, good enough for my PC!

Now we could for each level go through whole database, and when we found some half-byte which is set to current level, we will do all possible operations on the cube which is described by address of such half-byte. Results will be converted back to address of new cube and written to the database, if new cube is not on same or previous level already. Before saving we should normalize the cube to one of 90 shapes, by rotating top, bottom or flipping whole cube.

Well we could save some work in analysis, we could skip all operations on the shape which will lead to same result for example (0,0)/! is same as (6,6)/ for all shapes. Also we could minimalize those shapes which stay in same shape after rotation, for example (3,6) on solved cube will not change shape, just colors. So we could only write that rotated cube, which has minimal index of color permutation. This will eliminate identical cubes from processing in next level.

Now we need some clever paging because is not possible get it all in memory at once with 2GB address space. There helps that transitions are always between pair of shapes, so we could use file of one shape as source and other as target. Memory mapped file is cool solution for shape files (50MB each). Also helps to lazy flush that files. And lastly try to order processing of page pairs in that order, that just few loading and flushing is performed for each file, because disk is slooow. This could be done because shapes don't produce cubes in all other shapes, just neighbor. Also helps that file is created whole at once to prevent fragmentation on disk. Windows Vista paging seems do better work than XP when working with mapped files. To keep busy processors and disk all the time, we could run more threads than we have processor cores, each thread one pair of shapes. When some thread is waiting for disk, other thread is heating the core.

Now we could optimize the operations on the cube. Here comes the C++/Cli. We generate optimized code branch for expanding cubes, specific for source and target shape. We could also optimize computation of index of permutation, because indexes of 2520 and 40320 permutations could be done by hash table and by lookup table on way back to permutation. And we could also do whole turn (rotation of top bottom, turn and rotation of top and bottom as normalization to the shape) just by using bitwise AND, OR and rotation on processor registers. Such operation could be done separately on permutation of small pieces and big pieces 32 bits each. Also prevent CLR to native and back transitions, so code for whole shape file scan should be native code.

Mike Masonjones did probably something similar in year 2005, maybe he had little bit harder problem, as he was solving cube with 6 colors. He had some old machine which was dedicated to the task and was running a year! My algorithm could do it in cca 8 hours. I have Dual core 2.4GHz + 4GB RAM, but I'm still happy with 1000 times speedup.

I had great time playing with that problem and learned lot. Interesting ? Found bug ? Any feedback ? Enjoy !

Numbers about silver edition

Level 1	1
Level 2	32
Level 3	536
Level 4	7652
Level 5	102204
Level 6	1296753
Level 7	15250126
Level 8	153335000
Level 9	1138303312
Level 10	4067633460
Level 11	1434709972
Level 12	164952

Total 6810804000 configurations.

Download

Source code under LGPL on SourceForge subversion.
GUI client + database creation engine binaries, .NET 3.0 required. Analysis report
Database of shapes in XML

Cube21 Links

Heymbeeck Joël
Fred Curtis
Jaap Scherphuis

Update: This is probably wrong. My idea was that the correction could be made by merging /(0,6)/(0,6)/(0,6) to last three turns when necessary. I currently don't have time to think about it deeply, but it seems to me now, that it wouldn't work for all shapes. I probably it could mean that my database is not 100% correct and that some paths will be one turn longer. It could also force me to store another bit of information in permutation, so disk space required could be twice bigger. I will welcome anyone who will do the fix :-)

Tuesday, October 2, 2007

Ranking and unranking permutations of multiset

Few years ago I started to play with my toy Cube21 programaticaly. I will probably write another article about that later. For that toy needed to get index of permutation of multiset (set with repetitions). I searched internet and was unable to find any algorithm which could do that. Best match I have found those days was Myrvold&Ruskey's Ranking of permutation. Very nice algo, but I needed multiset, so I developed one.

So, now I'm playing with toy again and decided to share my ranking algorithm with community.

This is C# class with two methods which are inverse. First will give you index of permutation, second permutation on index. You should initialize instance of class for particular multiset. Order of permutations is lexicographic. Also note, that this implementation operates on int data type, so it should be rewritten for bigger permutations.

Myrvold&Ruskey were able to create linear time algo (with number of set members). I tried hard, but for multiset I was unable to reach better time than circa O(n*(t/2)), where n is number of members and t is number of types of members (or members without repetitions).

I'm interested to learn more about topic, so please share your thoughts. Or if you have faster algo, please, let me know.

There is also zip file with whole VS solution.

Update More detailed explanation.

Zamboch