Tuesday, October 30, 2007

Recapitulation of hunting for developer's job

Last months I was in process of job change, so I attended lots of interviews and now I would like to share my conclusion about that process.

Usually both job aspirants and hiring companies are prepared to examine candidate. But I'm quite surprised that sometimes both parties fail to get/provide useful information in the other direction about the company to the candidate. I understand that companies are brief when advertising job offer, because lot of information could be seen as very internal. Note that candidate is providing very personal information.

But interview should be a meeting with balanced opportunities, as not only company is choosing candidate, but candidate is choosing company as well. Job is a trade, trade works only if both parties like that. I attended few interviews where we spent lot of time speaking about me, but when I started to ask questions about company and team, people had hard time, because they were not prepared for such questions and we also had lack of time to talk about it.

I assembled list of questions which helped me to find (hopefully) right job. I was able to identify what's important for me and what's not. I also used that list to inquire potential employers about their company and work style during the interviews. Hope that the list will help both parties of our society, candidates and companies to speak about important questions.

Those topics below are useful to get birds view of the company. Whether it's healthy, stable and has business background. First of all candidate should try to gather information from the internet, hopefully such information could be found on company's web site.
  • Products, Projects & Customers
  • Business partners, target platforms
  • Structure of customers, market share
  • Business model, licenses, SLA
  • Long-term product vision a strategy

Process. This is how they do That, this has major impact on daily life in company, how they are structured, how rigid process is. Whether you will have opportunity to change something, whether they are agile or not. Stress level, chaos level, amount of boring tasks. You may like to see example of analysis and design documentation. Also product documentation is worth seeing.
  • Development methodology, project management style
  • Work item management. Time tracking
  • How they choose new features, new projects
  • Anaysis, Architecture, Design
  • QA process, Unit testing
  • Configuration management, Deployment
  • Release management

Technologies & tool-set. This is important because older technologies are, bigger effort is spent on problems with infrastructure things. Missing IDE, debuggers, old compilers, missing libraries and components, support for too old hardware count here. Do they use third party components or homegrown ? Do they have right tools for their job ? Can they easily get them ? Are they able to upgrade technology time to time ? You would like to see some example of source code, to see their coding standard, amount of comments and readability.
  • Technologies, IDE, Tools
  • Libraries, components
  • Databases, OR mappers, code generators
  • Source version control, branches, releases, merging, workspaces
  • Long-term technological vision

Working environment.
  • People, communication in team, general mood
  • Education, hard-skills and soft-skills courses
  • Long term vision for local office, head counts
  • Occasional working online from home ?
  • Time to learn new things during work
  • Work vs. life balance in the company
  • Offices, dress code
  • Company events, sport and culture events

I also learned that some recruiting agencies could heavily damage your CV you gave them, even without your knowledge. They do that to assure that the employer don't have direct contact with you. In my case it was removal not only of email address, but also of links to my blog and to my open-source projects. This hurt a lot because there are examples of my work and I could get some extra points there. Also it broke formatting of that document. Next time I will distribute PDF file.

Of course I'm interested in your opinions!

Tuesday, October 23, 2007

Ranking permutations of multiset - more details

I refactored and documented my algorithms, which I described before to be more readable. Now I will try to explain how it work.

So, let's have multi-set, set with repetitions. For example (0,0,1,2). Let's agree that list of permutations will be ordered in lexicographic order. Permutations are indexed from zero.
0012


My algorithms are able to find index of given permutation, this is called Rank(). And for given index generate the permutation, this is called Unrank(). Now I will explain Unrank(), which is easier to understand. Let's try to find permutation with index 7 from total 12 possible permutations.
0012-2
We will iterate thru all 4 positions from,left to right and identify members of multiset which should be there. We fill now follow the first iteration of outer loop in algo below. So for first position, you can already see that we should choose member type 1, but in the algo we don't know it yet. Let's look at first column, what possibilities we have.
0012-2
As you can see, there is imbalance between number of permutations which start with member type 0. This is caused by fact that member type 0 appears twice in input multiset. Or from other viewpoint the number of possibilities (permutations) for sub-multisets is bigger for (0,1,2) than for (0,0,1) or (0,0,2). Those sub-multisets are defined as original multiset without member already used (chosen).
0012-2

Back to row on index 7, so we will shrink index(in range of potential) to fit to range of length. Then we will scan for offset of group which has selector inside of it's range. For our example it will be group of type 1 starting at offset 6. Now after some expanding we will subtract offset from index (we will shift base), so that all possible indexes are in range of new potential which is computed on next line. This potential is potential of sub-multiset of remaining, not yet used members. The new index (for our example) for that sub-multiset is 1. Outer loop will process next column now.

Now it should be easy follow the algo. Note that number of permutation is called 'potential' here and that some helper structures were filled before call to that method.
function Unrank(permutationIndex)
{
    // initialize: length, maximal potential and counts could be prepared for all ranking operation on same multiset
    currentPotential = maxPotential;
    currentLength = length;
    currentCounts[] = preparedCounts[];

    for each position from left
    {
        // note that 
        //  - sum of count of all currenttly remaining types is length of multiset
        //  - current potential is sum of potentials of sub-multisets
        
        // compute selector, which is just rank shrink to be in range between zero and current length of multiset
        selector = ((permutationIndex * currentLength) / currentPotential);
    
        // initialize
        offset = 0; type = 0;
    
        // scanning for offset of sub-multiset in which range selector could be found
        while ((offset + currentCounts[type]) <= selector)
        {
            offset += currentCounts[type];
            type++;
        }
    
        // remove consumed offset
        permutationIndex -= (currentPotential * offset) / currentLength;
    
        // compute potential of sub-multiset
        currentPotential = (currentPotential * currentCounts[type]) / currentLength;
    
        // consume type
        currentCounts[type]--;
        
        // store chosen type
        result[position] = type;
    }
    return result;
}

Ranking is very similar to this, but just inverse.

Ideas

It would be much better if we could have faster algo which will have only single loop thru positions, even when order of permutations wouldn't be lexicographical.

In simple example like one above, it could seem that solution is easy. Just pick type of current member which is on index of selector (in buffer of remaining unused members). Then swap that member from buffer of remaining members. But this will introduce problem for multisets where length of set is aliqant (not divisor) of permutation potential. This work in my algo, because when all members of same type are together, decimal fragments of such division is always inside one group (type). But swapping of already used members with unused ones broke groups of members of same type. This swapping idea works for ranking of set, because all sub-sets have same potential.
0012

Above is example of problematic multiset. Note that on first position we are choosing 1/6 permutations, in my algo by scanning for offset of group. But for swapped buffer we should find constant time function which will give member type with correct distribution for all permutations of working buffer (because buffer state/permutation could be heritage of ranking wider multiset).

I know that algo could be probably made bit faster by swapping out empty slots from currentCounts[] array. But this will help just a bit and will broke lexicographical order, while it will complicate algo lot. I don't think it's worth of trying for mid-sized sets.

I'am interested to hear your ideas! And of course corrections if I missed something.

Clarification 13.4.2008:

Whole algorithm is not designed for permutating of small multisets, because such small ones could be easily precomputed and stored in memory for reuse. Let's talk about multiset of 2*A, 2*B, 2*C ... 2*J or something like that. This (2*10)! / ((2!)*10), a big space.
My algo could find one rank in O(10*2*(10/2)) = circa 100 operations. (some of them are quite big, but constant time)
Also notice that reasonable limit for size of space is 64bit or 128bit numbers, so that it could fit into CPU register and allow for constant time operations.
I'm looking for something faster in terms of computational complexity. Linear not quadratic.

Monday, October 8, 2007

GET single byte from big binary file using JavaScript and XMLHttpRequest

Full example and testing page

For FireFox, trick is to map bytes to user part of unicode.
(Idea by mgran)
xmlhttp.open("GET", fileName, false);
xmlhttp.setRequestHeader("Range", "bytes="+position+"-"+position);
if (navigator.userAgent.indexOf("MSIE") == -1)
{
    eval("xmlhttp.overrideMimeType('text/plain; charset=x-user-defined')");
}
xmlhttp.send(null)
return xmlhttp.responseText.charCodeAt(0) & 0xFF; 
For Explorer in VBScript, because it is possible to access responseBody in VB but is not possible to overrideMimeType in IE.
(Idea by mgran's reader)
Function readbyteIE(fileName, position)
Dim xhr
Set xhr = CreateObject("Microsoft.XMLHTTP")

xhr.Open "GET", fileName, False
xhr.setRequestHeader "Range", "bytes=" & position & "-" & position
xhr.send Null

readbyteIE=AscB(MidB(xhr.responseBody, 1, 1)) 
End Function
Credits to mgran and his readers

PS: I'm new in browser scripting. Please correct me if I missed something.

Wednesday, October 3, 2007

Playing with C++/CLI

Few things to note for comfort transition from C# to C++/CLI Pragmas
// use on base level, outside of namespace
#pragma unmanaged
#pragma managed

//Regions
#pragma region Helpers
#pragma endregion
Needed rotate, not just shift
//native
#define asm_ror(val, shift) __asm { ror val,shift }
#define asm_rol(val, shift) __asm { rol val,shift }

//managed
#define managed_ror(val, shift) val=((val>>shift)|(val<<(32-shift)))
#define managed_rol(val, shift) val=((val<<shift)|(val>>(32-shift)))
Nested namespaces
namespace Zamboch
{
 namespace Engine
 {
...
 }
}
Calling to native code and back to CLR
#pragma once

#include <vcclr.h>
using namespace System;
using namespace System::Runtime::InteropServices;


#pragma unmanaged

extern "C" { 
    typedef void* (__stdcall *myCALLBACK)(int smallIndex);
}

// static pointer to managed function
myCALLBACK fnGetDataPage;

void foo()
{
    //Calling back to CLR
    void* dtpage=fnGetDataPage(1);
}

#pragma managed

ref class Class1
{
public:
    // testing entrypoint
    static void Main()
    {
        //Call to native
        foo();
    }

    //callback implementation
    static void* GetDataPage(int smallIndex)
    {
        void* dtpage = NULL;
        return dtpage;
    }

private:
#pragma region Delegate initialization

    delegate void* myCALLBACKDelegate(int smallIndex);

    static GCHandle gch;

    //static constructor, initialize delegate here
    static Class1()
    {
        myCALLBACKDelegate^ fp=gcnew myCALLBACKDelegate(GetDataPage);
        gch = GCHandle::Alloc(fp);
        fnGetDataPage = static_cast<myCALLBACK>(Marshal::GetFunctionPointerForDelegate(fp).ToPointer());
    }

#pragma endregion

};

Tuesday, October 2, 2007

Ranking and unranking permutations of multiset

Few years ago I started to play with my toy Cube21 programaticaly. I will probably write another article about that later. For that toy needed to get index of permutation of multiset (set with repetitions). I searched internet and was unable to find any algorithm which could do that. Best match I have found those days was Myrvold&Ruskey's Ranking of permutation. Very nice algo, but I needed multiset, so I developed one.

So, now I'm playing with toy again and decided to share my ranking algorithm with community.

This is C# class with two methods which are inverse. First will give you index of permutation, second permutation on index. You should initialize instance of class for particular multiset. Order of permutations is lexicographic. Also note, that this implementation operates on int data type, so it should be rewritten for bigger permutations.

Myrvold&Ruskey were able to create linear time algo (with number of set members). I tried hard, but for multiset I was unable to reach better time than circa O(n*(t/2)), where n is number of members and t is number of types of members (or members without repetitions).

I'm interested to learn more about topic, so please share your thoughts. Or if you have faster algo, please, let me know.

There is also zip file with whole VS solution.

Update More detailed explanation.