12 June 2009

The best match problem

The best match problem is a very common problem that a programmer regularly encounters when working on audit/feedback programs. So I was quite happy to see the fourth problem on TopCoder I went through was one concerning best matches.

The problem however in this kind of exercise is that the quality of code is not the only important part: speed is important too, and thus sometimes you just simply tweak a bit around to gain some precious time. But it's interesting to see how you code in this kind of situation and this is a nice example of that. Let's see what it's all about.

A new online match making company needs some software to help find the "perfect couples".  People who sign up answer a series of multiple-choice questions. Then, when a member makes a "Get Best Mates" request, the software returns a list of users whose gender matches the requested gender and whose answers to the questions were equal to or greater than a similarity factor when compared to the user's answers.

So here is an example of what you can get and what you should return:

members =
{"BETTY F M A A C C",
"TOM M F A D C A",
"SUE F M D D D D",
"ELLEN F M A A C A",
"JOE M F A A C A",
"ED M F A D D A",
"SALLY F M C D A B",
"MARGE F M A A C C"}

currentUser="BETTY"

sf=2

The first element of each member is their name, followed by their gender ("M" or "F") and followed by the gender they are interested in ("M" or "F" again). Finally comes the answers of the member to a maximum of 10 questions ("A", ..., "D").

The currentUser is the user asking for its best matches among all the members, and the sf parameter is explained below.

The method returns a String[] consisting of members' names who have at least sf identical answers to currentUser and are of the requested gender.  The names should be returned in order from most identical answers to least.  If two members have the same number of identical answers as the currentUser, the names should be returned in the same relative order they were inputted.

So how would you code this? In Java, the first thought is to create a class named Member containing the name, the gender, the requested gender and the answers of each member so that it becomes handy to filter and order them out.

 private class Member {
  public String name;
  
  public String gender;
  
  public String requestedGender;
  
  public String[] answers;
  
  public int matchingScore;
  
  protected Member(String str) {
   String[] split = str.split(" ");
   this.name = split[0];
   this.gender = split[1];
   this.requestedGender = split[2];
   
   if (str.length() > 3) {
    answers = new String[split.length-3];
    for (int i=3; i<split.length; i++) {
     answers[i-3] = split[i];
    }
   } else {
    answers = new String[0];
   }
  }
 }

This class automatically fields its fields with the given string that is one given by the members parameter. Now, why would not the class sort itself out so you would only have to add the member to a Set and return it? Hum... the idea is nice but two problems arises:

  • How can you know the place of a member compared to another one if you do not have their matching score with the currentUser?
  • In the case two members have the same matching score, how do you know which one comes first (= which one comes first in the given members list)?

The answer I chose to this problem was to add two fields to the Member class: id and matchingScore. The tweak is then to calculate these values for each member before adding them to the returned Set.

The Member class thus becomes:

 private class Member implements Comparable<Member> {
  public int id;
  
  public String name;
  
  public String gender;
  
  public String requestedGender;
  
  public String[] answers;
  
  public int matchingScore;
  
  protected Member(int id, String str) {
   this.id = id;
   
   String[] split = str.split(" ");
   this.name = split[0];
   this.gender = split[1];
   this.requestedGender = split[2];
   
    if (str.length() > 3) {
    answers = new String[split.length-3];
    for (int i=3; i<split.length; i++) {
     answers[i-3] = split[i];
    }
   } else {
    answers = new String[0];
   }
  }
  
  public int compareTo(Member m) {
   if (this.matchingScore > m.matchingScore) {
    return -1;
   } else if (this.matchingScore < m.matchingScore) {
    return 1;
   } else {
    return new Integer(this.id).compareTo(new Integer(m.id));
   }
  }

And finally, the main method that processes the data (I will just copy/paste the whole class):

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class MatchMaker {
 public String[] getBestMatches(String[] members, String currentUser, int sf) {
  List<Member> list = new ArrayList<Member>();
  for (int i=0; i<members.length; i++) {
   list.add(new Member(i, members[i]));
  }
  
  Member currentMember = null;
  for (Member m : list) {
   if (m.name.equals(currentUser)) {
    currentMember = m;
    break;
   }
  }
  
  Set<Member> bestMatches = new TreeSet<Member>();
  for (Member m : list) {
   if (!m.name.equals(currentUser)) {
    if (m.gender.equals(currentMember.requestedGender)) {
     int identicalAnswers = 0;
     for (int i=0; i<currentMember.answers.length && i<m.answers.length; i++) {
      if (currentMember.answers[i].equals(m.answers[i])) {
       identicalAnswers++;
      }
     }
     
     if (identicalAnswers >= sf) {
      m.matchingScore = identicalAnswers;
      bestMatches.add(m);
     }
    }
   }
  }
  
  Object[] objectMatches = bestMatches.toArray();
  String[] toArray = new String[objectMatches.length];
  for (int i=0; i<objectMatches.length; i++) {
   toArray[i] = ((Member) objectMatches[i]).name;
  }
  
  return toArray;
 }
 
 private class Member implements Comparable<Member> {
  public int id;
  
  public String name;
  
  public String gender;
  
  public String requestedGender;
  
  public String[] answers;
  
  public int matchingScore;
  
  protected Member(int id, String str) {
   this.id = id;
   
   String[] split = str.split(" ");
   this.name = split[0];
   this.gender = split[1];
   this.requestedGender = split[2];
   
   if (str.length() > 3) {
    answers = new String[split.length-3];
    for (int i=3; i<split.length; i++) {
     answers[i-3] = split[i];
    }
   } else {
    answers = new String[0];
   }
  }
  
  public int compareTo(Member m) {
   if (this.matchingScore > m.matchingScore) {
    return -1;
   } else if (this.matchingScore < m.matchingScore) {
    return 1;
   } else {
    return new Integer(this.id).compareTo(new Integer(m.id));
   }
  }
 }
}

The steps of getBestMatches() are:

  • create one Member object per member and associate an unique and increasing id for each,
  • identify the object corresponding to the currentUser,
  • for every member, if the member is a potential match (= is not the currentUser and is of the requested gender): count the number of matching answers and if the score is > sf, associate it to the member and add it to the Set to return,
  • return the Set.

There is a better way to do this. Writing an ID and the score in each Member isn't the nice way to do it, since each Member created this way would only be valid for a particular currentUser. However, I think this is one of the best ways to solve this problem in Java. We could have even returned the Set instead of an array so that the calling method would also get the scores of each member for the given currentUser.

No comments: