[Sticky] Update to problem 1 scoring
We have decided to update the scoring criteria for the first problem of the London Summer Challenge!
The underlying scoring metric will stay the same, but the scores you see will be binary (i.e. 0 or 10). Here is what each score means:
1. A score of 0 means you are losing significant information when trying to reduce the number of features below 5000 (perhaps try another approach).
2. A score of 10 means your feature set has at least some information useful in predicting the target variable.
Things to keep in mind:
1. A score of 10 doesn’t mean your method of picking features is really good. A score of 10 just means that your pick of features has at least some useful information. Use this as an indicator that this approach could yield decent results in problem 2.
2. Try not to pick a list of features that has a significantly lower count than 5000. Remember your primary goal in problem 1 is to keep as much information as possible, whilst getting the feature list down to 5000. Reducing the number of features significantly below this number without using some sort of transformation/feature engineering will likely mean you’re losing information.
3. The first problem is there to help you get comfortable working with the toolbox and data. Our suggestion is to try out a few approaches and see if you can score 10. Once you have completed the first problem, don’t spend too much time on it; move on to the second problem. The MSE scoring in problem 2 is a much stronger indicator of your model’s performance.
4. Keep in mind you might still need to improve the method of feature selection in problem 2 to be able to get a good MSE score.
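To make the "get the feature list down to 5000 while keeping information" idea concrete, here is a minimal sketch of one simple baseline: rank columns by absolute correlation with the target and keep the top k. This is only an illustration, not the method we use for scoring; the function name `top_k_by_abs_correlation` and the toy data are made up for the example, and in the challenge you would set k to 5000 and use the real data.

```python
import numpy as np

def top_k_by_abs_correlation(X, y, k):
    """Return indices of the k columns of X with the largest |Pearson correlation| with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; give them correlation 0 instead of NaN.
    corr = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.argsort(-np.abs(corr))[:k]

# Toy data (an assumption for illustration): 200 rows, 50 features,
# with the target driven by columns 7 and 21 plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = 3.0 * X[:, 7] - 2.0 * X[:, 21] + rng.normal(scale=0.1, size=200)

selected = top_k_by_abs_correlation(X, y, k=5)
print(sorted(selected.tolist()))
```

A univariate filter like this ignores correlation between features, so treat it as a starting point to compare other approaches against rather than a recommended solution.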
Why have we made this update?
We monitor every single competition to make sure that the problems and scoring are fair to all users, and that they help users produce the best answers. We felt that the current scoring did not live up to these standards. Specifically, we felt:
1. Users picking a significantly lower number of features than 5000 are at a disadvantage. For instance, a user picking just 10 features would have a much lower overlap with our list of useful columns, even if those 10 columns had an r^2 = 0.99.
2. We don’t want users to spend too much time on the first problem trying to improve their score. While we don’t want users to discard useful information, the goal of the competition is not to match our pick of features as closely as possible. The aim is ultimately to produce a model with the best MSE score.
3. The features themselves are correlated. A method which scores slightly better in picking the fraction of useful features isn’t necessarily the better method, because the same information could be present in the other features. Depending on the model used in the second part of the problem, users can compensate for this.
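To illustrate the first point above, here is a rough sketch of how an overlap-style score penalises small picks. It assumes the score is simply the fraction of a reference list of useful columns that appears in your pick; the actual metric is not spelled out in this post, and the list sizes below are hypothetical.

```python
def overlap_fraction(picked, useful):
    """Fraction of the reference 'useful' columns that appear in the pick."""
    useful = set(useful)
    return len(set(picked) & useful) / len(useful)

useful = range(1000)           # hypothetical reference list of 1000 useful columns
small_pick = range(10)         # 10 features, every one of them useful
large_pick = range(800, 5000)  # 4200 features, only 200 of them useful

print(overlap_fraction(small_pick, useful))  # 0.01
print(overlap_fraction(large_pick, useful))  # 0.2
```

Under this assumption the 10-feature pick can never score above 0.01, however predictive those 10 columns are, which is the disadvantage described above.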