Darts Classification Problem

Will Charles
4 min read · Mar 26, 2021

I’ve been the captain of a darts team for a few years now, and one of the things I spend a lot of time thinking about is my team’s roster. A team is composed of between four and eight players, and a roster assigns a player to each of the nine sets in a match. It would be easy enough to put a roster together with knowledge of the opposing team’s roster, but that information is not available at the time a roster is constructed. My approach was to abstract the individual player away entirely and look only at performance information for the team and the set, regardless of who played the set. The target for this problem is set wins.

DartConnect is an app that allows for the scoring of different dart games while tracking and recording individual and team performance. I’m pretty familiar with the app and the website, so it seemed like a good source for the data I needed. Scraping the site took both BeautifulSoup and Selenium: Selenium to collect the URLs of specific matches, and BeautifulSoup to parse each match page. Both of these tools are extremely useful. In the end, I scraped over 150,000 individual performances from my league, which was everything available on DartConnect.
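To give a flavor of that two-stage scrape, here is a minimal sketch. The league page URL, the link filter, and the table layout are illustrative placeholders, not DartConnect’s actual markup.

```python
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

LEAGUE_PAGE = "https://example.com/league"  # placeholder for the league recap page

# Selenium renders the JavaScript-heavy league page and collects match links.
driver = webdriver.Chrome()
driver.get(LEAGUE_PAGE)
match_urls = {a.get_attribute("href")
              for a in driver.find_elements(By.TAG_NAME, "a")
              if a.get_attribute("href") and "match" in a.get_attribute("href")}
driver.quit()

# BeautifulSoup then parses each static match page into rows of cells.
rows = []
for url in match_urls:
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    for tr in soup.select("table tr"):  # hypothetical match-table layout
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            rows.append(cells)
```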

The idea was to use a few features directly relevant to the outcome of a set. The first is a measurement of accuracy called the three-dart average: the number of points (or marks) a player scores in a game, divided by the number of darts thrown, times three. In other words, it captures how well a player does in an average three-dart turn. The second feature that I found to be of importance is sequence. Going first is a huge advantage, so players who have the opportunity to go first should be expected to be more likely to win. The last feature I wanted to focus on was previous wins. Since every set is composed of 2 or 3 legs, I used the leg win percentage over the previous 5 games as a feature as well. This could capture important information about performance, since the ability to close games and get wins is a critical skill that may not closely follow the other features.
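As a concrete example, the raw per-game features might be computed like this, assuming a hypothetical dataframe with one row per player per game; the column names here are illustrative, not DartConnect’s.

```python
import pandas as pd

df = pd.read_csv("performances.csv")  # placeholder: one row per player per game

# Three-dart average: points (or marks) per dart, scaled to a three-dart turn.
df["three_dart_avg"] = df["points"] / df["darts"] * 3

# Sequence: 1 if the player threw first in the game, else 0.
df["went_first"] = df["went_first"].astype(int)

# Leg wins: fraction of legs won out of legs played.
df["leg_win_pct"] = df["legs_won"] / df["legs_played"]
```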

I decided to use five-game moving averages for all of the previously mentioned features — three-dart average, game start percentage, and leg win percentage — and started to build some models. Given all of the probabilistic calculations, logistic regression and a Bayesian model seemed like a good place to start, and I also wanted to see the results of a KNN classifier. Ultimately, all of these performed reasonably well. When setting a threshold, I wanted the precision of the model to be above .87, which I felt would do a good enough job of excluding predicted wins that were actual losses (false positives). The reasoning for this particular level was that if 5 set wins were predicted for a match (which is composed of 9 sets), there was roughly a 50% chance (.87⁵ ≈ .50, treating the predictions as independent) that all 5 of those sets would be wins, and this seemed worth it even if many actual wins went unpredicted (false negatives) or all of the remaining sets were losses. Logistic regression and Gaussian naïve Bayes were ensembled in my final model with an average vote, and I was able to hit the desired precision benchmark.
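A minimal sketch of that pipeline, continuing the dataframe above and assuming hypothetical `set_win` and `date` columns; the split and exact column names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

feats = ["three_dart_avg", "went_first", "leg_win_pct"]

# Five-game moving averages per player, shifted so each window covers
# only prior games.
rolled = (df.sort_values("date")
            .groupby("player")[feats]
            .transform(lambda s: s.shift().rolling(5).mean()))
ma_cols = [f + "_ma5" for f in feats]
data = df.assign(**{f + "_ma5": rolled[f] for f in feats}).dropna(subset=ma_cols)

X, y = data[ma_cols], data["set_win"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
nb = GaussianNB().fit(X_tr, y_tr)

# Average vote: the mean of the two models' predicted win probabilities.
proba = (lr.predict_proba(X_te)[:, 1] + nb.predict_proba(X_te)[:, 1]) / 2

# Lowest decision threshold whose precision clears the .87 target
# on the held-out data.
precision, _, thresholds = precision_recall_curve(y_te, proba)
threshold = thresholds[np.argmax(precision[:-1] > 0.87)]
predicted_win = proba >= threshold
```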

In retrospect, I might object to a few aspects of this project. First of all, I think there was a much better way to set the problem up given what I actually cared about. As mentioned in the introduction, what I wanted was a roster, and I abstracted away the thing most useful for roster construction: the players. Restoring them would complicate the classes considerably, but I think that rank-ordering players and carefully defining dart accuracy and volatility would allow for a reasonably accurate roster prediction model. Unlike wins, rosters are directly observable, although the reasoning behind them is opaque beyond the general heuristic of sticking with whatever seems to be working. A captain will rationalize something that happened to work as a good choice even if a good outcome for that choice was unlikely. I would structure a roster classification problem in two steps. First, predict whether or not the player of the first set will change from the previous match. Second, predict the players of all of the solo sets and the first doubles set, given the prediction on the first set.
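For what it’s worth, a very rough sketch of that two-step structure might look like the following; every column name and target here is hypothetical, since I never built this version.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

matches = pd.read_csv("matches.csv")  # placeholder: one row per team per match
feats = ["team_3da_ma5", "team_leg_win_pct_ma5"]  # assumed team-level features

# Step 1: will the first-set player change from the previous match?
step1 = LogisticRegression(max_iter=1000)
step1.fit(matches[feats], matches["first_set_changed"])

# Step 2: predict the player of each solo set and the first doubles set,
# conditioning on the step-1 prediction. Shown for one set; the others
# would get the same treatment.
X2 = matches[feats].assign(pred_change=step1.predict(matches[feats]))
step2 = LogisticRegression(max_iter=1000)  # multinomial over the roster
step2.fit(X2, matches["set_2_player"])
```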

One of the problems with the model I constructed is that the jump from predicting a win given historical data to producing a roster is not as direct as I originally thought. The other concern I have is that there was some form of data leakage in my features. Although I was careful at each step, I think the problem may have arisen when I restructured the data into a shape I wasn’t used to seeing, given my familiarity with match tables as viewed on the website.
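I can’t pin down exactly where the leakage was, but one common culprit with rolling features is a window that includes the current game, so the model effectively sees the performance it is supposed to predict. Continuing the per-game dataframe sketched above:

```python
# Leaky: the five-game window covers games t-4 through t, including
# the game whose outcome is being predicted.
leaky = df.groupby("player")["three_dart_avg"].transform(
    lambda s: s.rolling(5).mean())

# Safe: shift by one first, so the window covers games t-5 through t-1.
safe = df.groupby("player")["three_dart_avg"].transform(
    lambda s: s.shift().rolling(5).mean())
```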

I think this project reinforced the importance of asking an answerable question that is germane to whatever it is you are addressing. Although it may seem like a more boring question, simply asking whether or not a team will change the player of the first set is immediately relevant to roster construction, so that is the route I would take if I were tackling this problem now.
