R Programming and Fight Metrics

Implementing R

With the basics covered in the previous section, it was a matter of applying those skills to a different dataset. I found one containing fight metrics for all UFC (Ultimate Fighting Championship) events from UFC 1 in 1993 up to UFC Fight Night 83 in 2016, available at:


The fighter data in the dataset was held in the following vectors (one per column):

| Variable Name | Description |
|---|---|
| fid | Fighter ID |
| name | Fighter’s name |
| nick | Nickname |
| birth_date | Fighter’s date of birth |
| height | Fighter’s height |
| weight | Fighter’s weight |
| association | Training camp |
| class | Weight class |
| locality | City/County/State |
| Country | Country of origin |

Using this dataset, the following graph was created to show the number of fighters who have fought in the UFC, by weight class. The weight classes with the most fighters were Welterweight, Middleweight and Lightweight. Those with the fewest were Atomweight, Super Heavyweight and Strawweight.

[Figure: count of fighters by weightclass]
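A chart like the one described can be sketched in base R as follows. This is a minimal illustration, not the code used for the post: the `fighters` data frame is a made-up stand-in for the real file, and only the `class` column name comes from the variable list above.

```r
# Toy stand-in for the fighter dataset (hypothetical rows, not real data).
# The real data would be loaded from file, e.g. with read.csv().
fighters <- data.frame(
  fid   = 1:6,
  name  = c("A", "B", "C", "D", "E", "F"),
  class = c("Welterweight", "Welterweight", "Welterweight",
            "Middleweight", "Middleweight", "Atomweight"),
  stringsAsFactors = FALSE
)

# Count fighters per weight class, most common first.
class_counts <- sort(table(fighters$class), decreasing = TRUE)
print(class_counts)

# Horizontal bar chart with one bar per weight class.
barplot(class_counts, horiz = TRUE, las = 1,
        xlab = "count", main = "Fighters by weight class")
```

`table()` does the counting and `sort()` orders the bars, so the most and least populated classes are easy to read off.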

The other dataset used contained the following variables:

| Variable Name | Description |
|---|---|
| pageurl | Event URL |
| eid | Event ID |
| mid | Match ID |
| event_name | Name of event |
| event_org | Organisation name |
| event_date | Event date |
| event_place | Location of event |
| f1pageurl | Fighter 1 URL |
| f2pageurl | Fighter 2 URL |
| f1name | Fighter 1 name |
| f2name | Fighter 2 name |
| f1result | Fighter 1 result |
| f2result | Fighter 2 result |
| f1id | Fighter 1 ID |
| f2id | Fighter 2 ID |
| method | Method of victory |
| method_d | Method description |
| round | Round in which the fight finished |
| time | Time at which the fight finished |
| ref | Referee of the fight |

Using this data, the graph below was created to show the number of times a fight ended in each of the possible outcomes.


What we can see from this is that the most common outcome was a judges’ decision. TKO (technical knockout) finishes came next with about half as many fights, closely followed by submission victories. KO (knockout) finishes came to under a third of the number of decisions.
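The comparison above amounts to counting fights per outcome and expressing each count relative to the number of decisions. A minimal sketch, assuming a `method` column as in the variable list; the rows and the outcome labels here are invented for illustration and may not match the labels in the real data.

```r
# Toy stand-in for the fight dataset (hypothetical rows and labels).
fights <- data.frame(
  mid    = 1:9,
  method = c("Decision", "Decision", "Decision", "Decision",
             "TKO", "TKO", "Submission", "Submission", "KO"),
  stringsAsFactors = FALSE
)

# Number of fights per outcome, most common first.
method_counts <- sort(table(fights$method), decreasing = TRUE)
print(method_counts)

# Each outcome as a fraction of the number of decisions.
n_decisions <- as.vector(method_counts["Decision"])
ratio_to_decision <- method_counts / n_decisions
print(round(ratio_to_decision, 2))
```

Passing `method_counts` to `barplot()` would give a chart of the same kind as the one described.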

The following graphs show the percentage of fights finished by decision, KO and submission from 1993 to 2016. (Decisions were not an outcome until later UFC events.)


This shows that the share of fights going to a decision has increased since the first events, reaching almost 50% of fights from 2010 onwards.


This graph shows that the share of fights ending by KO has always been low, at under 25% since the beginning.


This graph shows that submissions accounted for about 75% of finishes in the beginning. This was because of fighters such as Royce Gracie, a Brazilian jiu-jitsu fighter who faced opponents looking to stand up and throw punches and kicks. He would win by taking the fight to the ground and neutralizing the other fighter. After that early success, as more and more fighters learned submissions and submission defences, the share of wins by submission fell below 25%.
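The yearly percentages behind graphs like these can be computed by cross-tabulating year against outcome and normalising each row. The sketch below is an assumption-laden illustration: it assumes `event_date` can be parsed as a date, and the rows are invented so that the toy output loosely mirrors the trend described.

```r
# Toy stand-in for the fight dataset (hypothetical rows, not real data).
fights <- data.frame(
  event_date = as.Date(c("1993-11-12", "1993-11-12", "1993-11-12", "1993-11-12",
                         "2015-07-11", "2015-07-11", "2015-12-12", "2015-12-12")),
  method = c("Submission", "Submission", "Submission", "KO",
             "Decision", "Decision", "KO", "Submission"),
  stringsAsFactors = FALSE
)

# Extract the year of each fight from its event date.
fights$year <- as.integer(format(fights$event_date, "%Y"))

# Rows are years, columns are outcomes; margin = 1 makes each row sum to 1.
counts <- table(fights$year, fights$method)
share  <- prop.table(counts, margin = 1) * 100
print(round(share, 1))
```

Each column of `share` is then one line on a percentage-by-year plot, for example `plot(as.integer(rownames(share)), share[, "Submission"], type = "l")`.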

With these two datasets loaded, it was then a matter of joining them on the fighter ID, counting each fighter’s fights and victories, calculating an adjusted win percentage (win_perc_adj) from those, and outputting the results.

| | name | class | fights | wins | win_perc_adj |
|---|---|---|---|---|---|
| 1 | Jon Jones | Light Heavyweight | 16 | 15 | 0.8142700 |
| 2 | Georges St. Pierre | Welterweight | 21 | 19 | 0.8116540 |
| 3 | Conor McGregor | Featherweight | 7 | 7 | 0.7636766 |
| 4 | Yoel Romero | Middleweight | 7 | 7 | 0.7636766 |
| 5 | Tony Ferguson | Lightweight | 11 | 10 | 0.7605096 |
| 6 | Anderson Silva | Middleweight | 19 | 16 | 0.7571829 |
| 7 | Don Frye | Heavyweight | 10 | 9 | 0.7457933 |
| 8 | Chris Weidman | Middleweight | 10 | 9 | 0.7457933 |
| 9 | Khabib Nurmagomedov | Lightweight | 6 | 6 | 0.7444223 |
| 10 | Royce Gracie | Middleweight | 13 | 11 | 0.7334771 |
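The join and the win calculation can be sketched in base R as below. The post does not say how win_perc_adj is computed, so the formula here is an assumption: a simple shrinkage toward 50%, (wins + a) / (fights + 2a). With a around 3.14 it reproduces the listed values closely (for example 15 wins from 16 fights gives roughly 0.814), but that constant is inferred from the table, not confirmed by the source. The toy rows and the "win"/"loss" result labels are also hypothetical.

```r
# Toy stand-ins for the two datasets (hypothetical rows and result labels).
fighters <- data.frame(
  fid   = c(1, 2, 3),
  name  = c("Fighter A", "Fighter B", "Fighter C"),
  class = c("Welterweight", "Welterweight", "Lightweight"),
  stringsAsFactors = FALSE
)
fights <- data.frame(
  f1id     = c(1, 1, 2, 1),
  f2id     = c(2, 3, 3, 2),
  f1result = c("win", "win", "loss", "win"),
  f2result = c("loss", "loss", "win", "loss"),
  stringsAsFactors = FALSE
)

# Reshape to one row per fighter per fight, so both corners are counted.
long <- rbind(
  data.frame(fid = fights$f1id, result = fights$f1result, stringsAsFactors = FALSE),
  data.frame(fid = fights$f2id, result = fights$f2result, stringsAsFactors = FALSE)
)
long$fights <- 1L
long$wins   <- as.integer(long$result == "win")

# Total fights and wins per fighter.
record <- aggregate(cbind(fights, wins) ~ fid, data = long, FUN = sum)

# Assumed adjustment: shrink the raw win rate toward 0.5.
# The default a = 3.137 is inferred from the published table, not from the source code.
adjusted_win_perc <- function(wins, fights, a = 3.137) {
  (wins + a) / (fights + 2 * a)
}

# Join on fighter ID, score, and rank from best to worst.
joined <- merge(fighters, record, by = "fid")
joined$win_perc_adj <- adjusted_win_perc(joined$wins, joined$fights)
ranked <- joined[order(-joined$win_perc_adj),
                 c("name", "class", "fights", "wins", "win_perc_adj")]
print(ranked)
```

An adjustment of this kind stops a fighter with a 1-0 record from outranking one with a 15-1 record, which a raw win percentage would allow.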

This gave the fighters with the best adjusted win percentage across all weight classes. Broken down by weight class, with the top three fighters in each, the result is:

| | Class | Name | Fights | Wins | Win_perc_adj |
|---|---|---|---|---|---|
| 1 | Light Heavyweight | Jon Jones | 16 | 15 | 0.81427 |
| 2 | Light Heavyweight | Daniel Cormier | 7 | 6 | 0.68834 |
| 3 | Light Heavyweight | Rashad Evans | 19 | 14 | 0.67805 |
| 4 | Welterweight | Georges St. Pierre | 21 | 19 | 0.811654 |
| 5 | Welterweight | Stephen Thompson | 8 | 7 | 0.710175 |
| 6 | Welterweight | Warlley Alves | 4 | 4 | 0.694669 |
| 7 | Featherweight | Conor McGregor | 7 | 7 | 0.763677 |
| 8 | Featherweight | Jose Aldo | 8 | 7 | 0.710175 |
| 9 | Featherweight | Max Holloway | 14 | 11 | 0.697299 |
| 10 | Middleweight | Yoel Romero | 7 | 7 | 0.763677 |
| 11 | Middleweight | Anderson Silva | 19 | 16 | 0.757183 |
| 12 | Middleweight | Chris Weidman | 10 | 9 | 0.745793 |
| 13 | Lightweight | Tony Ferguson | 11 | 10 | 0.76051 |
| 14 | Lightweight | Khabib Nurmagomedov | 6 | 6 | 0.744422 |
| 15 | Lightweight | Donald Cerrone | 20 | 16 | 0.728364 |
| 16 | Heavyweight | Don Frye | 10 | 9 | 0.745793 |
| 17 | Heavyweight | Cain Velasquez | 13 | 11 | 0.733477 |
| 18 | Heavyweight | Junior dos Santos | 14 | 11 | 0.697299 |
| 19 | Flyweight | Joseph Benavidez | 13 | 11 | 0.733477 |
| 20 | Flyweight | Demetrious Johnson | 13 | 11 | 0.733477 |
| 21 | Flyweight | Henry Cejudo | 4 | 4 | 0.694669 |
| 22 | Strawweight | Joanna Jedrzejczyk | 5 | 5 | 0.721752 |
| 23 | Strawweight | Tecia Torres | 3 | 3 | 0.661745 |
| 24 | Strawweight | Valerie Letourneau | 4 | 3 | 0.597335 |
| 25 | Bantamweight | Raphael Assuncao | 8 | 7 | 0.710175 |
| 26 | Bantamweight | Dominick Cruz | 4 | 4 | 0.694669 |
| 27 | Bantamweight | Aljamain Sterling | 4 | 4 | 0.694669 |
| 28 | Super Heavyweight | Jon Hess | 1 | 1 | 0.56874 |
| 29 | Super Heavyweight | Andre Roberts | 3 | 2 | 0.553915 |
| 30 | Super Heavyweight | Scott Ferrozzo | 5 | 3 | 0.544351 |
| 31 | Atomweight | Michelle Waterson | 1 | 1 | 0.56874 |
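A per-class breakdown like this one can be produced by ranking fighters within each weight class and keeping the top three. The sketch below uses base R's `ave()` for the within-group rank; the `ranked` data frame and its values are hypothetical stand-ins for the joined result, not the real figures.

```r
# Toy stand-in for the joined, scored fighter table (hypothetical values).
ranked <- data.frame(
  name  = c("A", "B", "C", "D", "E", "F"),
  class = c("Welterweight", "Welterweight", "Welterweight",
            "Welterweight", "Lightweight", "Lightweight"),
  win_perc_adj = c(0.81, 0.71, 0.69, 0.60, 0.76, 0.74),
  stringsAsFactors = FALSE
)

# Sort from best to worst overall.
ranked <- ranked[order(-ranked$win_perc_adj), ]

# Rank within each weight class; negating the score makes rank 1 the best.
ranked$rank_in_class <- ave(
  -ranked$win_perc_adj, ranked$class,
  FUN = function(x) rank(x, ties.method = "first")
)

# Keep at most three fighters per class.
top3 <- ranked[ranked$rank_in_class <= 3, ]
print(top3)
```

Classes with fewer than three fighters simply contribute fewer rows, which is why a class can appear with a single entry.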

Further development of these datasets could provide both statistical and predictive information, such as:

  • Statistics
    • Number of events each year
    • Average fight times
  • Prediction
    • Likely results of fights, taking into consideration the fighters’ win-loss records
    • How a fighter might win, be it by decision, KO or submission
    • Success rate of submission moves in ending fights
    • Spotting whether there is any correlation between fighters and judges
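The two statistical ideas in the list could be approached along these lines. This is a rough sketch under stated assumptions: `time` is assumed to be a "m:ss" string giving the time within the final round, and every round is assumed to last five minutes, which does not hold for all events in the period covered. The rows are invented.

```r
# Toy stand-in for the fight dataset (hypothetical rows, not real data).
fights <- data.frame(
  eid        = c(1, 1, 2, 3, 3, 3),
  event_date = as.Date(c("2014-01-01", "2014-01-01", "2014-06-01",
                         "2015-02-01", "2015-02-01", "2015-02-01")),
  round      = c(1, 3, 2, 3, 1, 2),
  time       = c("4:30", "5:00", "2:15", "5:00", "0:45", "3:00"),
  stringsAsFactors = FALSE
)

# Number of events each year: de-duplicate to one row per event first.
events <- unique(fights[, c("eid", "event_date")])
events_per_year <- table(format(events$event_date, "%Y"))
print(events_per_year)

# Average fight time in seconds.
# Assumption: "m:ss" within the final round, and 300-second rounds.
parts <- strsplit(fights$time, ":", fixed = TRUE)
secs_in_round <- sapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]))
fights$total_secs <- (fights$round - 1) * 300 + secs_in_round
mean_secs <- mean(fights$total_secs)
print(mean_secs)
```

The de-duplication step matters because the fight table has one row per match, so counting rows directly would count fights, not events.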

***Interesting Fact***

Since the mid-1960s, firefighters have added a wetting agent to make water wetter, resulting in less friction in the hose and causing the water to pass through it more quickly.
