Exploring Map and Agent Relationships - An Analytics Approach to VALORANT Matches



VALORANT is a free-to-play, online First-Person Shooter (FPS) developed and published by Riot Games. The primary game mode features two teams of five players battling against each other with firearms and abilities. The job of the Attackers is to plant an explosive device (the Spike) on the opposing team's site, or eliminate all the opposing players before time runs out. The job of the Defenders is to prevent the Attackers from planting the Spike, either by eliminating all opposing players or by outlasting the Attackers until time runs out. In situations where the Spike is planted, the Defenders must defuse it in order to win. The first team to win thirteen cumulative rounds is the winner, with extra rules for sudden death and overtime when applicable.

Beyond the FPS genre, VALORANT also fits neatly into the genre commonly known as the Hero Shooter. Players have a roster of Agents with different kits and abilities to choose from at the start of every match, so the variations in team compositions can be drastic from match to match. This can lead to interesting situations where certain Agents play better off each other and on certain Maps.

The purpose of this inquiry is to determine whether there is a relationship between the Maps and Agents. Some Agents are better suited to certain angles, areas, and situations, and knowing those trends could help a player use them more effectively throughout their gameplay career.

The Data


The dataset I chose came from a GitHub repository whose author had conducted similar research. What I received was a series of folders named after the Maps, and inside each Map folder were about twenty csv files that separated Agent data according to Tier.

My first goal was to get all of the data into a single csv file. That also required adding columns to the dataset to differentiate the nearly identical rows from csv to csv.
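import pandas as pd
import os
# Folder for a single Map; its name becomes the value of the Map column
path = "/00/00/Desktop/agents_data/Haven"
for file in os.listdir(path):
    df = pd.read_csv(path + "/" + file)
    # Map name from the folder name, Tier number from the file name
    df["Map"] = (path.rsplit("/", 1))[-1].title()
    df["Tier"] = int((file.split("=")[1]).split(".")[0])
    # Save a copy with the new columns into the working directory
    df.to_csv(file, index=False)
    print(df)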

#Note, the 0s in the path are there to obscure my system files.

This code pulls the Map name from the folder name and the Tier number from the file name, then stores each value in its own new column. The output was a csv file that combined all of the Competitive Tiers for each of the Map folders.

It worked alright, but only if I was willing to perform this operation on each of the Map folders individually. This method also meant that I still needed to combine all of the individual Map csv files into one. In other words, I did not fully automate this.

import pandas as pd
import os
# One base folder so that listing and reading use the same location
base = "/Users/lukefraser/Desktop/agents_data"
maps = ["ascent", "bind", "breeze", "haven", "split"]
df_list = []
for map_name in maps:
    files = os.listdir("{}/{}".format(base, map_name))
    for file in files:
        if "competitive" in file:
            df_temp = pd.read_csv("{}/{}/{}".format(base, map_name, file))
            # Drop the first column of each file
            df_temp = df_temp.drop(df_temp.columns[0], axis=1)
            df_list.append(df_temp)
# Stack every Map and Tier into one DataFrame and save it
df = pd.concat(df_list)
df.to_csv("df_combined.csv", index=False)
df.head(10)

With the help of a colleague, I was able to iterate through all of the folders and all of the files. The output of this code is a single csv file with all of the data combined. Just like in the previous code, it adds columns for Map and Tier.
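For anyone repeating this step, here is a minimal sketch of how the folder walk and the Map and Tier columns could be handled in a single pass. The function name `combine_agent_data` and the example file name `competitive_tier=3.csv` are assumptions for illustration. Only the `competitive` filter and the `=` and `.` separators come from the code above.

```python
import os
import pandas as pd


def combine_agent_data(base_dir, maps):
    """Read every Tier csv under base_dir/<map>/ and return one DataFrame.

    Assumes (hypothetically) file names such as 'competitive_tier=3.csv',
    where the Tier number sits between '=' and the file extension.
    """
    frames = []
    for map_name in maps:
        map_dir = os.path.join(base_dir, map_name)
        for file_name in sorted(os.listdir(map_dir)):
            if "competitive" not in file_name:
                continue  # skip anything that is not a Tier file
            df_temp = pd.read_csv(os.path.join(map_dir, file_name))
            # Map from the folder name, Tier from the file name
            df_temp["Map"] = map_name.title()
            df_temp["Tier"] = int(file_name.split("=")[1].split(".")[0])
            frames.append(df_temp)
    # ignore_index gives the combined frame one clean running index
    return pd.concat(frames, ignore_index=True)
```

Calling `combine_agent_data(base_dir, ["ascent", "bind", "breeze", "haven", "split"])` and then `to_csv` on the result would produce the combined file without a separate pass per Map.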

import pandas as pd
path = "/00/00/Desktop/agents_data/df_agents_combined.csv"
role_list = []
df = pd.read_csv(path)

# Look up each Agent's Role by name
for agent in df["Agent Name"]:
    if agent in ("Jett", "Reyna", "Yoru", "Raze", "Phoenix"):
        role = "Duelist"
    elif agent in ("Sage", "Cypher", "Killjoy"):
        role = "Sentinel"
    elif agent in ("Astra", "Viper", "Brimstone", "Omen"):
        role = "Controller"
    elif agent in ("Skye", "Breach", "Sova"):
        role = "Initiator"
    else:
        # Unrecognized name: avoid silently reusing the previous row's Role
        role = None
    role_list.append(role)

df['Role'] = role_list
df.to_csv("df_combined_v2.csv", index=False)
df.head(13)

After some initial time with the data, it became clear that I needed one extra column so the plots would make sense. Agents in VALORANT have pre-defined Roles that their abilities are best suited for. To account for these differences in the data, I wrote the code above. It iterates through the Agent Name column, assigns each Agent's Role based on name, and stores those values in a new column named Role.
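The same lookup can also be written without an explicit loop. The sketch below is one possible alternative, not the code I ran: it keeps the groupings from the loop above in a dictionary and lets pandas' `Series.map` do the matching. The names `AGENT_ROLES` and `add_role_column`, and the toy frame with an "Unknown" entry, are made up for illustration.

```python
import pandas as pd

# Role lookup built from the groupings used in the loop above
AGENT_ROLES = {
    "Jett": "Duelist", "Reyna": "Duelist", "Yoru": "Duelist",
    "Raze": "Duelist", "Phoenix": "Duelist",
    "Sage": "Sentinel", "Cypher": "Sentinel", "Killjoy": "Sentinel",
    "Astra": "Controller", "Viper": "Controller",
    "Brimstone": "Controller", "Omen": "Controller",
    "Skye": "Initiator", "Breach": "Initiator", "Sova": "Initiator",
}


def add_role_column(df):
    """Return a copy of df with a Role column looked up from Agent Name."""
    out = df.copy()
    # Names missing from the dictionary become NaN instead of a wrong Role
    out["Role"] = out["Agent Name"].map(AGENT_ROLES)
    return out


# Toy frame standing in for the combined csv (hypothetical rows)
sample = pd.DataFrame({"Agent Name": ["Jett", "Sage", "Omen", "Sova", "Unknown"]})
with_roles = add_role_column(sample)
```

One advantage of this form is that a newly released Agent shows up as a missing value in the Role column, which is easy to spot with `with_roles["Role"].isna()`.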

import pandas as pd
path = "/00/00/Desktop/weapons_data/df_val_wpns.csv"
df = pd.read_csv(path)

def unpercentage(column):
    # Strip the trailing % from every value and convert it to a float
    df_list = []
    for i in df[column]:
        i = float(i.rstrip("%"))
        df_list.append(i)
    df[column] = df_list

unpercentage("Headshot")
unpercentage("Bodyshot")
unpercentage("Legshot")

The code above is a function I wrote in order to clean the Weapons dataset. Headshots, Bodyshots, and Legshots are represented by percentages, but those percentage signs made Seaborn read the data as if each value were a string. This function strips the % from each value in those columns and turns all the values into floats.
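A shorter way to do the same cleaning is to use pandas' vectorized string methods. This is a sketch under the assumption that every value in the column is a string ending in %. The helper name `strip_percent` and the toy numbers are invented for the example and are not taken from the Weapons dataset.

```python
import pandas as pd


def strip_percent(df, columns):
    """Return a copy of df where the given percentage columns are floats."""
    out = df.copy()
    for column in columns:
        # "25.5%" -> "25.5" -> 25.5, applied to the whole column at once
        out[column] = out[column].str.rstrip("%").astype(float)
    return out


# Toy values, only to show the shape of the input
weapons = pd.DataFrame({
    "Headshot": ["25.5%", "30%"],
    "Bodyshot": ["60%", "55.5%"],
    "Legshot": ["14.5%", "14.5%"],
})
cleaned = strip_percent(weapons, ["Headshot", "Bodyshot", "Legshot"])
```

Because the function returns a copy, the original strings stay available if the conversion needs to be checked afterwards.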

Ultimately, I did not pursue any research questions pertaining to this dataset because it did not return any significant findings during initial exploration.

Data Visualization


With the data cleaned, it was time to make sense of it through visualizations.

import pandas as pd
import seaborn as sns
import os
import matplotlib.pyplot as plt 
agent_stats = pd.read_csv("/Users/lukefraser/desktop/agents_data/df_combined_v2.csv")
agent_stats.head()

The image above offers a glimpse into what the data looked like once the columns for Map, Tier, and Role were added.

sns.pairplot(agent_stats)

When confronted with the dataset for the first time, I had to ask myself how I might go about choosing the right columns for comparison. Kills or Kill-to-Death-Ratio (KD) seemed like natural choices, but those columns do not take Support Agents into consideration very well. Other factors such as Abilities and Kill Assists influence the outcome of a round, but neither Kills nor KD account for those.

The one column that does account for Kills, Deaths, Ability usage, and Assists is Average-Combat-Score (ACS). Without getting too deep into the technicalities of this statistic, it provides insight into a player's overall performance. This is why I use ACS to compare Agents further down.

Another column that stood out was Pick Rate. Players have perceptions about which Agents perform well and where, and I hoped the data might corroborate some of my own hypotheses.

Pick Rates

sns.set_style("darkgrid")
sns.set_context("paper")
plt.figure(figsize=(26, 6))     
sns.barplot(x="Agent Name", y="Pick Rate", data=agent_stats, hue="Map")

Visualizing categorical information (Agent Names) as facets of numerical data (Pick Rates) got messy with scatter plots. Using colors to represent either Agents or Maps in a scatter plot produced a lot of noise that was impossible to read. I also had to choose one categorical column or the other, because I couldn't include both in a way that made sense. This is how I arrived at a bar plot for comparing Pick Rates of Agents across Maps. A bar plot also helped visualize the difference in Pick Rates from one Agent to the next, especially where certain Agents are picked far less than others.

Nevertheless, this bar plot does not do justice to the nuances inherent within Agent Roles. Duelists are picked more often than Sentinels, for instance, because they appeal to fast-paced and flashy gameplay. It was at this point that I realized I needed the Role column to compare Agents within each grouping.
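The grouping can be checked numerically before any plotting. Seaborn's `barplot` draws the mean of the y variable for each group by default, so a table of mean Pick Rates per Role, Agent, and Map holds the bar heights of the Role-specific plots. The sketch below assumes the column names used throughout this write-up. The function name `pick_rate_by_role` and the toy numbers are illustrative only.

```python
import pandas as pd


def pick_rate_by_role(agent_stats):
    """Mean Pick Rate per Agent and Map, with Agents grouped under their Role."""
    return (
        agent_stats
        .groupby(["Role", "Agent Name", "Map"])["Pick Rate"]
        .mean()
        .unstack("Map")  # one column per Map, one row per (Role, Agent)
    )


# Toy rows (made-up numbers): two Tiers for each Agent on one Map
toy = pd.DataFrame({
    "Role": ["Duelist"] * 4,
    "Agent Name": ["Jett", "Jett", "Yoru", "Yoru"],
    "Map": ["Haven"] * 4,
    "Pick Rate": [10.0, 20.0, 1.0, 3.0],
})
table = pick_rate_by_role(toy)
```

Selecting a single Role with `table.loc["Duelist"]` gives the numbers behind one Role-specific bar plot.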

plt.figure(figsize=(16, 8)) 
duelists = agent_stats.loc[agent_stats['Role'] == 'Duelist']
sns.barplot(x="Agent Name", y="Pick Rate", data=duelists, hue="Map")

Both Jett and Reyna have outsized Pick Rates compared to other Agents such as Phoenix and Yoru. To put it simply, Jett and Reyna both have abilities that allow them to take bad engagements at least once per round without getting punished for sloppiness. Jett can dash out of harm's way at will, while Reyna can either heal or become immune to damage following an elimination.

Based on this plot, players clearly think that Jett and Reyna are versatile on most Maps (especially the latter). Breeze, Ascent, and Haven all appeal to Jett's relationship to the Operator—the game's one-hit-kill sniper rifle—on either side of the Attack/Defense divide. This is not so true on Bind or Split, where tight angles, maze-like entry points, and verticality can impede a Jett wielding an Operator (especially on Attack).

Neither Phoenix nor Yoru has Abilities comparable to those of Jett and Reyna, and, at least at the time that this data was gathered, their Abilities were easily countered.

One of the most interesting aspects of this plot is Raze's low Pick Rate, for reasons that will be discussed in the ACS section. Clearly, players think she plays well on Split and Bind, both Maps where players can abuse Raze's explosive arsenal. Breeze, the largest and most open Map in this roster, is the least likely Map to play to her kit, which is probably why she has her lowest Pick Rate there.

plt.figure(figsize=(16, 8)) 
controllers = agent_stats.loc[agent_stats['Role'] == 'Controller']
sns.barplot(x="Agent Name", y="Pick Rate", data=controllers, hue="Map")

There are four striking aspects of this plot:
  1. Astra is heavily underplayed. This data is likely from when she had just launched and before she entered the meta.
  2. Viper is heavily overplayed on Breeze. This is likely because her Toxic Screen ability is so useful for dividing up the large plant sites into manageable chunks.
  3. Omen is heavily used across the board. This data was likely pulled before he received a slew of nerfs in 2021. Unsurprisingly, people don't pick Omen as much on Breeze, likely because his kit is not as strong on that Map as Viper's.
  4. Brim's Pick Rates are generally low, but players still choose Brim over Viper on Bind. This is likely because his kit is perceived as just as useful at covering entryways on both Attack and Defense, if not more so.

plt.figure(figsize=(16, 8)) 
initiators = agent_stats.loc[agent_stats['Role'] == 'Initiator']
sns.barplot(x="Agent Name", y="Pick Rate", data=initiators, hue="Map")

Sova is one of the most versatile Agents, and his Pick Rates corroborate that. He is great for gathering information in almost any setting. The one Map where he isn't as useful is Split, where narrow corridors and maze-like sites allow opponents to evade his abilities on Attack or Defense.

Skye is underpicked in this dataset except on Breeze. Her Trailblazer and Guiding Light abilities are useful for gathering location information about opponents on Attack and Defense. On a Map this large, that information-gathering capacity is even more important.

Breach is also underpicked, and it might be because the general community or the professional community had not learned how to use him at this time. Split, the Map with his highest Pick Rate, is one that definitely appeals to his kit. Breach flourishes on a Map such as Split, where narrow corridors and maze-like sites force players into close quarters.

plt.figure(figsize=(16, 8)) 
sentinels = agent_stats.loc[agent_stats['Role'] == 'Sentinel']
sns.barplot(x="Agent Name", y="Pick Rate", data=sentinels, hue="Map")

Sage's influence on the Sentinel Pick Rates is too apparent to ignore. Because she is equipped with healing, a slow ability, a wall, and resurrection, players seem to perceive her as the ultimate support Agent. Both Cypher and Killjoy are much more technical to play, and their kits are much more costly in comparison to Sage's.

Killjoy's top three Maps make sense because they are all two-site Maps with many narrow corridors. On any of these Maps, she alone can hold a site with her utility long enough for her teammates to rotate in.

Cypher is underpicked, especially compared to Sage. His kit is less situational, and he is not as useful as Killjoy at holding a site on Attack once the Spike is planted. Nevertheless, on Maps such as Bind and Split, his kit can catch opponents trying to lurk through his team's spawn point, whether on Attack or Defense. This might be why those are his two most picked Maps.

Average Combat Score

plt.figure(figsize=(26, 8)) 
sns.boxplot(x="Agent Name", y="ACS", data=agent_stats, hue="Map")

As with the plot above for Pick Rate, the story of this data is not as evident when comparing all the Agents side by side. Role-specific plots are better for making these comparisons.

In contrast to Pick Rates, where there were large discrepancies in values from Agent to Agent, the values for ACS are not as volatile. This is why I used box plots to compare Average Combat Scores. Box plots also give clearer insight into the spread of the scores (the interquartile range and whiskers) as well as outliers.
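For readers less familiar with box plots, the sketch below computes by hand the numbers that each box encodes: the quartiles, the whiskers, and the outliers. It follows the default rule in matplotlib and seaborn, where whiskers reach the furthest data point within 1.5 times the interquartile range (IQR) of the box. The function name `box_stats` and the toy scores are assumptions for illustration and do not come from the dataset.

```python
import pandas as pd


def box_stats(values, whis=1.5):
    """Return the quartiles, whisker ends, and outliers of a box plot."""
    s = pd.Series(values, dtype=float)
    q1, median, q3 = s.quantile([0.25, 0.5, 0.75])
    iqr = q3 - q1
    # Points beyond these fences are drawn as individual outlier markers
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = s[(s >= low_fence) & (s <= high_fence)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside.min(),
        "whisker_high": inside.max(),
        "outliers": sorted(s[(s < low_fence) | (s > high_fence)]),
    }


# Toy ACS values (made up): five typical scores and one very high one
stats = box_stats([180, 190, 200, 210, 220, 400])
```

In this toy case the box spans 192.5 to 217.5 with a median of 205, the whiskers stop at 180 and 220, and the score of 400 is flagged as an outlier. Note that none of these numbers is a standard deviation. The box describes quartiles.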

plt.figure(figsize=(16, 8)) 
sns.boxplot(x="Agent Name", y="ACS", data=duelists, hue="Map")

Despite low Pick Rates for Raze and Phoenix, they are viable Agents on many Maps. Judging by the median score, Raze performs just as well as, if not better than, Jett on most Maps. The only Map where Raze doesn't do so well is Breeze, where her utility is probably not suited to such wide open spaces.

Phoenix does lag behind Jett, Reyna, and Raze, but his performance isn't the worst. He is one of the first Agents that players have unlocked when they first boot up the game, and his ACS across most Maps might reflect how flexible his kit is in most situations. Armed with flashes and healing abilities, he is a viable Agent on many Maps, but his kit has a limitation, likely because his utility is easily countered.

Yoru is historically the most underplayed character, and his ACS across all Maps suggests that his kit does not do well in many scenarios.

Jett is consistently viable on every Map in this list. The likely reason she doesn't perform as well as Reyna is that she cannot heal herself.

Unsurprisingly, Reyna dominates this chart. On many Maps her median ACS is 250 or higher. Some of the high-value outliers are also interesting. They suggest that Reyna is incredibly lethal on almost any Map in the hands of the right player.

plt.figure(figsize=(16, 8)) 
sns.boxplot(x="Agent Name", y="ACS", data=initiators, hue="Map")

Despite Sova's dominance in Pick Rates, he does not outstrip his counterparts by a wide margin. If anything, his numbers are more consistent and aren't as volatile as Skye's or Breach's. It is surprising to see his performance on Split so high, given his low Pick Rate on that Map.

Skye's numbers are consistently high, even though her median scores never exceed 200. The volatility of her scores on Ascent and Haven suggests that these are high-skill Maps for this Agent. Played poorly, a Skye could get punished on Ascent or Haven.

Breach's performance is also quite volatile, with a spread that dips quite low beneath the median. The one Map that stands out is Bind, where narrow corridors and tight laneways are great places for his kit. Nevertheless, his kit relies on timing. If executed at the wrong time, his abilities do not help his teammates. This plot suggests that Breach is a difficult character to master, regardless of the Map.

plt.figure(figsize=(16, 8)) 
sns.boxplot(x="Agent Name", y="ACS", data=controllers, hue="Map")

Performances from Controllers are also quite even across the board, despite the variance in Pick Rates. Most surprisingly, Viper does not outperform her counterparts by much on Breeze, despite how heavily she was favored in Pick Rates.

The volatility in scores is the most interesting part of this plot. Omen is quite consistent on most Maps. Brim is less consistent, and Astra even less so. Viper has some of the highest scores, which suggests that she performs well in most situations.

plt.figure(figsize=(16, 8)) 
sns.boxplot(x="Agent Name", y="ACS", data=sentinels, hue="Map")

Among the Sentinels, Killjoy stands out as a top-tier Agent in spite of her low Pick Rate. Her median scores are mostly higher than Sage's, usually sitting at 200 ACS or more. This is likely due to how much chip damage Killjoy's turret and swarm grenades can deal round after round. The variance in data points on Breeze does suggest that Killjoy does not consistently perform well on big Maps. On a Map like Breeze, an Attacking team could easily evade her utility by using the wide open spaces.

The upper and lower thresholds for Cypher suggest a lot of volatility in how people play him. It could be that Cypher isn't suited to one particular Map but can be played well anywhere, so long as the player understands his kit.

Sage's numbers are also quite volatile. Her median scores are not high, but they aren't that low, either. Nevertheless, it appears that Sage is the better pick over Killjoy or Cypher on Breeze.

Discussion

There is a clear distinction between the Agents that players pick for specific Maps and how those Agents actually perform. ACS is not a perfect measure by any means, but it does provide some interesting insights. Assuming that Pick Rates reflect how players perceive the utility of certain Agents, it turns out that a lot of these Agents can perform well in many situations. For a few, there are some clear takeaways: Jett, Reyna, Sova, Killjoy, Omen, and Viper are clearly versatile on almost any Map, with some exceptions. Yoru is just plain bad, and the rest are fairly average across the five Maps.
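One way to put a number on that gap between perception and performance is to rank Agents twice, once by Pick Rate and once by median ACS, and compare the two ranks. This is a sketch of an approach I did not use for the plots above. The function name `perception_vs_performance`, the derived column names, and the toy numbers are all assumptions for illustration.

```python
import pandas as pd


def perception_vs_performance(agent_stats):
    """Rank Agents by mean Pick Rate and by median ACS, then compare ranks."""
    summary = agent_stats.groupby("Agent Name").agg(
        pick_rate=("Pick Rate", "mean"),
        median_acs=("ACS", "median"),
    )
    # Rank 1 is the most picked / the highest scoring Agent
    summary["pick_rank"] = summary["pick_rate"].rank(ascending=False)
    summary["acs_rank"] = summary["median_acs"].rank(ascending=False)
    # Positive gap: the Agent ranks better on ACS than on Pick Rate
    summary["rank_gap"] = summary["pick_rank"] - summary["acs_rank"]
    return summary.sort_values("rank_gap", ascending=False)


# Toy rows with made-up numbers, two per Agent
toy = pd.DataFrame({
    "Agent Name": ["Jett", "Jett", "Raze", "Raze", "Yoru", "Yoru"],
    "Pick Rate": [30.0, 28.0, 5.0, 7.0, 10.0, 12.0],
    "ACS": [220.0, 224.0, 240.0, 236.0, 180.0, 176.0],
})
result = perception_vs_performance(toy)
```

In the toy data, the rarely picked but high-scoring Agent ends up at the top of the table with the largest positive gap, which is the pattern described for Raze and Killjoy above.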

This dataset reflects the casual meta. It is almost completely derived from the performances of the casual player base and spans the competitive ranks from the bottom to the top. Performing this analysis on data that is strictly derived from professionals or the two topmost ranks might create plots that are much clearer cut.
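Restricting the existing data to the topmost ranks would only take a small filter. The sketch below assumes that a larger number in the Tier column means a higher rank, which I have not verified against the source repository. The function name `top_tiers` and the toy Tier numbers are hypothetical.

```python
import pandas as pd


def top_tiers(agent_stats, n=2):
    """Keep only the rows that belong to the n highest Tier values."""
    # Assumption: larger Tier numbers correspond to higher ranks
    highest = sorted(agent_stats["Tier"].unique())[-n:]
    return agent_stats[agent_stats["Tier"].isin(highest)]


# Toy frame with made-up Tier numbers and scores
toy = pd.DataFrame({
    "Agent Name": ["Jett", "Jett", "Jett", "Jett"],
    "Tier": [1, 2, 3, 4],
    "ACS": [200.0, 210.0, 220.0, 230.0],
})
elite = top_tiers(toy)
```

The filtered frame could then be passed to the same `sns.boxplot` calls used earlier in place of `agent_stats`.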

This dataset is also limited by the time at which it was pulled. Without knowing the actual date, I would guess that the creator of this dataset compiled it sometime around Spring 2021. Since then, Riot has introduced more Maps and Agents, Agents have been buffed and nerfed, and the meta has fluctuated. Using more current data could also provide some clearer insights.

Additionally, this dataset does not account for the learning curves of each Agent. Some are much more mechanically demanding than others. I attempted to investigate that learning curve in the following, brief section.

Competitive Tiers

sns.set_context("talk")
tiers = agent_stats.pivot_table(index="Agent Name", columns="Tier", values="ACS")
sns.heatmap(tiers, cmap="Blues", linecolor="white", linewidth=1, yticklabels=1)

Clearly, the top four Duelists have the highest ACS on average at any tier. It is not surprising to see Reyna perform so well at any level. It is surprising to see Raze perform just as well, especially at the top competitive tier, even if her Pick Rates are lower. Raze is a highly technical Agent with incredible movement abilities, but she cannot escape situations the way Reyna can. That she is played so well at the top tier compared to the rest below might suggest that her kit is easy to pick up but difficult to master.

The same could be said of Viper and Cypher, whose scores at the top tier are the highest among all tiers.

Despite their near ubiquitous Pick Rates, Sova's and Jett's numbers taper off at the top tier. This might be because of saturation, or it might be because the players in these lobbies know how to counter those Agents.
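Claims such as "highest at the top tier" can be read straight off the pivoted table instead of the colors. Since `pivot_table` averages by default, each cell of the heatmap is the mean ACS of one Agent at one Tier across all Maps. The sketch below, with a hypothetical helper `peak_tier` and made-up numbers, returns that table together with the Tier at which each Agent's mean peaks.

```python
import pandas as pd


def peak_tier(agent_stats, value="ACS"):
    """Return the Agent-by-Tier table of means and each Agent's best Tier."""
    # Default aggfunc is the mean, so Maps are averaged within each cell
    tiers = agent_stats.pivot_table(index="Agent Name", columns="Tier", values=value)
    return tiers, tiers.idxmax(axis=1)


# Toy rows with made-up numbers: two Maps per Agent and Tier for Viper
toy = pd.DataFrame({
    "Agent Name": ["Viper", "Viper", "Viper", "Viper", "Sova", "Sova"],
    "Tier": [1, 1, 2, 2, 1, 2],
    "ACS": [180.0, 200.0, 230.0, 210.0, 210.0, 200.0],
})
table, peaks = peak_tier(toy)
```

Passing `value="KD"` would give the same summary for the Kill-to-Death heatmap.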

sns.set_context("talk")
tiers = agent_stats.pivot_table(index="Agent Name", columns="Tier", values="KD")
sns.heatmap(tiers, cmap="Blues", linecolor="white", linewidth=1, yticklabels=1)

Much like in the ACS plot, Jett and Reyna have consistently high KDs across all tiers. Surprisingly, they aren't the top performers at the highest tier, possibly because they don't survive long enough there to post consistently high KDs. Duelists like Jett and Reyna are expected to enter sites first and create space for the rest of the team to follow. At the top tier, players are more likely to play Duelists according to their role. In best-case scenarios, the Duelists get multiple kills while securing the site. In other cases, they go down and get traded by their support Agents.

Cypher, Killjoy, Raze, and Viper are the Agents with the highest KD ratios at the top tier, despite their low Pick Rates. This fits into the theory that Raze is an Agent that is easy to pick up but difficult to master. In the hands of a professional or a near professional, her rocket launcher ultimate has the potential to secure multiple eliminations with a single click.

For Cypher, Killjoy, and Viper, I have a different theory. These Agents are only valuable so long as they are alive. As support Agents, they are expected to hang back and stay safe until the very end of the round. Their high KD scores could reflect a reality in which they are expected to clutch the end of a round against unfavorable odds (three-versus-one situations, for instance). They are also the Agents that are best suited for those situations, thanks to their utility.

Final Remarks


There is certainly more work that could be done. Possible next steps include visualizing the current data in different ways, pulling new data, or examining only the subset of the data that covers professionals or near professionals. Conducting qualitative research such as a survey or a series of interviews could also provide a deeper insight into how players perceive the relationship between Agents and Maps.