Real-time player behavior analysis

This page is a work in progress.

This page details the background of a possible machine-learning-based real-time monitoring of player behavior in the Legacy mod. See also #1021 for implementation details.

References

Background

See the original discussion:

RE: Anticheating mechanisms - Added by lrq3000

Hey guys,

It’s great you implemented Laszlo’s anti-wallhack system; it’s really the best way to prevent wallhacks.

About aimbots, I worked on an open-source anti-cheat architecture, completely server-side (so that cheaters can’t evade it), based on behavior analysis automated by machine learning.

There are several research papers on automated behavior analysis for detecting cheaters, and it works well. My architecture goes beyond simple behavior analysis: it provides a full pipeline to produce the necessary behavior data from servers automatically; it eases the analysis (using any algorithm you want) and the sharing of behavior-analysis parameters (so server administrators can share parameters, or even behavior databases, to build big datasets and derive better parameters); and finally it allows ban/kick/demo recording/any rcon command as a final measure.

The whole project is here, made for OpenArena, but easily translatable to any ioq3-based game:

https://github.com/lrq3000/openarena_engine_oacs
https://github.com/lrq3000/oacs
That being said, I stopped the project for lack of time, but it fully works. The only drawback is that the algorithms I used are not efficient enough. I plan to add support for scikit-learn, leveraging the big library of machine-learning algorithms it offers. Better behavior features could also provide better results, such as a better reaction-time estimator and a crosshair estimator (is the crosshair aiming at a player’s head? for how long? etc.).

This approach was implemented by the mod ExcessivePlus (along with an enhanced anti-wallhack stemming from the same Laszlo patch). If anybody is interested in reviving this project, I can supervise the effort and provide the update to support scikit-learn.

To clarify the goal of such a system: with behavior analysis, you analyze what "normal" (non-cheating) player behavior looks like, and from that you can detect "anomalous" behavior (cheaters). This has the big advantage that you only need data from normal players, which is readily available. It will also be robust enough to detect new cheats, because we model the non-cheating behavior rather than the cheats themselves.

In the end, cheats can still work, but only if they do not display "anomalous" behavior, such as excessive precision or inhuman movements or reaction times. Such cheats are then no longer a threat, since they cannot grant a "supernatural" advantage: they are simply not allowed to exceed the "natural ability" threshold.
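The anomaly-detection idea above can be sketched in a few lines. This is an illustrative toy, not the OACS code: the two features and the thresholding rule are invented for the example. A multivariate Gaussian is fit to samples from known-clean players, and any sample whose density falls below the lowest density seen during training is flagged.

```python
import numpy as np

# Toy sketch of the anomaly-detection idea (not the OACS code): fit a
# multivariate Gaussian to feature vectors from known-clean players and
# flag samples whose density is lower than anything seen in training.
def fit_gaussian(X):
    """Estimate the mean vector and covariance matrix of normal play."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def log_density(x, mu, sigma):
    """Log of the multivariate Gaussian density at point x."""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    maha = diff @ np.linalg.inv(sigma) @ diff
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

rng = np.random.default_rng(0)
# Hypothetical features: [reaction time in ms, aim error in degrees]
normal = rng.normal([250.0, 5.0], [40.0, 1.5], size=(1000, 2))
mu, sigma = fit_gaussian(normal)

# Threshold: the lowest density observed among normal training samples
eps = min(log_density(x, mu, sigma) for x in normal)

suspect = np.array([80.0, 0.1])  # inhumanly fast and precise
print(log_density(suspect, mu, sigma) < eps)  # flagged as anomalous
```

Any density model or outlier detector could take the Gaussian's place here; the key property is that training needs only clean data.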

Of course, that’s the theory; in practice there will be some false positives. But the goal of the system is NOT to auto-ban or auto-kick (even if that is possible), but rather to auto-detect suspicious behavior and launch an automatic demo recording for later review by the server administrator. This is also the most efficient approach, as delaying the punishment reduces the clues cheaters get: if they are kicked instantly, they can iteratively refine their cheats, whereas if the punishment is delayed, they cannot know which version of their cheat triggered it.

Finally, here are English slides about the whole project and how it works: https://github.com/lrq3000/oacs/blob/master/doc/PIAD34-OACS-1-en.pdf

You can ask me here if you want more details on any point. I’m convinced this is the best approach for an open-source anti-cheating system, as it leverages the collaborative nature of open source, because:

1. You can share datasets and parameters; everything is anonymized (player IDs are not necessary for the behavior analysis).
2. Having the database/parameters does not provide the program (since it is learned by a machine), and even if it were learned, it cannot be bypassed (since everything is analyzed server-side). This is why this approach is also currently being investigated by AAA studios, such as the makers of Battlefield.


Wow! Now that is interesting!

It’s also been a while since I became convinced that the best approach to counter or reduce cheating is based on statistical learning, and creating such a system has been on my mind for quite some time. I am also quite familiar with ML techniques and scikit-learn/TensorFlow, so your work is definitely very relevant to this project, thanks! I’ll have a closer look ASAP; expect to hear from us very soon!


Great, I’m very eager to continue this project; working alone, it has been very difficult to complete.

You might also be interested in looking into a similar project, whose author contacted me a few months ago to merge with my project: https://github.com/The-Skas/Anti-Cheat

His approach was oriented towards Counter-Strike: GO, but it is very similar, except that it uses a machine-learning algorithm based on hidden Markov models.
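For readers unfamiliar with the technique, here is a minimal, self-contained sketch of the HMM idea; all states, symbols, and probabilities below are invented for illustration and are not The-Skas’ model. A sequence of discretized aim adjustments is scored with the forward algorithm, and sequences with very low likelihood under a model of human play become suspects.

```python
import numpy as np

# Minimal HMM sketch (all parameters invented): score a discretized
# sequence of aim adjustments; low likelihood under a human-play model
# marks the sequence as suspicious.
def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of an observation sequence under an HMM
    (forward algorithm with per-step rescaling to avoid underflow)."""
    alpha = start * emit[:, obs[0]]
    log_p = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
        log_p += np.log(alpha.sum())
        alpha /= alpha.sum()
    return log_p

# Two hidden states (e.g. "tracking", "flicking") and three observation
# symbols (small / medium / large crosshair movement), all made up here.
start = np.array([0.7, 0.3])
trans = np.array([[0.9, 0.1],
                  [0.4, 0.6]])
emit  = np.array([[0.7, 0.25, 0.05],   # tracking: mostly small moves
                  [0.1, 0.3,  0.6 ]])  # flicking: mostly large moves

human_like = [0, 0, 1, 0, 0, 2, 0]
bot_like   = [2, 2, 2, 2, 2, 2, 2]    # constant maximal flicks
print(forward_log_likelihood(human_like, start, trans, emit))
print(forward_log_likelihood(bot_like, start, trans, emit))
```

In practice the transition and emission matrices would be learned from recorded human play rather than written by hand.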


Tracked in #1021.

I’ve had a look at the provided documentation and a quick overview of the interface code, and I can say that I like what I see.
In particular, I love the flexibility provided, and that the ML part is done entirely in Python and can easily be visualized with a Jupyter notebook. I honestly don’t have many questions, as I’m quite familiar with ML concepts already.
Using a multivariate Gaussian/clustered augmented Gaussian algorithm is a good approach, but I guess there are many other possibilities that could be tested, including "deep learning" techniques.

Your slides seem to show good detection results, so maybe you could expand on what is actually not working well. Or, to reformulate: "Why don’t I see bad results in your slides?"

Also, reading https://github.com/ioquake/ioq3/pull/265, I guess this could also be used to analyze server demos. If I’m not wrong, our implementation is based on your own code, or is at least very similar.
I am also not sure whether the work done by The-Skas could be reused somehow, as CS’s recoil-pattern system is quite specific (each weapon always has the same recoil pattern), but I haven’t had a closer look at the provided code.
Adding or improving support for scikit-learn would definitely be useful here.


Hey Spyhawk,

Sorry about the late reply; I intended to answer much earlier, but I had a very big project to finish for my IRL job.

Thank you very much for your interest! I’m also excited to work on this project again.

So I can start working on the project again from mid-June onwards (until probably the end of the year, which should be way more than enough if I have collaborators to work with).

What I need help with:

- Porting the data recorder from ioquake3+OA to ET: Legacy (should not be too hard; the data recorder is implemented similarly to, but is even more standalone than, my server-side demo patch).
- Adding new behavioral features (machine-learning features, that is). In particular, I think the anti-wallhack routines could be reused to generate new features, such as whether player A is aiming at player B; even better would be the exact location aimed at (mid-body, head, etc.), which was not possible in ioquake3.
- Another feature I think would be powerful and hard to evade: getting the target player’s ID (when aiming or shooting) so that we can join in the target player’s info, such as speed and distance.
- Setting up a server to test data collection and anti-cheat detection (I have a set of custom ioquake3 aimbots to test, but I’m not sure they will work on ET; in any case, we need data from real players too!).
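As a rough illustration of the "is player A aiming at player B" feature: the angle between the shooter’s view direction and the shooter-to-target line can be computed and thresholded. The function names and the 3-degree threshold are hypothetical; a real implementation would work on the server’s own entity data.

```python
import numpy as np

# Hypothetical sketch of an "aiming at target" feature: compare the
# shooter's view direction with the direction to the target and
# threshold the angle between them.
def aim_angle_deg(shooter_pos, view_dir, target_pos):
    """Angle in degrees between the view direction and the
    shooter-to-target line."""
    to_target = np.asarray(target_pos, float) - np.asarray(shooter_pos, float)
    to_target /= np.linalg.norm(to_target)
    view = np.asarray(view_dir, float)
    view /= np.linalg.norm(view)
    cos = np.clip(np.dot(view, to_target), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def is_aiming_at(shooter_pos, view_dir, target_pos, threshold_deg=3.0):
    """True if the crosshair is within threshold_deg of the target."""
    return aim_angle_deg(shooter_pos, view_dir, target_pos) <= threshold_deg

# Shooter at the origin looking down +x; target slightly off-axis.
print(is_aiming_at((0, 0, 0), (1, 0, 0), (100, 2, 0)))   # ~1.1 deg -> True
print(is_aiming_at((0, 0, 0), (1, 0, 0), (100, 20, 0)))  # ~11.3 deg -> False
```

Per-frame values of this angle (or the fraction of time it stays under the threshold) could then feed the ML features described above.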
Then, on my side (but help is welcome here too!):

- Variables combiner: create new variables as linear combinations of other variables, e.g. combining the target player’s lateral speed with the time spent aiming at the body, which I expect to be anticorrelated, with big outliers for aimbots.
- Frames aggregator: currently each sample for the ML algorithm is one timeframe; we could aggregate and average over several timeframes to get smoother, more robust predictors (this is the most common approach in scientific studies of behavior-based anti-cheat systems).
- Bridge to scikit-learn: should be easy using sklearn-pandas (https://github.com/paulgb/sklearn-pandas), which was not available when I created this project.
- Automated cross-validation: there is already a ROC curve and a confusion matrix generator, and cross-validation is partially implemented in the notebook; I would just like to automate it (and make it compatible with scikit-learn).
- Discrete variable expander: a module to expand a discrete variable into as many variables as it has values; this should greatly enhance performance for classifiers and non-linear models.
- Unit testing (not 100% coverage, but at least enough to ensure coherency across versions).
- (Self-reminder: merge the author-detector core into OACS to include structural coherency constraints à la colored Petri nets.)

That’s all I have planned for now. With only these few features added, I expect the accuracy to increase substantially. Accuracy is the system’s only weak point right now, so we would end up with a robust working core that server administrators can quickly use.
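Two of the items above, the frames aggregator and the discrete variable expander, can be sketched with pandas; the column names and values below are invented for the example.

```python
import pandas as pd

# Invented per-timeframe samples: a numeric feature and a discrete one.
frames = pd.DataFrame({
    "reaction_time": [260, 240, 250, 90, 80, 85],
    "weapon":        ["mp40", "mp40", "luger", "mp40", "mp40", "mp40"],
})

# Frames aggregator: average a numeric feature over a sliding window of
# timeframes to get smoother, more robust samples.
smoothed = frames["reaction_time"].rolling(window=3, min_periods=1).mean()

# Discrete variable expander: one-hot encode a discrete variable into
# one column per value, so linear models and classifiers can use it.
expanded = pd.get_dummies(frames["weapon"], prefix="weapon")

print(smoothed.tolist())
print(list(expanded.columns))  # ['weapon_luger', 'weapon_mp40']
```

With the features held in a DataFrame like this, the sklearn-pandas bridge mentioned above becomes mostly a matter of wiring columns to transformers.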

Feel free to propose stuff if you have any other idea of things we should add to the core (or later as additional modules)!


Just some additional notes here to answer your side questions:

Using a multivariate Gaussian/clustered augmented Gaussian algorithm is a good approach, but I guess there are many other possibilities that could be tested, including "deep learning" techniques.

Yes, the multivariate Gaussian was a first, generic approach, but there are much cleverer algorithms nowadays, and it should be easy to try them with scikit-learn. I might also look into whether deep-learning frameworks such as Theano can be used.
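As a rough sketch of how little code the scikit-learn route would require: the feature values below are invented, and these are stock scikit-learn outlier detectors rather than the OACS algorithms. Both are trained on normal data only, matching the anomaly-detection setup described earlier.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Invented features: [reaction time in ms, aim error in degrees]
rng = np.random.default_rng(42)
normal_play = rng.normal([250.0, 5.0], [40.0, 1.5], size=(500, 2))
suspicious  = np.array([[60.0, 0.2]])  # implausibly fast and precise

# Two drop-in candidates to replace the multivariate Gaussian; features
# are standardized for the SVM so the kernel treats them comparably.
detectors = {
    "IsolationForest": IsolationForest(random_state=0),
    "OneClassSVM": make_pipeline(StandardScaler(), OneClassSVM(nu=0.05)),
}
for name, det in detectors.items():
    det.fit(normal_play)  # trained on clean data only
    # predict() returns +1 for inliers and -1 for outliers
    print(name, det.predict(suspicious)[0])
```

Swapping in yet another detector is a one-line change, which is exactly what makes the scikit-learn bridge attractive.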

Your slides seem to show good detection results, so maybe you could expand on what is actually not working well. Or, to reformulate: "Why don’t I see bad results in your slides?"

Because it was a university project, I tried to show mostly the good sides to get a good mark ;-p But you can see the detection is not great by looking at the scatter plots: the data is not linearly separable. I did not hide it, just minimized it for presentation purposes. Everything present in the slides is also present in the software, though; everything works!

Also, reading https://github.com/ioquake/ioq3/pull/265, I guess this could also be used to analyze server demos. If I’m not wrong, our implementation is based on your own code, or is at least very similar.

Yes, it’s my implementation of server-side demos that is used in ET: Legacy, but I don’t think we can analyze demos for anti-cheat purposes, because demos do not reproduce all player commands. In my server-side demos patch, there is a commented-out part that allows recording all this info, but it makes demos so big that it is hardly useful in practice... But it should be possible.

I am also not sure whether the work done by The-Skas could be reused somehow, as CS’s recoil-pattern system is quite specific (each weapon always has the same recoil pattern), but I haven’t had a closer look at the provided code.

About The-Skas: in fact, he already contacted me by email a few months ago and offered to merge his project into mine, since my framework is more general-purpose, while his is an implementation of a specific algorithm (which seems to work well!). For the moment it’s a bit stalled, but we might do that in the future.


PS: I forgot to say that if the results are good enough, I will write and publish a scientific paper in a peer-reviewed journal, and all contributors will be co-authors. I really intend this project to be collaborative, from the core to the end!


Glad to hear back from you!

I’m quite busy myself at the moment, but just drop a post in mid-June so we can plan and coordinate our efforts.

Because it was for a university project, I tried to show mostly the good sides to get a good mark

I kinda knew it haha

if the results are good enough, I will write and publish a scientific paper in a peer-reviewed journal, and all contributors will be co-authors

Quite interesting!


OK, perfect, I’ll post in mid-June. BTW, I did this project at the beginning of my studies in AI and machine learning, so it’s no wonder I did not get great results. That’s why I’m pretty sure we can do a lot better, particularly working together, and now that I have finished my studies and worked on several other projects!