Query-by-Sketch in Automatically Annotated Soccer Matches (Master Thesis, Finished)


Rufus Peter Lobo


Information technology is continuously finding its way into sports. In Europe, this is especially visible in soccer.

 In recent years, leading IT companies like SAP have introduced systems (e.g., SAP Match Insights [1]) for performing offline queries in soccer matches. These systems enable coaches and players to search for specific scenes or events in a set of stored soccer matches. With SportSense [2], the Databases and Information Systems (DBIS) research group of the University of Basel has also developed such a system. SportSense enables the user to perform sketch-based queries, including searches for events in a specific area, motion paths and event cascades.

 The potential use cases for such systems are numerous: coaches can use these systems to discover weak points in existing tactics or to construct new tactics; players can reveal their own strengths and weaknesses in order to improve their performance; goalkeepers can analyze the preferences of potential penalty takers. In fact, these systems are already in use at successful national teams and clubs. For instance, the German national team used SAP Match Insights during the preparation for the FIFA World Cup 2014, which it went on to win [3].

 However, all these systems share one pitfall: they are based on manual annotations. There are companies, such as OPTA, whose employees watch soccer matches and manually annotate events, for example a pass from one position to another at a certain timestamp.

 The DBIS group has developed a real-time soccer analysis system [4] on top of the PAN middleware [5] to automate this annotation process. The main idea of this system is based on the fact that more and more professional soccer teams and soccer academies have started to equip their players and soccer fields with position measuring systems that emit the positions of the players and the ball in real time as continuous sensor data streams. The system consumes these raw position data streams and analyzes them in multiple steps to detect complex (team) events, such as passes or offside traps, and to generate complex (team) aggregates, such as ball possession statistics or heat maps. The results of these analyses, i.e., the detected events and generated aggregates, are then emitted as output data streams and visualized in a web client. The soccer associations are still hesitant and do not yet allow the use of such systems in regular matches of the national leagues (Premier League, Bundesliga, Primera Division, etc.) and the international competitions (Europa League, Champions League, World Cup, etc.). However, in view of recent developments, such as the introduction of goal-line technologies (Hawk-Eye [6] and GoalControl [7]) and the FIFA quality program for tracking systems [8], we expect that this will become possible in the near future.
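 To illustrate how a raw position stream can be turned into an aggregate, the following minimal sketch attributes each ball sample to the nearest player and derives possession shares per team. The sample layout, the function name `possession_share` and the 2 m control distance are assumptions made for this example only; they are not taken from the analysis system [4].

```python
from collections import Counter
from math import hypot

# Hypothetical sample layout: (ts_ms, (ball_x, ball_y), {player_id: (team, x, y)})
def possession_share(samples, max_dist=2.0):
    """Return, per team, the fraction of attributed samples in which it controls the ball."""
    counts = Counter()
    for _ts, (bx, by), players in samples:
        # Find the player closest to the ball in this sample.
        team, dist = min(
            ((t, hypot(px - bx, py - by)) for t, px, py in players.values()),
            key=lambda item: item[1],
        )
        # Count the sample only if that player is close enough to control the ball.
        if dist <= max_dist:
            counts[team] += 1
    total = sum(counts.values())
    return {team: n / total for team, n in counts.items()} if total else {}

samples = [
    (0,   (10.0, 5.0), {"a7": ("A", 10.5, 5.0), "b3": ("B", 20.0, 5.0)}),
    (100, (12.0, 5.0), {"a7": ("A", 12.2, 5.1), "b3": ("B", 19.0, 5.0)}),
    (200, (18.0, 5.0), {"a7": ("A", 13.0, 5.0), "b3": ("B", 18.3, 5.0)}),
    (300, (30.0, 5.0), {"a7": ("A", 13.0, 5.0), "b3": ("B", 18.0, 5.0)}),  # ball is free
]
shares = possession_share(samples)
```

 In this toy input, team A controls two of the three attributed samples and team B one; the last sample is ignored because no player is within the assumed control distance.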

 The real-time analysis and visualization system is a very helpful tool for improving the live feedback for coaches during the match and for providing television broadcast studios with live statistics. However, the current system generates and visualizes the analysis results only in real time. So far, there is no support for storing the generated output data streams or for performing offline queries such as those possible in SportSense.

 The goal of this Master Thesis is to solve this problem by developing a new system for performing offline queries that is based on automatic annotations (in the form of data stream elements) instead of manual annotations. The new system has to store the raw position sensor data streams as well as the output data streams generated by the real-time analysis system in a database, enable the user to specify offline queries (such as those supported by SportSense) by sketch, and visualize the query results in an understandable and appealing way. More precisely, three tasks have to be solved in the course of this Master Thesis.

 First, the project includes the design of a database for storing the raw position data stream elements consumed and the output data stream elements generated by the real-time analysis system. For this purpose, the workload characteristics (r/w ratio, etc.) and the requirements for the database (spatial query support, relational vs. schema-less, etc.) have to be defined first. Then, the database system that fits these characteristics and requirements best has to be selected. Moreover, storing the data stream elements in the database requires defining a schema that is generic enough to be extendable by new event/aggregate types but also specific enough to support queries. Furthermore, indices have to be specified to speed up query execution. In addition, scalability has to be taken into account in the design, as the new system has to be able to support queries on a huge set of stored soccer matches.
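 One way to balance a generic and a query-friendly schema is to keep the attributes shared by all data stream elements in typed, indexed columns and to put type-specific attributes into a JSON payload. The sketch below uses SQLite only because it ships with Python; choosing the actual database system is part of the thesis, and the table, column and stream names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stream_element (
    id        INTEGER PRIMARY KEY,
    match_id  TEXT    NOT NULL,
    stream    TEXT    NOT NULL,   -- e.g. 'rawPosition', 'pass', 'heatMap' (assumed names)
    ts_ms     INTEGER NOT NULL,   -- timestamp in milliseconds
    x         REAL,               -- main position, if the element has one
    y         REAL,
    payload   TEXT    NOT NULL    -- type-specific attributes as JSON
);
-- Indices for the two access paths assumed here: by time and by position.
CREATE INDEX idx_stream_time ON stream_element (stream, match_id, ts_ms);
CREATE INDEX idx_stream_pos  ON stream_element (stream, x, y);
""")

def store(match_id, stream, ts_ms, x, y, **attrs):
    """Insert one data stream element; unknown attributes go into the payload."""
    conn.execute(
        "INSERT INTO stream_element (match_id, stream, ts_ms, x, y, payload) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (match_id, stream, ts_ms, x, y, json.dumps(attrs)),
    )

# A new event type needs no schema change, only a new 'stream' value.
store("m1", "pass", 1000, 10.0, 5.0, from_player="a7", to_player="a9")
store("m1", "shot", 2000, 48.0, 1.0, player="a9", on_target=True)

rows = conn.execute(
    "SELECT ts_ms, payload FROM stream_element WHERE stream = ? ORDER BY ts_ms",
    ("pass",),
).fetchall()
```

 The trade-off of this layout is that attributes inside the payload are not indexed; attributes that are queried frequently would have to be promoted to columns of their own.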

 Second, the project requires the implementation of a client that fetches all raw position sensor data stream elements and all output data stream elements from the real-time analysis system and stores them in the database. Fetching the data stream elements must not harm the real-time guarantees of the analysis system. Similarly, the process of continuously adding new data stream elements should not affect the offline query system. That is, the system should still be able to perform offline queries on already stored matches while the data stream elements of a new match are being stored, and this should be possible without a noticeable loss in performance. Hence, mechanisms for both fetching and storing the data stream elements efficiently have to be developed.
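 A common way to decouple fetching from storing is a bounded in-memory buffer that the receiving side fills without blocking, while a separate writer thread drains it and stores the elements in batches. The following is a minimal sketch of that idea and not the mechanism of the actual system; the table layout, batch size and buffer capacity are assumptions, and SQLite in WAL mode merely stands in for a database that lets readers query while a writer appends.

```python
import os
import queue
import sqlite3
import tempfile
import threading

STOP = object()  # sentinel that tells the writer to flush and exit

def writer(buffer, db_path, batch_size=500):
    """Drain the buffer and store elements in batches, one transaction per batch."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")  # readers are not blocked by the writer
    conn.execute("CREATE TABLE IF NOT EXISTS element (stream TEXT, ts_ms INTEGER, payload TEXT)")
    batch, running = [], True
    while running:
        item = buffer.get()              # blocks only this writer thread
        if item is STOP:
            running = False
        else:
            batch.append(item)
        # Flush when the batch is full, the buffer ran dry, or we are shutting down.
        if batch and (len(batch) >= batch_size or buffer.empty() or not running):
            with conn:                   # commits the whole batch at once
                conn.executemany("INSERT INTO element VALUES (?, ?, ?)", batch)
            batch = []
    conn.close()

db_path = os.path.join(tempfile.mkdtemp(), "matches.db")
buffer = queue.Queue(maxsize=10_000)     # bounded, so memory use stays predictable
thread = threading.Thread(target=writer, args=(buffer, db_path))
thread.start()

# Receiving side: put_nowait never blocks; it raises queue.Full if the writer falls behind.
for ts in range(0, 1000, 100):
    buffer.put_nowait(("rawPosition", ts, "{}"))

buffer.put(STOP)
thread.join()
check = sqlite3.connect(db_path)
stored = check.execute("SELECT COUNT(*) FROM element").fetchone()[0]
check.close()
```

 How to react to a full buffer (drop, spill to disk, or apply back-pressure) is a design decision that depends on the guarantees required from the real-time side.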

 Third, the project requires the implementation of a web-based GUI. This GUI should enable the user to define offline queries by sketching, in a similar way as in SportSense. These graphical queries should be automatically translated into the query language of the database. Moreover, the results from the database have to be visualized for the user in an appealing way. An important requirement for the GUI is that it should be usable and understandable for domain experts like soccer coaches and players.
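 As a simple illustration of translating a sketch into a database query, the following example turns a rectangle drawn on the pitch into a parameterized SQL query for events inside that area. It assumes a relational store with a hypothetical `stream_element` table and pitch coordinates in metres; the function name `region_query` is likewise invented for this sketch.

```python
import sqlite3

def region_query(stream, corner_a, corner_b):
    """Translate a sketched rectangle (two opposite corners) into SQL plus parameters."""
    (xa, ya), (xb, yb) = corner_a, corner_b
    # The user may drag in any direction, so normalize the corners first.
    x_min, x_max = sorted((xa, xb))
    y_min, y_max = sorted((ya, yb))
    sql = (
        "SELECT match_id, ts_ms, x, y FROM stream_element "
        "WHERE stream = ? AND x BETWEEN ? AND ? AND y BETWEEN ? AND ? "
        "ORDER BY match_id, ts_ms"
    )
    return sql, (stream, x_min, x_max, y_min, y_max)

# Tiny in-memory data set to run the generated query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stream_element (match_id TEXT, stream TEXT, ts_ms INTEGER, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO stream_element VALUES (?, ?, ?, ?, ?)",
    [
        ("m1", "shot", 2000, 48.0, 1.0),   # inside the sketched area
        ("m1", "shot", 5000, 20.0, 0.0),   # outside the sketched area
        ("m1", "pass", 1000, 47.0, 2.0),   # inside, but a different event type
    ],
)

# Rectangle dragged from the lower right to the upper left in front of one goal.
sql, params = region_query("shot", (52.5, -20.0), (36.0, 20.0))
hits = conn.execute(sql, params).fetchall()
```

 Passing the coordinates as query parameters rather than formatting them into the SQL string keeps the translation independent of how the GUI encodes the sketch.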

 The new system has to support all features of SportSense, i.e., searches for events in a specific area, motion paths and event cascades. In addition, new query types and result presentations have to be designed, implemented and evaluated. Since offline queries, in contrast to the real-time analyses, are not restricted w.r.t. their execution time, it is possible to define offline queries that perform analyses which are infeasible in real time for performance reasons.
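 An event cascade query can be understood as a search for a sequence of event types that occur in a given order within a limited time. The sketch below shows one possible matching strategy on a time-ordered event list; the function name `find_cascades`, the event type names and the maximum gap between consecutive events are assumptions for illustration and do not describe how SportSense evaluates such queries.

```python
def find_cascades(events, pattern, max_gap_ms):
    """Return timestamp tuples of event sequences that follow `pattern` in order.

    events: list of (ts_ms, event_type), sorted by ts_ms.
    Consecutive matched events may be at most max_gap_ms apart;
    unrelated events in between are skipped.
    """
    matches = []
    for i, (ts, kind) in enumerate(events):
        if kind != pattern[0]:
            continue
        chain, last_ts, j = [ts], ts, i + 1
        for wanted in pattern[1:]:
            # Next event of the wanted type after the previously matched one.
            nxt = next((k for k in range(j, len(events)) if events[k][1] == wanted), None)
            if nxt is None or events[nxt][0] - last_ts > max_gap_ms:
                break
            last_ts = events[nxt][0]
            chain.append(last_ts)
            j = nxt + 1
        else:
            # The loop finished without a break, so the whole pattern was matched.
            matches.append(tuple(chain))
    return matches

events = [
    (0, "pass"), (1500, "pass"), (3000, "cross"),
    (4200, "shot"), (20000, "pass"), (40000, "shot"),
]
cascades = find_cascades(events, ("pass", "cross", "shot"), max_gap_ms=5000)
```

 Note that every event of the first type starts its own search, so both passes in the example yield a match that ends in the same cross and shot; whether such overlapping results should be merged is a question for the result presentation.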

 In the course of the Master Thesis preparation, the goals and tasks have to be defined in more detail in a written proposal. The proposal should also cover the limitations of the planned system and a concrete work plan (Gantt chart).

  1. "SAP Performance Insights Solutions". Online: http://go.sap.com/solution/industry/sports-entertainment/match-insights.html (10.11.2016)
  2. Al Kabary, Ihab, and Heiko Schuldt. "Towards sketch-based motion queries in sports videos." Multimedia (ISM), 2013 IEEE International Symposium on. IEEE, 2013.
  3. "YouTube Video: DFB News from Brazil: Episode 1 Match Insights". Online: https://www.youtube.com/watch?v=RcqA3qqBaPc (10.11.2016)
  4. Brix, Frederik. "Complex Event Detection in Real Time Data Streams." Bachelor’s thesis. University of Basel (2016).
  5. Probst, Lukas, Ivan Giangreco, and Heiko Schuldt. "Pull-based Real-Time Complex Event Detection in Multiple Data Streams–the PAN Approach." Technical Report. University of Basel (2016).
  6. "Hawk-Eye Goal Line Technology". Online: http://www.hawkeyeinnovations.co.uk/products/ball-tracking/goal-line-technology (10.11.2016)
  7. "GoalControl". Online: http://www.goalcontrol.de/en/ (10.11.2016)
  8. "FIFA and IFAB to develop global standard for electronic performance and tracking systems". Online: http://quality.fifa.com/en/News/101/ (10.11.2016)

Start / End Dates

2017/02/01 - 2017/07/31


Research Topics