Automated Schema Discovery on Heterogeneous Data Stores (Bachelor Thesis, Ongoing)

Author

Roman Ostermiller

Description

Polypheny currently supports a wide range of data stores, which makes it well suited to data science applications. Connecting to a data source is relatively straightforward, but the process is tedious: users must manually specify every entity to be mapped into Polypheny’s schema, which typically means first connecting to the data store directly and examining its schema.

This project aims to streamline the process by introducing an automated schema discovery feature into Polypheny. When a user connects to a data store, Polypheny will automatically extract and display the current schema, allowing the user to easily select which entities should be incorporated into Polypheny’s logical schema.
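The discover-then-select flow described above can be illustrated with a minimal sketch. It is not Polypheny's implementation: an in-memory SQLite database stands in for the connected data store, and the function names `discover_schema` and `select_entities` are hypothetical. With a JDBC source, the equivalent information would typically come from the driver's metadata interface instead.

```python
import sqlite3


def discover_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for user tables."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    schema = {}
    for table in names:
        # Names starting with "sqlite_" are reserved for SQLite's internal tables.
        if table.startswith("sqlite_"):
            continue
        # Quote the identifier, since PRAGMA arguments cannot be bound as parameters.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


def select_entities(schema, chosen):
    """Keep only the entities the user picked (hypothetical selection step)."""
    return {name: schema[name] for name in chosen if name in schema}


# Stand-in data store with two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")

schema = discover_schema(conn)               # what would be shown to the user
selected = select_entities(schema, ["orders"])  # what the user chose to map
```

Here `schema` holds every discovered table with its columns, while `selected` contains only the entities that would be incorporated into the logical schema.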

In addition to developing the necessary components for schema recognition and presentation, the project also involves expanding the range of data store adapters for JDBC-supported systems. Furthermore, the candidate will investigate and implement schema discovery methods for non-relational data stores.
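For non-relational stores there is usually no catalog to query, so one common approach is to infer a schema from sampled records. The sketch below shows that idea for JSON-like documents; it is an illustrative assumption rather than the method chosen in this project, and the name `infer_document_schema` and the sample data are hypothetical.

```python
from collections import defaultdict


def infer_document_schema(documents):
    """Map each dotted field path to the sorted type names observed for it."""
    observed = defaultdict(set)

    def visit(value, path):
        if isinstance(value, dict):
            # Descend into nested documents, extending the dotted path.
            for key, child in value.items():
                visit(child, f"{path}.{key}" if path else key)
        else:
            # Leaf value (lists are treated as leaves in this sketch).
            observed[path].add(type(value).__name__)

    for doc in documents:
        visit(doc, "")
    return {path: sorted(types) for path, types in observed.items()}


# Hypothetical sample drawn from a document collection.
sample = [
    {"_id": 1, "name": "Ada", "address": {"city": "Basel"}},
    {"_id": 2, "name": "Bob", "age": 41},
    {"_id": "x3", "name": "Cy", "address": {"city": "Bern", "zip": 3000}},
]

inferred = infer_document_schema(sample)
```

Fields that appear with more than one type (here `_id`) or only in some documents (here `age`) are exactly the cases a user would need to review before mapping the collection into a logical schema.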

Start / End Dates

2025/03/17 - 2025/07/16

Supervisors

Research Topics