
Automatic data collection and modeling

Modeling IT architecture is a complex, time-consuming, and error-prone task. However, many systems produce information that can be used to automate modeling. Early studies show that this approach is feasible if certain obstacles can be overcome. Often more than one source is needed to cover the data requirements of an IT architecture model, and using multiple sources means that heterogeneous data needs to be merged. Moreover, the same collection of data might be useful for creating more than one kind of model for decision support.

IT architecture is constantly changing, and data sources provide information that can deviate from reality to some degree. Sources can vary in accuracy (e.g. timeliness and coverage) and representation (e.g. data syntax and file format), and their semantics can be inconsistent. Integrating heterogeneous data from different sources therefore has to handle the data quality problems of those sources. This can be done with probabilistic models. In the field of truth discovery, such models have been developed to track data source trustworthiness, which helps resolve conflicts and makes quality issues manageable for automatic modeling.
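To make the idea of tracking source trustworthiness concrete, the following is a minimal sketch of a generic iterative truth discovery loop. It is not the algorithm used in this project. The function name `discover_truth`, the source names (`scanner`, `cmdb`, `logs`), the attribute keys, and the neutral starting trust of 0.5 are all assumptions made for illustration.

```python
from collections import defaultdict


def discover_truth(claims, iterations=10):
    """Resolve conflicting claims given as (source, attribute, value) tuples.

    Illustrative sketch only: trust and value confidence are re-estimated
    from each other for a fixed number of rounds.
    """
    sources = {source for source, _, _ in claims}
    trust = {source: 0.5 for source in sources}  # assumed neutral starting trust
    confidence = {}

    for _ in range(iterations):
        # Each source votes for its claimed value, weighted by its current trust.
        votes = defaultdict(lambda: defaultdict(float))
        for source, attribute, value in claims:
            votes[attribute][value] += trust[source]

        # Normalize the votes per attribute so the confidences sum to 1.
        confidence = {
            attribute: {
                value: weight / sum(values.values())
                for value, weight in values.items()
            }
            for attribute, values in votes.items()
        }

        # A source's new trust is the mean confidence of the values it claimed.
        totals = defaultdict(float)
        counts = defaultdict(int)
        for source, attribute, value in claims:
            totals[source] += confidence[attribute][value]
            counts[source] += 1
        trust = {source: totals[source] / counts[source] for source in sources}

    # The most confident value per attribute is taken as the merged result.
    truths = {
        attribute: max(values, key=values.get)
        for attribute, values in confidence.items()
    }
    return truths, trust


# Hypothetical claims from three data sources about the operating system of hosts.
claims = [
    ("scanner", "host-a.os", "Linux"),
    ("cmdb", "host-a.os", "Windows"),
    ("logs", "host-a.os", "Linux"),
    ("scanner", "host-b.os", "Linux"),
    ("cmdb", "host-b.os", "Linux"),
    ("logs", "host-b.os", "Linux"),
    ("scanner", "host-c.os", "Windows"),
    ("cmdb", "host-c.os", "Linux"),
]

truths, trust = discover_truth(claims)
```

In this made-up data set, `scanner` and `cmdb` disagree about `host-c.os` with no third source to break the tie. Because `cmdb` was outvoted on `host-a.os`, its trust ends up lower than that of `scanner`, so the value reported by `scanner` is selected for `host-c.os`.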

We build upon previous research in modeling automation. In this project we propose a framework that merges data from multiple sources with a truth discovery algorithm to create multiple IT architecture models. The usefulness of the proposed framework is demonstrated in a study in which models are created for three tools: Archi, securiCAD, and EMFTA.

Most of the files you need to get started with the framework are available here.

Link to DropBox, sharing files