Matching and clustering core for BIMLIB 2.0

We developed the core of the BIMLIB 2.0 product using machine learning algorithms.
BIMLIB 2.0 is a CAD-independent platform that helps estimate the cost of construction projects.


I would like to thank you for your contribution to the development of the BIMLIB platform for comprehensive predictive assessment based on neural network technologies. BIMLIB is pleased to be a partner of Apro and looks forward to further fruitful cooperation.

Anton Reshetnikov


Results Achieved:

  • Object matching with 95% accuracy based on object metadata
  • Development of a model training mechanism and the required API
  • Development of a BIM class clustering algorithm


Python, ML / AI


BIMLIB 2.0 is a digital platform for quick calculation of building materials and equipment costs. The solution takes a building project created in any CAD tool and extracts its metadata.

Next, using machine learning algorithms, BIMLIB 2.0 matches the materials specification against the BIM catalog. Catalogs from building materials sellers are matched against the BIM catalog in the same way.


As a result, building costs are calculated semi-automatically within a few minutes.


What the client required

The client asked us to develop an ML/AI-based product core that matches objects by their descriptions, metadata, and attributes. We were also asked to develop an API on top of the core for training and using the model.


What we implemented for our client

The model handles placing, storing, and accessing the product tree through the API, taking the model's non-trivial structure into account.


The main elements of the model:

  • Classes – groups of products united by related categories, e.g. lamps.
  • Categories – specific product groups, e.g. table lamps.
  • Attributes – characteristics of a specific product item, e.g. supply voltage.
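The three elements above can be pictured as a small tree. The following is only a minimal sketch for illustration, not the BIMLIB data model: the names ProductClass, Category, Attribute and add_category are hypothetical, and the actual platform stores product data in a relational database.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Attribute:
    """A characteristic of a product item, e.g. supply voltage (hypothetical type)."""
    name: str
    value: str = ""


@dataclass
class Category:
    """A specific product group, e.g. table lamps (hypothetical type)."""
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class ProductClass:
    """A group of related categories, e.g. lamps (hypothetical type)."""
    name: str
    categories: Dict[str, Category] = field(default_factory=dict)

    def add_category(self, category: Category) -> Category:
        # Reuse an existing category with the same name instead of duplicating it.
        return self.categories.setdefault(category.name, category)


lamps = ProductClass("lamps")
table_lamps = lamps.add_category(
    Category("table lamps", [Attribute("supply voltage", "230 V")])
)
# Adding a category with the same name again returns the stored one.
same = lamps.add_category(Category("table lamps"))
```

Keying categories by name is one simple way to keep a class free of duplicate categories; the real model relies on clustering rather than exact name equality.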

The model has the following characteristics:

  • Extensibility. The model can be extended by adding new categories and attributes without any limits.
  • Dynamism. The class structure is dynamic and is updated each time the model is updated.
  • Uniqueness. The model eliminates duplicate classes, categories, and attributes and, as a result, duplicate product items. This is achieved through intelligent clustering of product items. New classes, categories, and attributes are added to the model only when a product cannot be clustered within the model's current state.
  • Flexibility. The model has a flexible structure with no predefined immutable elements, so it can accommodate any item.

Data clustering

Data clustering extends or updates the generic data model with new classes, categories, product attributes, and the relationships between them. Information about the products themselves is stored in a relational database associated with the model.

The clustering process includes several stages:

  1. Normalization of attribute names: the description of each attribute (length, height, width, etc.) is converted to its normal form (the standard or most general form of that attribute), taking all existing model attributes into account.
  2. Product placement in the model according to the normalized attributes, taking all existing categories of the model into account. The product is placed in several of the most probable branches of the product tree, and the results are then combined by nonlinear concatenation to determine the preferred branch.
  3. Assignment of the object to product classes based on the similarity of attributes and attribute groups, taking all classes of the model into account.
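Stage 1 can be illustrated with a short sketch. It assumes that "normal form" means the closest already known attribute name, and it uses fuzzy string matching from Python's standard difflib module as a stand-in for whatever the production core does. The function name normalize_attribute and the 0.6 cutoff are hypothetical.

```python
import difflib
from typing import List, Tuple


def normalize_attribute(raw_name: str, known_attributes: List[str],
                        cutoff: float = 0.6) -> Tuple[str, bool]:
    """Map a raw attribute name to a known one, or report it as new.

    Returns (normalized_name, is_new). The cutoff is an illustrative value.
    """
    # Basic cleanup: lowercase, underscores to spaces, collapse whitespace.
    cleaned = " ".join(raw_name.lower().replace("_", " ").split())
    # Pick the single closest known attribute, if it is similar enough.
    matches = difflib.get_close_matches(cleaned, known_attributes, n=1, cutoff=cutoff)
    if matches:
        return matches[0], False
    # Nothing similar enough: the model would be extended with a new attribute.
    return cleaned, True


known = ["length", "height", "width", "supply voltage"]
misspelled = normalize_attribute("Lenght", known)
underscored = normalize_attribute("Supply_Voltage", known)
unknown = normalize_attribute("color temperature", known)
```

In this sketch a misspelled or differently formatted name is folded into an existing attribute, while a name with no close match is flagged as a candidate for extending the model.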

This module decides whether the model needs to be extended with new attributes, categories, or classes. The decision for each model element is made independently.

Whether attributes, categories, and product classes belong to existing model elements is assessed with the BM25 probabilistic ranking algorithm.
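For readers unfamiliar with BM25, here is a minimal pure-Python sketch of the standard Okapi BM25 scoring formula. It is not the BIMLIB implementation: the whitespace tokenization, the parameter values k1=1.5 and b=0.75, the non-negative IDF variant, and the function name bm25_scores are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, documents: List[str],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25."""
    if not documents:
        return []
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # IDF variant with +1 inside the log, which keeps it non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency saturated by k1 and normalized by document length.
            tf = term_freq[term]
            denom = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical category descriptions and a product description to place.
categories = [
    "table lamp desk lighting",
    "ceiling lamp pendant lighting",
    "ceramic floor tile",
]
scores = bm25_scores("desk table lamp", categories)
best = scores.index(max(scores))
```

Ranking candidate model elements by such a score gives an ordered list of likely placements. A low best score could be read as a signal that no existing element fits, though the threshold for that decision is not described in the source.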

The same algorithm is used for duplicate checks, including detection of implicit duplicates, which ensures the uniqueness of the model.

The quality of the clustering result is improved by applying the stable matching algorithm (the work recognized by the 2012 Nobel Prize in Economics).
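The source does not describe how stable matching is wired into the clustering, so the following is only a sketch of the classic Gale-Shapley procedure under stated assumptions: items "propose" to classes in order of preference, each class holds at most one item, and every class ranks every item. The function name stable_matching and the sample data are hypothetical.

```python
from typing import Dict, List


def stable_matching(proposer_prefs: Dict[str, List[str]],
                    receiver_prefs: Dict[str, List[str]]) -> Dict[str, str]:
    """Gale-Shapley: return a stable one-to-one matching proposer -> receiver.

    Assumes every receiver ranks every proposer that may propose to it.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    engaged: Dict[str, str] = {}                  # receiver -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p              # receiver was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p              # receiver prefers the new proposer
            free.append(current)
        else:
            free.append(p)              # rejected: p will try its next choice

    return {p: r for r, p in engaged.items()}


# Hypothetical preferences, e.g. derived from similarity scores.
item_prefs = {"item_a": ["lamps", "tiles"], "item_b": ["lamps", "tiles"]}
class_prefs = {"lamps": ["item_b", "item_a"], "tiles": ["item_a", "item_b"]}
matching = stable_matching(item_prefs, class_prefs)
```

The result is stable in the sense that no item and class would both prefer each other over their assigned partners, which is what makes the approach attractive when two ranked lists have to be reconciled.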


Figure 1 shows:

N – number of BIMLIB classes;

[a] – stable matching algorithm, “Stable matching: Theory, evidence, and practical design”.



The ML core developed by APRO made it possible to create the BIMLIB 2.0 product, in which users work with a digital specification: a document from which they can update product cost and availability data within a few minutes (previously this took a team of several people a few weeks). Using machine learning algorithms, users can combine and compare catalogs with different structures, merge a manufacturer's catalog with a supplier's catalog, and map the result to the designer's classifier.
