AI Project Reporting Example
“The Cats and Dogs Classification”
Every project requires adequate reporting. Here at APRO, we know how important it is for our clients and partners, and for the success of the project itself.
A good report is an essential part of the OpenX method: it is a way to share information, keep our clients updated and establish trust.
Let me show you an example of a Project State Report for an abstract machine learning project about the classification of cats and dogs.
The goal of the sprint: to reduce the level of misclassification by 5%.
Period: 13 Apr 2021 – 24 Apr 2021.
The results have been obtained with a new ANN (artificial neural network) model, which was trained on the new data set. The data set is balanced and consists of:
- 4000 augmented examples of cats (from 500 original images);
- 4000 augmented examples of dogs (from 500 original images).
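The ratio above (4000 augmented examples from 500 originals) works out to 8 variants per original image. The report does not say which augmentations were used, so the following is only a minimal sketch under one assumption: that each image is expanded into its 4 rotations, each with and without a mirror flip. The names `rotate90`, `flip_horizontal` and `augment` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # a tiny grayscale image as rows of pixel values


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def augment(img: Image) -> List[Image]:
    """Return 8 variants: 4 rotations, each with and without a mirror flip.

    Assumption for illustration only; the report does not name its augmentations.
    """
    variants = []
    current = img
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


originals_per_class = 500
variants_per_image = len(augment([[1, 2], [3, 4]]))
print(originals_per_class * variants_per_image)  # 4000
```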
Implementing the model decreased misclassification by 6.5 percentage points (from 12.2% to 5.7%) by the Project metric.
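The figures above can be checked with a few lines of arithmetic. This sketch assumes that the sprint goal of 5% is meant in percentage points, the same way the reported 6.5% is the difference between 12.2% and 5.7%. It also shows the relative reduction for comparison; the function names are illustrative.

```python
def absolute_reduction(before_pct: float, after_pct: float) -> float:
    """Reduction in percentage points."""
    return before_pct - after_pct


def relative_reduction(before_pct: float, after_pct: float) -> float:
    """Reduction as a share of the starting error rate, in percent."""
    return (before_pct - after_pct) / before_pct * 100.0


# Values from the report; the goal is read as percentage points (assumption).
before, after, goal_points = 12.2, 5.7, 5.0
points = absolute_reduction(before, after)
print(f"absolute: {points:.1f} points")                        # absolute: 6.5 points
print(f"relative: {relative_reduction(before, after):.1f}%")   # relative: 53.3%
print("goal met:", points >= goal_points)                      # goal met: True
```

Stating which of the two readings a report uses avoids ambiguity: the same change is 6.5 points in absolute terms but roughly half of the original error in relative terms.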
During the implementation, we faced the issue of overfitting. We plan to solve the issue during the next sprint and expect the resulting increase in the level of quality to exceed 4%.
A more detailed description of the work done can be found below.
Process: New ANN Model
The new ANN model (4) was integrated into the updated structure of the Cats and Dogs Detection System (see the image below).
The current version of the system consists of the following blocks:
- (1) the input data layer (API wrapper to get an image via the API);
- (2) the input image color normalization module;
- (3) the input image geometric distortions normalization model;
- (4) the new cats and dogs classification model;
- (5) the visualization module;
- (6) the output result layer (API wrapper to send an image via the API).
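The block structure listed above can be pictured as a chain of stages applied in order. The sketch below is purely illustrative: every function is a hypothetical placeholder, the classifier is a brightness threshold standing in for the ANN, and the API wrappers and visualization module are left out.

```python
from typing import Callable, List

Image = List[List[float]]
Stage = Callable[[Image], Image]


def normalize_color(img: Image) -> Image:
    """Placeholder color normalization: scale pixel values into [0, 1]."""
    peak = max(max(row) for row in img) or 1.0  # avoid dividing by zero
    return [[px / peak for px in row] for row in img]


def normalize_geometry(img: Image) -> Image:
    """Placeholder geometric normalization: returns the image unchanged."""
    return img


def classify(img: Image) -> str:
    """Placeholder classifier: a brightness threshold stands in for the ANN."""
    mean = sum(sum(row) for row in img) / sum(len(row) for row in img)
    return "cat" if mean < 0.5 else "dog"


def run_pipeline(img: Image, stages: List[Stage]) -> str:
    """Apply the preprocessing stages in order, then classify the result."""
    for stage in stages:
        img = stage(img)
    return classify(img)


label = run_pipeline([[0, 64], [128, 255]], [normalize_color, normalize_geometry])
print(label)
```

Keeping each block behind the same image-in, image-out signature is one way to let a new classification model replace the old one without touching the neighbouring modules.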
Thus, the model has a unique custom architecture and is well suited to the business needs of the project.
During the sprint, we found that the new model overfits: the quality of results during training was better than during evaluation.
This issue did not affect the goal of the sprint.
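A simple way to make the overfitting symptom measurable is to track the gap between training and evaluation quality. The sketch below assumes accuracy as the quality measure and a tolerance of 0.02. Both are assumptions, and the numbers in the usage line are illustrative only, since the report does not give training results.

```python
def generalization_gap(train_accuracy: float, eval_accuracy: float) -> float:
    """Difference between training and evaluation accuracy."""
    return train_accuracy - eval_accuracy


def is_overfitting(train_accuracy: float, eval_accuracy: float,
                   tolerance: float = 0.02) -> bool:
    """Flag overfitting when training beats evaluation by more than tolerance."""
    return generalization_gap(train_accuracy, eval_accuracy) > tolerance


# Illustrative numbers only; they are not taken from the report.
print(is_overfitting(train_accuracy=0.99, eval_accuracy=0.943))  # True
```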
We think the issue has two causes:
- The size of the data set, owing to limited data augmentation. We still do not use all the possible augmentation approaches.
- Non-optimal hyperparameters (macro parameters) of the model, because we have not yet optimized them in depth.
The possible solution is to improve the data augmentation algorithm, which should increase the number of examples to at least 8000 items per class, and to perform a deep optimization of the hyperparameters of the model.
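Both parts of the proposed solution can be sketched in a few lines. The first helper computes how many variants per original image the 8000-item target implies, assuming the number of originals stays at 500 per class. The second is a plain exhaustive grid search, which is only one of several possible optimization strategies. The parameter names `learning_rate` and `dropout` and the function `toy_score` are hypothetical stand-ins, not the project's actual parameters or metric.

```python
import math
from itertools import product
from typing import Callable, Dict, Iterable, Tuple


def variants_needed(target_per_class: int, originals_per_class: int) -> int:
    """Augmented variants required per original image to reach the target."""
    return math.ceil(target_per_class / originals_per_class)


def grid_search(grid: Dict[str, Iterable],
                score: Callable[[Dict], float]) -> Tuple[Dict, float]:
    """Try every combination in the grid and keep the best-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_score(params: Dict) -> float:
    """Stand-in for evaluation quality; a real run would train and validate a model."""
    return -abs(params["learning_rate"] - 0.01) - abs(params["dropout"] - 0.3)


grid = {"learning_rate": [0.1, 0.01, 0.001], "dropout": [0.1, 0.3, 0.5]}
best, _ = grid_search(grid, toy_score)
print(variants_needed(8000, 500))  # 16
print(best)                        # {'learning_rate': 0.01, 'dropout': 0.3}
```

In a real sprint the score should be computed on evaluation data rather than training data, so that the search does not reward the same overfitting it is meant to reduce.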
The main results of the sprint:
- The new ANN classification model was developed and embedded in the system.
- The use of the new AI data processing model decreased misclassification by 6.5 percentage points, to 5.7%, by the Quality Metric.
The goal of the sprint was successfully achieved. But the new model has overfitting: the quality of results during the training was