Michael michael-bmstu

Hi there, I'm Michael

ML Engineer 🇷🇺

Interests: Mathematics 👨‍🎓, competitive Data Science 🥇, cooking 👨‍🍳 and boxing 🥊

📚 Education

Bachelor’s Degree in Circuit Design and Electronics

Institution: Bauman Moscow State Technical University (BMSTU)
Completion Date: 2024

🛠 Technology stack

Languages: Python, C/C++
Frameworks & Libraries: PyTorch, Torchvision, Transformers, Keras, Scikit-learn, CatBoost, OpenCV, Streamlit, Gradio, FastAPI
Tools & Platforms: VS Code, Git, Airflow, MLflow, Docker, DVC, AWS MinIO S3

💻 Professional activity

Classic Machine Learning (ML)
Data Analysis (DA)
Computer Vision (CV)
Optical Character Recognition (OCR)
Prompt engineering
LLM engineering

💼 My projects and achievements

LLM IFRS financial reporting parser, 2025

Parser for PDF files of IFRS financial statements. An LLM finds the values of key IFRS reporting indicators in the document, such as revenue, net profit, assets, and capital.

Technology stack used: FastAPI, Gradio, LangChain

Gradio interface for clustering recommendation system, 2025

Implementation of a web interface based on the Gradio library for a recommender system.

Technology stack used: FastAPI, Docker, DVC, Gradio, Scikit-learn, Pandas

Streamlit interface for RuCode 2024 task "Housing Issue", 2025

Implementation of a web interface based on the Streamlit library for solving the "Housing Issue" task of the artificial intelligence track of the RuCode 2024 festival.

Technology stack used: Streamlit, Scikit-learn, CatBoost, Pandas

RuCode: Housing Issue (2nd place), 2024

This repository presents a solution to the task "Housing Issue" of the artificial intelligence track of the RuCode 2024 festival.

It describes the basic data manipulations (NaN filling, EDA) performed to achieve the best result, as well as hyperparameter selection and training of the CatBoostRegressor model.
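As an illustration of the NaN-filling step, here is a minimal sketch. The column names and values are hypothetical, and median/constant imputation is an assumption rather than the strategy actually used in the repository.

```python
import numpy as np
import pandas as pd

# Hypothetical housing data; the real dataset's columns are not shown here.
df = pd.DataFrame({
    "area_m2": [42.0, np.nan, 65.5, 80.0],
    "district": ["north", None, "south", "north"],
    "price": [5.1, 6.3, 7.8, 9.9],
})

# Split columns by type so each group gets a suitable fill value.
numeric_cols = df.select_dtypes(include="number").columns
categorical_cols = df.columns.difference(numeric_cols)

filled = df.copy()
# Numeric gaps: replace with the column median (robust to outliers).
filled[numeric_cols] = filled[numeric_cols].fillna(filled[numeric_cols].median())
# Categorical gaps: replace with an explicit placeholder category.
filled[categorical_cols] = filled[categorical_cols].fillna("missing")
```

The filled frame can then be passed to a regressor; keeping an explicit "missing" category lets the model treat absence as a signal instead of discarding the row.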

Technology stack used: Scikit-learn, CatBoost, Pandas, Matplotlib, Optuna

DLS x ecom.tech workshop (20th place), 2024

This repository presents a top-20 solution to a multilabel classification problem using CatBoost (ML) and BERT (DL) models. The competition was held at the DLS and ecom.tech workshop.

Technology stack used: PyTorch, Transformers, CatBoost, Pandas, Matplotlib

Clustering recommendation system, 2024

One of the homework assignments for the 7-bit machine learning course was to develop a recommender system.

The notebook covers cleaning the data of outliers, building a one-hot table for training the KMeans clustering model, training the model, and selecting the hyperparameter k (number of clusters). Finally, a recommendation algorithm was implemented based on user ratings for each genre and viewing history.
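Selecting k can be sketched as follows. This is a minimal example under stated assumptions: the data is synthetic (three well-separated blobs rather than the notebook's one-hot table), and the silhouette score is used as the selection criterion, which may differ from the method used in the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the feature table: three well-separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit KMeans for each candidate k and record the silhouette score
# (higher means tighter, better-separated clusters).
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Keep the k with the best score.
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

The elbow method on inertia is a common alternative; the silhouette score is used here only because it yields a single number to maximize.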

Technology stack used: Pandas, Matplotlib, Scikit-learn

Pix2Pix with GAN, 2023

As part of the final project in the DLS part 1 course, a GAN model was developed to solve the pix2pix problem: changing the style of an image. The model converts a face in an image into comic style.

Model inference is implemented via a Telegram bot (currently inactive). To run image processing yourself, download the weights, enter your Telegram bot token in the bot.py file, and run it.

Technology stack used: PyTorch, Torchvision, Matplotlib

RuCode: Vehicle Color Recognition (5th place), 2022

This repository presents a solution to the task "Vehicle Color Recognition" of the artificial intelligence track of the RuCode 2022 festival.

I built a pipeline to train a ResNet101 model on photos of cars of different colors. After training, the model reached an F1-score of 0.9856.
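For reference, this is how an F1-score for a multi-class task such as color recognition can be computed. It is a minimal sketch that assumes macro averaging (the averaging mode used in the competition is not stated here), and the labels are made up for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    classes = set(y_true) | set(y_pred)
    scores = []
    for c in classes:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up labels: one "red" car is misclassified as "blue".
y_true = ["red", "red", "blue", "blue", "white", "white"]
y_pred = ["red", "blue", "blue", "blue", "white", "white"]
score = macro_f1(y_true, y_pred)
```

In this toy example the per-class scores are 2/3 (red), 4/5 (blue), and 1 (white), so the macro average is 37/45.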

Technology stack used: PyTorch, Torchvision, Matplotlib
