Interests: Mathematics, competitive Data Science, cooking and boxing
Institution: Bauman Moscow State Technical University (BMSTU)
Completion Date: 2024
- Languages: Python, C/C++
- Frameworks & Libraries: PyTorch, Torchvision, Transformers, Keras, Scikit-learn, CatBoost, OpenCV, Streamlit, Gradio, FastAPI
- Tools & Platforms: VS Code, Git, Airflow, MLflow, Docker, DVC, AWS MinIO S3
- Classic Machine Learning (ML)
- Data Analysis (DA)
- Computer Vision (CV)
- Optical Character Recognition (OCR)
- Prompt engineering
- LLM engineering
Parser for PDF files of IFRS financial statements. An LLM extracts the values of key IFRS reporting indicators from the document, such as revenue, net profit, assets, and capital.
- Technology stack used: FastAPI, Gradio, LangChain
Implementation of a web interface based on the Gradio library for a recommender system
- Technology stack used: FastAPI, Docker, DVC, Gradio, Scikit-learn, Pandas
Implementation of a web interface based on the Streamlit library for solving the "Housing Issue" task of the artificial intelligence track of the RuCode 2024 festival.
- Technology stack used: Streamlit, Scikit-learn, CatBoost, Pandas
This repository presents a solution to the task "Housing Issue" of the artificial intelligence track of the RuCode 2024 festival.
It describes the basic data manipulations (NaN filling, EDA) performed to achieve the best result, as well as hyperparameter selection and training of the CatBoostRegressor model.
- Technology stack used: Scikit-learn, CatBoost, Pandas, Matplotlib, Optuna
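
The NaN-filling step mentioned above can be illustrated with a small, dependency-free sketch. The repository does not state which fill strategy was used, so median imputation is an assumption here, and the column name `area` and the helper `fill_nan_with_median` are hypothetical.

```python
import math
from statistics import median


def _is_missing(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_nan_with_median(rows, column):
    """Return copies of the rows with missing values in `column`
    replaced by the median of the observed values (assumed strategy)."""
    observed = [r[column] for r in rows if not _is_missing(r[column])]
    if not observed:
        # Nothing to compute a median from: return unchanged copies.
        return [dict(r) for r in rows]
    fill = median(observed)
    return [
        {**r, column: fill} if _is_missing(r[column]) else dict(r)
        for r in rows
    ]


# Hypothetical toy data: apartment area with two missing entries.
rows = [
    {"area": 30.0},
    {"area": float("nan")},
    {"area": 50.0},
    {"area": None},
    {"area": 70.0},
]
filled = fill_nan_with_median(rows, "area")
```

With Pandas, the same idea is usually a one-liner on a DataFrame column; the explicit version only makes the logic visible.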
This repository presents a top-20 solution to a multi-label classification problem using CatBoost (ML) and BERT (DL) models. The competition was held at the DLS and ecom.tech workshop.
- Technology stack used: PyTorch, Transformers, CatBoost, Pandas, Matplotlib
One of the homework assignments for the 7-bit machine learning course was to develop a recommender system.
The notebook covers removing outliers from the data, building a one-hot encoded table for training the KMeans clustering model, training the model, and selecting the hyperparameter k (number of clusters). Finally, a recommendation algorithm was implemented based on user ratings for each genre and viewing history.
- Technology stack used: Pandas, Matplotlib, Scikit-learn
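
The final recommendation step (scoring by per-genre ratings and excluding the viewing history) might look roughly like the sketch below. It leaves out the KMeans clustering stage entirely, and the function names, scoring rule, and toy data are all hypothetical rather than taken from the notebook.

```python
from collections import defaultdict


def genre_preferences(ratings, movie_genres):
    """Average the user's ratings per genre."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for movie, rating in ratings.items():
        for genre in movie_genres[movie]:
            totals[genre] += rating
            counts[genre] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}


def recommend(ratings, movie_genres, top_n=2):
    """Rank unseen movies by the mean preference of their genres."""
    prefs = genre_preferences(ratings, movie_genres)
    scores = {}
    for movie, genres in movie_genres.items():
        if movie in ratings:
            continue  # already in the viewing history
        known = [prefs[g] for g in genres if g in prefs]
        if known:  # skip movies whose genres the user never rated
            scores[movie] = sum(known) / len(known)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))[:top_n]


# Hypothetical toy data.
movie_genres = {
    "A": ["comedy"],
    "B": ["drama"],
    "C": ["comedy", "drama"],
    "D": ["comedy"],
    "E": ["drama"],
    "F": ["horror"],
}
user_ratings = {"A": 5.0, "B": 2.0, "C": 4.0}
recommended = recommend(user_ratings, movie_genres)
```

In this toy setup the user rates comedy higher than drama, so the unseen comedy ranks above the unseen drama, and the horror title is skipped because no rating covers its genre.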
As part of the final project in the DLS part 1 course, a GAN model was developed to solve a pix2pix task: changing the style of an image. The model converts a face in an image into comic style.
Model inference is implemented via a Telegram bot (currently not active), so you can run image processing yourself after downloading the weights. Just enter the token of your Telegram bot in the bot.py file and run it.
- Technology stack used: PyTorch, Torchvision, Matplotlib
This repository presents a solution to the task "Vehicle Color Recognition" of the artificial intelligence track of the RuCode 2022 festival.
I built a pipeline to train a ResNet101 model on photos of cars of different colors. After training, the model reached an F1-score of 0.9856.
- Technology stack used: PyTorch, Torchvision, Matplotlib
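
For reference, here is a minimal sketch of how an F1-score for a multi-class task such as color recognition can be computed. The averaging mode used in the competition is not stated above, so macro averaging is an assumption, and the labels and predictions are made-up toy values.

```python
def f1_per_class(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging mode)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy labels: one "red" car is misclassified as "blue".
y_true = ["red", "red", "blue", "blue", "white", "white"]
y_pred = ["red", "blue", "blue", "blue", "white", "white"]
score = macro_f1(y_true, y_pred)
```

Scikit-learn's `f1_score` with `average="macro"` computes the same quantity; the manual version is shown only to make the definition explicit.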