Content-Based News Recommendation System

A content-based recommendation system uses natural language processing and machine learning to suggest articles to users based on the content they have previously read.

Project Overview

This project implements a content-based recommendation system that fetches news articles from The Guardian API, vectorizes their text with TF-IDF, and serves personalized article recommendations through a Streamlit web app. Users can log in, read articles, like or dislike them, and view recommendations.

Project Workflow

Environment Setup:
- Configure a Python environment using Visual Studio Code.
- Install the necessary packages, including pandas, scikit-learn, Streamlit, and nltk for natural language processing.
Data Collection:
- Fetch news articles using The Guardian API. You can do this through the [Notebook](./Data%20Fecthing%20The%20Guardian.ipynb).
- Store articles' metadata and content for further processing.
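
The fetching step can be sketched as below. This is a minimal illustration, not the notebook's actual code: it assumes The Guardian Open Platform content search endpoint with its `api-key`, `show-fields`, `page`, and `page-size` query parameters, and the helper names (`build_search_url`, `parse_articles`, `fetch_articles`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Content search endpoint of The Guardian Open Platform.
GUARDIAN_SEARCH_URL = "https://content.guardianapis.com/search"


def build_search_url(api_key, query=None, page=1, page_size=50):
    """Build a search URL that also requests the plain-text article body."""
    params = {
        "api-key": api_key,
        "show-fields": "bodyText",
        "page": page,
        "page-size": page_size,
    }
    if query:
        params["q"] = query
    return f"{GUARDIAN_SEARCH_URL}?{urlencode(params)}"


def parse_articles(payload):
    """Flatten one API response into a list of article dictionaries."""
    rows = []
    for item in payload.get("response", {}).get("results", []):
        rows.append({
            "id": item.get("id"),
            "title": item.get("webTitle"),
            "url": item.get("webUrl"),
            "published": item.get("webPublicationDate"),
            "body": item.get("fields", {}).get("bodyText", ""),
        })
    return rows


def fetch_articles(api_key, query=None, page=1):
    """Download one page of results (performs a network request)."""
    with urlopen(build_search_url(api_key, query, page)) as resp:
        return parse_articles(json.load(resp))
```

The list returned by `fetch_articles` can be passed to `pandas.DataFrame(...)` and saved for the processing step. Keeping URL building and response parsing separate from the network call makes both easy to check without an API key.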
Data Processing:
- Clean and preprocess the article data, focusing on key fields like title, body, and publication date.
- Use TF-IDF (Term Frequency-Inverse Document Frequency) to vectorize the text content.
- Calculate cosine similarity between the TF-IDF vectors to measure how similar two articles are.
- Use nltk to include synonyms in the search functionality, enhancing the search experience.
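
As a rough sketch of the vectorization and similarity steps above, the snippet below uses scikit-learn's `TfidfVectorizer` and `cosine_similarity`. The three sentences are made-up stand-ins for cleaned article bodies, and the English stop-word setting is an assumption rather than a confirmed project setting.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-ins for cleaned article bodies.
articles = [
    "The central bank raised interest rates to curb inflation.",
    "Inflation slows as the central bank holds interest rates.",
    "The football team won the cup final after extra time.",
]

# Each row of the sparse matrix is one article; each column is one term.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf_matrix = vectorizer.fit_transform(articles)

# similarity[i, j] is the cosine similarity between article i and article j.
similarity = cosine_similarity(tfidf_matrix)
```

Here the two finance articles share several weighted terms, so `similarity[0, 1]` is larger than `similarity[0, 2]`, which compares a finance article with a sports article.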
Recommendation System Development:
- Implement a content-based filtering approach to recommend articles similar to those previously read by the user.
- Allow users to provide feedback (like/dislike) to improve future recommendations.
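
One common way to turn a reading history into recommendations is to average the TF-IDF vectors of the articles already read and rank the rest by cosine similarity to that average. The sketch below assumes this approach; the `recommend` function and the sample headlines are hypothetical and may differ from what `app.py` does.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def recommend(tfidf_matrix, read_indices, top_n=5):
    """Rank unread articles by similarity to the mean vector of read ones."""
    if not read_indices:
        return []
    # The sparse mean may come back as a numpy matrix; force a 2-D array.
    profile = np.asarray(tfidf_matrix[read_indices].mean(axis=0)).reshape(1, -1)
    scores = cosine_similarity(profile, tfidf_matrix).ravel()
    scores[read_indices] = -1.0  # never recommend what was already read
    ranked = np.argsort(scores)[::-1]
    unread = len(scores) - len(set(read_indices))
    return [int(i) for i in ranked[: min(top_n, unread)]]


articles = [
    "Parliament votes on the new climate bill.",
    "Climate bill faces opposition in parliament.",
    "Striker scores twice in the derby.",
    "Ministers debate climate targets in parliament.",
]
tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(articles)
recommendations = recommend(tfidf_matrix, read_indices=[0], top_n=2)
```

With article 0 as the only item read, the two other climate articles rank ahead of the sports article.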
Deployment and User Interaction:
- Serve the model through a Streamlit app.
- Provide a user interface for searching articles, viewing recommendations, and tracking reading history.
- Include user login functionality, allowing multiple users to maintain separate preferences.
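
Separate preferences per user could be kept in one small JSON file per account. The sketch below assumes that layout; the file naming, the `read`/`liked`/`disliked` keys, and the function names are all hypothetical and not taken from the repository.

```python
import json
from pathlib import Path


def _user_file(base_dir, username):
    """Path of the JSON file holding one user's preferences."""
    return Path(base_dir) / f"{username}.json"


def load_preferences(base_dir, username):
    """Return the stored preferences, or an empty record for a new user."""
    path = _user_file(base_dir, username)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"read": [], "liked": [], "disliked": []}


def record_feedback(base_dir, username, article_id, liked):
    """Mark an article as read and store a like or dislike for it."""
    prefs = load_preferences(base_dir, username)
    if article_id not in prefs["read"]:
        prefs["read"].append(article_id)
    # An article sits in at most one of the two feedback lists.
    for key in ("liked", "disliked"):
        if article_id in prefs[key]:
            prefs[key].remove(article_id)
    prefs["liked" if liked else "disliked"].append(article_id)
    path = _user_file(base_dir, username)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(prefs, indent=2), encoding="utf-8")
    return prefs
```

A real app would also validate the username before using it in a file path.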

Technologies Used

Python: Main programming language for the project.
Streamlit: For building the interactive web application.
The Guardian API: For fetching news articles.
pandas & NumPy: For data manipulation and analysis.
scikit-learn: For machine learning tasks, including TF-IDF vectorization and cosine similarity.
nltk: For natural language processing, including handling synonyms.
Matplotlib: For visualizing user feedback statistics.
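
The synonym handling attributed to nltk in this list could look roughly like the sketch below, which expands a search query with WordNet synonyms. It assumes the `wordnet` corpus has been downloaded with `nltk.download("wordnet")`; the `expand_query` helper and its `max_per_term` limit are hypothetical.

```python
def wordnet_synonyms(word):
    """Look up synonyms in WordNet (needs nltk and its 'wordnet' corpus)."""
    from nltk.corpus import wordnet  # imported lazily so nltk stays optional

    names = set()
    for synset in wordnet.synsets(word):
        for lemma_name in synset.lemma_names():
            names.add(lemma_name.replace("_", " ").lower())
    names.discard(word.lower())
    return sorted(names)


def expand_query(query, synonym_fn=wordnet_synonyms, max_per_term=3):
    """Return the query terms followed by a few synonyms for each term."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonym in synonym_fn(term)[:max_per_term]:
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded
```

Passing the lookup as `synonym_fn` keeps the expansion logic independent of WordNet, so it can be exercised with a small in-memory dictionary.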

Project Goals

Demonstrate the ability to build and deploy a content-based recommendation system.
Provide a user-friendly web application for interacting with the recommendation system.
Implement user feedback mechanisms to refine and personalize recommendations.

How to Run (Local)

Clone the repository to your local machine.
Set up the environment by installing the packages listed in requirements.txt.
Fetch the latest articles from The Guardian using the provided notebook (Data Fecthing The Guardian.ipynb).
Run the Streamlit app with `streamlit run app.py` to start the recommendation system.
Explore, search, and get recommendations based on your reading history.

Future Work

Enhance the model by incorporating user feedback (likes/dislikes) into the recommendation algorithm.
Explore additional machine learning algorithms for improving recommendations.
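
One possible direction for the feedback item above is a Rocchio-style profile: add the average vector of liked articles and subtract a scaled average of disliked ones. This is only an illustrative assumption, not the project's planned method, and the weights `alpha` and `beta` are arbitrary placeholders.

```python
import numpy as np


def feedback_profile(vectors, liked, disliked, alpha=1.0, beta=0.5):
    """Combine liked and disliked article vectors into one profile vector."""
    profile = np.zeros(vectors.shape[1])
    if liked:
        profile += alpha * vectors[liked].mean(axis=0)
    if disliked:
        profile -= beta * vectors[disliked].mean(axis=0)
    return profile


def score_articles(vectors, profile):
    """Cosine similarity between the profile and every article vector."""
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(profile)
    norms[norms == 0] = 1.0  # avoid division by zero for empty vectors
    return vectors @ profile / norms


# Toy dense vectors over three terms: [politics, sport, science].
vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.0, 0.6],
    [0.0, 0.9, 0.1],
])
profile = feedback_profile(vectors, liked=[0], disliked=[1])
scores = score_articles(vectors, profile)
```

In this toy case the politics-and-science article (index 2) scores above the sports article (index 3), because the disliked sports vector pushes the profile away from sport.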


EmanuelNovelo/content-based-recommendation-system

Folders and files

Latest commit

History

Repository files navigation

Content-Based News Recommendation System

Project Overview

Project Workflow

Technologies Used

Project Goals

How to Run (Local)

Future Work

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages