Quran Foundation API Scraper (Chapters, Verses, Translations, and Transliterations)

A Node.js application that scrapes Quran verse data from the Quran.com API (v4), including translations in Malay, Indonesian, and English.

Features

Fetches all 114 chapters of the Quran
Includes word-by-word translations and transliterations
Supports translations in multiple languages:
- Malay (ms)
- Indonesian (id)
- English (en)
- Note: Word-by-word translations for Malay (ms) currently fall back to English
Includes Arabic text in both Uthmani script and Tajweed format
Implements rate limiting to avoid API overload
Saves data in structured JSON format

Prerequisites

Node.js (v14 or higher) or Bun runtime
Basic understanding of JavaScript/Node.js
Internet connection for API access

Installation

Using npm

npm install

Using Bun

bun install

Configuration

The following constants can be modified in index.js to customize the scraping behavior:

const BASE_URL = "https://api.quran.com/api/v4"; // API endpoint URL
const OUTPUT_DIR = "./chapters"; // Output directory for JSON files
const MAX_CHAPTERS_COUNT = 114; // Total number of chapters to fetch

Additional configurable parameters in the API requests:

per_page: Number of verses per page (default: 50)
translations: Translation resource IDs (default: "39,131,33" for Malay, English, and Indonesian)
language: Language for word translations
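
As a rough illustration of how these parameters might be combined into a request URL, the sketch below builds one with the standard URL API. The helper name buildVersesUrl and the page parameter are assumptions made for this example; index.js may assemble its requests differently.

```javascript
// Hypothetical helper: builds a verses-by-chapter URL from the documented parameters.
const BASE_URL = "https://api.quran.com/api/v4";

function buildVersesUrl(chapterNumber, options = {}) {
  const {
    page = 1, // assumed pagination parameter, not listed above
    perPage = 50, // documented default for per_page
    translations = "39,131,33", // documented default resource IDs
    language = "en", // language for word translations
  } = options;

  const url = new URL(`${BASE_URL}/verses/by_chapter/${chapterNumber}`);
  url.searchParams.set("page", String(page));
  url.searchParams.set("per_page", String(perPage));
  url.searchParams.set("translations", translations);
  url.searchParams.set("language", language);
  return url.toString();
}

// Example: URL for the first page of chapter 1 with Indonesian word translations
console.log(buildVersesUrl(1, { language: "id" }));
```

Keeping the defaults in one place means a change to per_page or the translation IDs only has to be made once.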

Usage

Run the scraper with either npm or Bun:

npm start

bun start

The script will:

Create a chapters directory if it doesn't exist
Fetch all chapters sequentially (1-114)
Save each chapter as a separate JSON file (e.g., 1.json, 2.json, 114.json)
Display progress information in the console
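
The steps above can be sketched roughly as follows. This is a simplified illustration, not the code in index.js: the function name scrapeAll and the injected fetchChapter callback are assumptions, chosen so the loop can be read (and tried) without touching the network.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical sketch of the main loop. `fetchChapter` is an assumed async
// function that resolves to the data for one chapter.
async function scrapeAll(fetchChapter, outputDir = "./chapters", maxChapters = 114) {
  // Create the output directory if it doesn't exist
  fs.mkdirSync(outputDir, { recursive: true });

  // Fetch chapters one at a time, in order
  for (let chapter = 1; chapter <= maxChapters; chapter++) {
    const startedAt = Date.now();
    const data = await fetchChapter(chapter);

    // Save each chapter as its own JSON file, e.g. chapters/1.json
    const filePath = path.join(outputDir, `${chapter}.json`);
    fs.writeFileSync(filePath, JSON.stringify(data, null, 2));

    // Report progress in the console
    const seconds = ((Date.now() - startedAt) / 1000).toFixed(2);
    console.log(`Saved chapter ${chapter} in ${seconds} seconds`);
  }
}
```

Because chapters are awaited one by one, only a single request is in flight at any time.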

Example directory structure after running the script:

chapters/
├── 1.json  # Al-Fatihah
├── 2.json  # Al-Baqarah
├── 3.json  # Ali 'Imran
...
└── 114.json # An-Nas

Example console output while running:

Starting Quran verses scraping...

==================================================
Starting Chapter 1
--------------------------------------------------
Fetching page 1 of chapter 1 of 114
Fetched 7 verses for chapter 1
Saved chapter 1

Chapter 1 completed in 0.82 seconds

==================================================
Starting Chapter 2
--------------------------------------------------
Fetching page 1 of chapter 2 of 114
Fetched 286 verses for chapter 2
Saved chapter 2

Chapter 2 completed in 2.15 seconds

==================================================
Starting Chapter 3
--------------------------------------------------
Fetching page 1 of chapter 3 of 114
Fetched 200 verses for chapter 3
Saved chapter 3

Chapter 3 completed in 1.53 seconds

...

==================================================
Starting Chapter 114
--------------------------------------------------
Fetching page 1 of chapter 114 of 114
Fetched 6 verses for chapter 114
Saved chapter 114

Chapter 114 completed in 0.65 seconds

Scraping completed!

Output Format

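Each chapter is saved as a JSON file with the following structure. The schema lists the main fields only; saved files also keep additional fields returned by the API (for example hizb_number, page_number, and per-word transliteration), as the example output shows: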

{
  "chapter": {
    "id": number,
    "revelation_place": string,
    "revelation_order": number,
    "bismillah_pre": boolean,
    "name_simple": string,
    "name_complex": string,
    "name_arabic": string,
    "verses_count": number,
    "pages": number[],
    "translated_names": {
      "ms": string,
      "id": string,
      "en": string
    }
  },
  "verses": [
    {
      "id": number,
      "verse_number": number,
      "verse_key": string,
      "text_uthmani": string,
      "words": [
        {
          "text_uthmani": string,
          "text_uthmani_tajweed": string,
          "translations": {
            "ms": string, // Note: Currently using English translations
            "id": string,
            "en": string
          }
        }
      ],
      "translations": {
        "ms": string,
        "id": string,
        "en": string
      }
    }
  ]
}
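
To show how this structure might be consumed, here is a small sketch that turns one saved chapter object into readable lines. The helper name summarizeChapter is hypothetical; it relies only on fields listed in the schema above.

```javascript
// Hypothetical helper: summarizes one saved chapter object.
// `lang` is one of the translation keys in the schema: "ms", "id", or "en".
function summarizeChapter(data, lang = "en") {
  const { id, name_simple, verses_count } = data.chapter;
  const title = `${id}. ${name_simple} (${verses_count} verses)`;

  // One line per verse: the verse key followed by the chosen translation
  const lines = data.verses.map(
    (verse) => `${verse.verse_key} ${verse.translations[lang]}`
  );
  return [title, ...lines];
}

// Usage, assuming the scraper has already written ./chapters/1.json:
// const data = JSON.parse(require("fs").readFileSync("./chapters/1.json", "utf8"));
// console.log(summarizeChapter(data, "id").join("\n"));
```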

Example Chapter JSON Output

{
  "chapter": {
    "id": 1,
    "revelation_place": "makkah",
    "revelation_order": 5,
    "bismillah_pre": false,
    "name_simple": "Al-Fatihah",
    "name_complex": "Al-Fātiĥah",
    "name_arabic": "الفاتحة",
    "verses_count": 7,
    "pages": [
      1,
      1
    ],
    "translated_names": {
      "ms": "Pembukaan",
      "id": "Pembukaan",
      "en": "The Opener"
    }
  },
  "verses": [
    {
      "id": 1,
      "verse_number": 1,
      "verse_key": "1:1",
      "hizb_number": 1,
      "rub_el_hizb_number": 1,
      "ruku_number": 1,
      "manzil_number": 1,
      "sajdah_number": null,
      "text_uthmani": "بِسْمِ ٱللَّهِ ٱلرَّحْمَـٰنِ ٱلرَّحِيمِ",
      "page_number": 1,
      "juz_number": 1,
      "words": [
        {
          "id": 1,
          "position": 1,
          "audio_url": "wbw/001_001_001.mp3",
          "char_type_name": "word",
          "text_uthmani": "بِسْمِ",
          "text_uthmani_tajweed": "بِسۡمِ",
          "page_number": 1,
          "line_number": 2,
          "text": "بِسْمِ",
          "translation": {
            "text": "In (the) name",
            "language_name": "english"
          },
          "transliteration": {
            "text": "bis'mi",
            "language_name": "english"
          },
          "translations": {
            "ms": "In (the) name",
            "id": "dengan nama",
            "en": "In (the) name"
          }
        },
        {
          "id": 2,
          "position": 2,
          "audio_url": "wbw/001_001_002.mp3",
          "char_type_name": "word",
          "text_uthmani": "ٱللَّهِ",
          "text_uthmani_tajweed": "<rule class=ham_wasl>ٱ</rule>للَّهِ",
          "page_number": 1,
          "line_number": 2,
          "text": "ٱللَّهِ",
          "translation": {
            "text": "(of) Allah",
            "language_name": "english"
          },
          "transliteration": {
            "text": "l-lahi",
            "language_name": "english"
          },
          "translations": {
            "ms": "(of) Allah",
            "id": "Allah",
            "en": "(of) Allah"
          }
        },
        {
          "id": 3,
          "position": 3,
          "audio_url": "wbw/001_001_003.mp3",
          "char_type_name": "word",
          "text_uthmani": "ٱلرَّحْمَـٰنِ",
          "text_uthmani_tajweed": "<rule class=ham_wasl>ٱ</rule><rule class=laam_shamsiyah>ل</rule>رَّحۡمَ<rule class=madda_normal>ـٰ</rule>نِ",
          "page_number": 1,
          "line_number": 2,
          "text": "ٱلرَّحْمَـٰنِ",
          "translation": {
            "text": "the Most Gracious",
            "language_name": "english"
          },
          "transliteration": {
            "text": "l-raḥmāni",
            "language_name": "english"
          },
          "translations": {
            "ms": "the Most Gracious",
            "id": "Maha Pengasih",
            "en": "the Most Gracious"
          }
        },
        {
          "id": 4,
          "position": 4,
          "audio_url": "wbw/001_001_004.mp3",
          "char_type_name": "word",
          "text_uthmani": "ٱلرَّحِيمِ",
          "text_uthmani_tajweed": "<rule class=ham_wasl>ٱ</rule><rule class=laam_shamsiyah>ل</rule>رَّح<rule class=madda_permissible>ِي</rule>مِ",
          "page_number": 1,
          "line_number": 2,
          "text": "ٱلرَّحِيمِ",
          "translation": {
            "text": "the Most Merciful",
            "language_name": "english"
          },
          "transliteration": {
            "text": "l-raḥīmi",
            "language_name": "english"
          },
          "translations": {
            "ms": "the Most Merciful",
            "id": "Maha Penyayang",
            "en": "the Most Merciful"
          }
        },
        {
          "id": 5,
          "position": 5,
          "audio_url": null,
          "char_type_name": "end",
          "text_uthmani": "١",
          "text_uthmani_tajweed": "١",
          "page_number": 1,
          "line_number": 2,
          "text": "١",
          "translation": {
            "text": "(1)",
            "language_name": "english"
          },
          "transliteration": {
            "text": null,
            "language_name": "english"
          },
          "translations": {
            "ms": "(1)",
            "id": "(1)",
            "en": "(1)"
          }
        }
      ],
      "translations": {
        "ms": "Dengan nama Allah, Yang Maha Pemurah, lagi Maha Mengasihani.",
        "id": "Dengan nama Allah Yang Maha Pengasih, Maha Penyayang.",
        "en": "In the Name of Allah—the Most Compassionate, Most Merciful."
      }
    },
    ...more verses...
  ]
}
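
In the example above, text_uthmani_tajweed wraps some letters in <rule class=...> tags. If only the bare text or the list of rule names is needed, something like the following sketch may help. Both helper names are hypothetical, and the regular expressions assume the tag shape seen in the example (an optional pair of quotes around the class name is tolerated).

```javascript
// Hypothetical helpers for the <rule class=...> markup shown in the example output.
const RULE_TAG = /<\/?rule[^>]*>/g;
const RULE_CLASS = /<rule\s+class="?([\w-]+)"?\s*>/g;

// Removes every opening and closing rule tag, keeping the letters inside.
function stripTajweed(text) {
  return text.replace(RULE_TAG, "");
}

// Returns the rule class names in the order they appear.
function listTajweedRules(text) {
  return Array.from(text.matchAll(RULE_CLASS), (match) => match[1]);
}

// Usage with a word object from a saved chapter file:
// const word = data.verses[0].words[1];
// console.log(stripTajweed(word.text_uthmani_tajweed));
// console.log(listTajweedRules(word.text_uthmani_tajweed));
```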

API Reference

This project uses the Quran.com API v4. The main endpoints used are:

/verses/by_chapter/{chapter_number} - Fetches verses for a specific chapter
/chapters/{chapter_number} - Fetches chapter metadata
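
Since verses are requested in pages, a chapter longer than per_page needs several requests to the verses endpoint. The sketch below shows one way to collect all pages. It assumes each response carries a pagination object whose next_page is null on the last page; the names fetchAllVerses and fetchPage are hypothetical, and the actual script may paginate differently.

```javascript
// Hypothetical sketch: collects every page of verses for one chapter.
// `fetchPage(chapterNumber, page)` is an assumed async function resolving to
// { verses: [...], pagination: { next_page: number | null } }.
async function fetchAllVerses(chapterNumber, fetchPage) {
  const verses = [];
  let page = 1;

  while (page !== null && page !== undefined) {
    const body = await fetchPage(chapterNumber, page);
    verses.push(...body.verses);
    // Stop when the response reports no further page
    page = body.pagination ? body.pagination.next_page : null;
  }
  return verses;
}

// Usage on a runtime with a global fetch (Node.js 18+ or Bun):
// const fetchPage = (chapter, page) =>
//   fetch(`https://api.quran.com/api/v4/verses/by_chapter/${chapter}?page=${page}&per_page=50`)
//     .then((res) => res.json());
// fetchAllVerses(2, fetchPage).then((verses) => console.log(verses.length));
```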

Rate Limiting

The script implements rate limiting to avoid overwhelming the API. Default settings:

50 requests per minute
3-second delay between chapter requests
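
A minimal sketch of how limits like these could be enforced is shown below. It is not taken from index.js: createRateLimiter and sleep are hypothetical helpers, and the limiter simply spaces calls evenly (50 requests per minute works out to one request every 1200 ms).

```javascript
// Resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical limiter: spaces calls at least 60000 / requestsPerMinute ms apart.
function createRateLimiter(requestsPerMinute) {
  const minIntervalMs = 60000 / requestsPerMinute;
  let nextSlot = 0; // earliest time (ms since epoch) the next call may proceed

  return async function waitForSlot() {
    const now = Date.now();
    const waitMs = Math.max(0, nextSlot - now);
    // Reserve the following slot before waiting, so overlapping callers queue up
    nextSlot = Math.max(now, nextSlot) + minIntervalMs;
    if (waitMs > 0) await sleep(waitMs);
    return waitMs;
  };
}

// Usage:
// const waitForSlot = createRateLimiter(50);
// await waitForSlot();   // before every API request
// await sleep(3000);     // between chapters
```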

Troubleshooting

Common Issues

API Rate Limit Exceeded
- Increase the delay between requests
- Reduce the number of concurrent requests
Network Errors
- Check your internet connection
- Verify API endpoint accessibility
- The script will automatically retry failed requests
Memory Issues with Large Datasets
- Adjust the per_page parameter
- Process chapters in smaller batches
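
For reference, retrying with a growing delay can be sketched as below. This illustrates the general technique (exponential backoff) rather than the script's actual retry logic; the name withRetry and its default values are assumptions.

```javascript
// Hypothetical helper: retries an async task, doubling the delay after each failure.
async function withRetry(task, { retries = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break; // no attempts left
      const delayMs = baseDelayMs * 2 ** attempt; // 1x, 2x, 4x, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Usage, where fetchChapter is whatever function performs the request:
// const data = await withRetry(() => fetchChapter(2), { retries: 5, baseDelayMs: 2000 });
```

Raising baseDelayMs is one way to act on the "increase the delay between requests" advice when the rate limit is hit.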

Contributing

Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Create a new Pull Request

Performance Tips

Use Bun runtime for faster execution
Adjust batch sizes based on your system's capabilities
Consider using stream processing for large chapters
Implement caching for frequently accessed data
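
As one possible reading of the caching tip, the sketch below keeps parsed chapter files in memory so that each file is read from disk only once. The helper name createChapterCache is hypothetical and is not part of the scraper.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical in-memory cache over the saved chapter files.
function createChapterCache(outputDir = "./chapters") {
  const cache = new Map();

  return function getChapter(chapterNumber) {
    if (!cache.has(chapterNumber)) {
      // First access: read and parse the file, then remember the result
      const filePath = path.join(outputDir, `${chapterNumber}.json`);
      cache.set(chapterNumber, JSON.parse(fs.readFileSync(filePath, "utf8")));
    }
    return cache.get(chapterNumber);
  };
}

// Usage, assuming ./chapters has been populated by the scraper:
// const getChapter = createChapterCache();
// console.log(getChapter(1).chapter.name_simple);
```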

Acknowledgments

Quran.com API for providing the data
Translation resources:
- MS (ID: 39): Abdullah Muhammad Basmeih - Malay translation
- EN (ID: 131): Dr. Mustafa Khattab, The Clear Quran - English translation
- ID (ID: 33): Indonesian Islamic Affairs Ministry - Indonesian translation
