help: i18n sitemap with more than 50000 urls #372

Closed
chichi13 opened this issue Oct 13, 2024 · 3 comments
Labels
help wanted Extra attention is needed

Comments

@chichi13

📚 What are you trying to do?

I have an i18n website with a lot of URLs, and I'm about to pass the 50,000 URL mark. As you may know, Google limits each sitemap to 50,000 URLs, so I'd like to know how I can split my sitemap into several files. I looked at the documentation but couldn't get it to work. Here's the code I currently have:

sitemap: {
  cacheMaxAgeSeconds: 3600,
  gzip: true,
  exclude: ["/admin/**", "/auth/**"],
  urls: async () => {
    const baseUrl = process.env.API_BASE_URL || "http://localhost:8000";
    const languages = ["", "fr", "es"]; // Empty string for default locale (no prefix)
    const sources = [
      `${baseUrl}/api/v1/sitemap/events`,
      `${baseUrl}/api/v1/sitemap/streamers`,
      `${baseUrl}/api/v1/sitemap/games`,
    ];

    const fetchUrls = async (source: string) => {
      const response = await fetch(source);
      const urls = await response.json();
      return urls.flatMap((url) =>
        languages.map((lang) => ({
          loc: lang ? `/${lang}${url.loc}` : url.loc,
          lastmod: url.lastmod,
          priority: url.priority,
          image: url.image
            ? [
                {
                  loc: url.image.loc,
                  title: url.image.title,
                  caption: url.image.caption,
                },
              ]
            : undefined,
        }))
      );
    };

    const allUrls = await Promise.all(sources.map(fetchUrls));
    return allUrls.flat();
  },
},

My URL paths are the same across languages; the frontend serves them under /fr, /es, and / (English, the default).

Currently my sitemap looks like this:

http://localhost:3000/__sitemap__/en-US.xml
http://localhost:3000/__sitemap__/es-ES.xml
http://localhost:3000/__sitemap__/fr-FR.xml

How can I get this kind of sitemap:

http://localhost:3000/__sitemap__/en-US.xml
http://localhost:3000/__sitemap__/en-US-2.xml
http://localhost:3000/__sitemap__/es-ES.xml
http://localhost:3000/__sitemap__/es-ES-2.xml
http://localhost:3000/__sitemap__/fr-FR.xml
http://localhost:3000/__sitemap__/fr-FR-2.xml

Or another solution?

🔍 What have you tried?

I've tried sitemaps: true together with defaultSitemapsChunkSize. I also tried manual chunking, but I couldn't get the result I wanted.
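For reference, the arithmetic behind manual chunking can be sketched as follows. This is only a minimal illustration in plain TypeScript, not the module's built-in chunking: `chunk`, `chunkName`, and `SITEMAP_URL_LIMIT` are hypothetical names, and how the resulting chunks would be wired into the sitemap config is left open.

```typescript
// Hypothetical constant: the per-sitemap URL limit mentioned above.
const SITEMAP_URL_LIMIT = 50_000;

// Split a flat list of entries into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number = SITEMAP_URL_LIMIT): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Name chunks in the style asked for above: "en-US", "en-US-2", "en-US-3", ...
function chunkName(base: string, index: number): string {
  return index === 0 ? base : `${base}-${index + 1}`;
}
```

With 120,000 entries for one locale, `chunk` would yield three lists (50,000, 50,000, and 20,000 entries), which `chunkName` would label en-US, en-US-2, and en-US-3.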

ℹ️ Additional context

My backend is FastAPI (Python), and I can of course change its code if needed.

@chichi13 added the help wanted label on Oct 13, 2024
@rayblair06
Contributor

Unfortunately I don't think the chunking functionality is currently supported when combined with custom urls or sources.

#265

@chichi13
Author

chichi13 commented Oct 15, 2024

Okay, that's what I suspected.

I ended up doing it like this:

  sitemap: {
    cacheMaxAgeSeconds: 3600,
    gzip: true,
    exclude: ["/admin/**", "/auth/**"],
    autoLastmod: true,
    sitemaps: {
      events: {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/events`,
        ],
      },
      streamers: {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/streamers`,
        ],
      },
      games: {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games`,
        ],
      },
      "games-2": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=2`,
        ],
      },
      "games-3": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=3`,
        ],
      },
      "games-4": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=4`,
        ],
      },
      "games-5": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/games?page=5`,
        ],
      },
      "events-groups": {
        sources: [
          `${
            process.env.API_BASE_URL || "http://localhost:8000"
          }/api/v1/sitemap/events/groups`,
        ],
      },
      pages: {
        includeAppSources: true,
      },
    },
  },

Not a big fan because it's not dynamic. So if someone has a dynamic solution I'll take it :D
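One way to cut the repetition, short of official support, might be to generate the `sitemaps` object with a small helper. The sketch below is an assumption-laden illustration, not a module feature: `buildSitemaps`, `Section`, and `SitemapEntry` are hypothetical names, only the `sources` and `includeAppSources` keys from the config above are used, and the page counts still have to be known when the config is evaluated (for example from a constant or an environment variable), so it is only semi-dynamic.

```typescript
// Shape of one `sitemaps` entry, limited to the two keys used in this thread.
interface SitemapEntry {
  sources?: string[];
  includeAppSources?: boolean;
}

// Hypothetical description of one backend endpoint and its page count.
interface Section {
  name: string;   // sitemap name, e.g. "games"
  path: string;   // endpoint path, e.g. "/api/v1/sitemap/games"
  pages?: number; // number of ?page=N pages; defaults to 1
}

// Build the same structure as the hand-written config: "games", "games-2", ...
function buildSitemaps(
  baseUrl: string,
  sections: Section[],
): Record<string, SitemapEntry> {
  const sitemaps: Record<string, SitemapEntry> = {};
  for (const { name, path, pages = 1 } of sections) {
    for (let page = 1; page <= pages; page++) {
      // Page 1 keeps the bare name and URL, matching the config above.
      const key = page === 1 ? name : `${name}-${page}`;
      const query = page === 1 ? "" : `?page=${page}`;
      sitemaps[key] = { sources: [`${baseUrl}${path}${query}`] };
    }
  }
  // Routes discovered from the app itself, as in the original `pages` entry.
  sitemaps["pages"] = { includeAppSources: true };
  return sitemaps;
}

// In nuxt.config.ts the base URL would come from process.env.API_BASE_URL.
const exampleSitemaps = buildSitemaps("http://localhost:8000", [
  { name: "events", path: "/api/v1/sitemap/events" },
  { name: "streamers", path: "/api/v1/sitemap/streamers" },
  { name: "games", path: "/api/v1/sitemap/games", pages: 5 },
  { name: "events-groups", path: "/api/v1/sitemap/events/groups" },
]);
```

The result could then be assigned as `sitemaps: exampleSitemaps` in the sitemap config. Raising `pages` for a section adds the extra entries without copying another block by hand.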

@harlan-zw
Collaborator

Glad you could find a workaround. For a proper solution you will need to wait for official support in #265.

Will track in that issue.

@harlan-zw closed this as not planned on Oct 16, 2024