A package for scraping manga from various manga websites
Mangascraper is a package for scraping manga. It is a solution for retrieving manga from sites that do not offer an API. Each method either returns a Promise or, if a callback function is provided, passes its result to that callback.
```sh
npm install @specify_/mangascraper
```
Currently, mangascraper supports 7 sources, and more are planned.
Source | Supported? | Uses puppeteer? | Uses axios? |
---|---|---|---|
MangaBox | ✔️ | ✔️ | ✔️ |
Mangafreak | ❌ | N/A | N/A |
Mangakakalot | ✔️ | ❌ | ✔️ |
Manganato | ✔️ | ❌ | ✔️ |
Mangahasu | ✔️ | ❌ | ✔️ |
Mangaparkv2 | ✔️ | ✔️ | ❌ |
Mangasee | ✔️ | ✔️ | ❌ |
Readmng | ✔️ | ✔️ | ✔️ |
Kissmanga | ❌ | N/A | N/A |
If a supported source uses axios, mangascraper will try to use axios as much as possible to save computer resources. If the network request is blocked by Cloudflare, mangascraper will resort to using puppeteer.
If a supported source uses both axios and puppeteer, some of its methods use axios and others use puppeteer. For example, Readmng uses puppeteer for `search()`, but uses axios for `getMangaMeta()` and `getPages()`.
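The "lightweight request first, browser second" strategy described above can be sketched in a few lines. This is only an illustration of the idea, not mangascraper's internal code: `fetchWithFallback`, `blockedHttpRequest`, and `browserRequest` are hypothetical names, and the two fetchers are stand-ins for an axios request and a puppeteer page load.

```javascript
// Hypothetical helper: try the cheap fetcher first, fall back to the heavy one.
async function fetchWithFallback(url, primary, fallback) {
  try {
    return await primary(url);
  } catch (primaryError) {
    // For example, the plain HTTP request was blocked; retry with the fallback.
    return fallback(url);
  }
}

// Stand-in for an axios request that gets blocked.
const blockedHttpRequest = async () => {
  throw new Error('403 Forbidden');
};

// Stand-in for loading the page in a puppeteer-controlled browser.
const browserRequest = async (url) => `<html>rendered ${url}</html>`;

fetchWithFallback('https://example.com/manga', blockedHttpRequest, browserRequest)
  .then((html) => console.log(html));
```

The heavier fetcher only runs when the first one throws, which is what keeps resource usage low for sources that rarely block plain requests.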
To start using the package, import a class such as `Mangakakalot` from the package and use its methods to get manga from that source.
Here’s an example:
```js
import { Manganato } from '@specify_/mangascraper';

const manganato = new Manganato();

(async () => {
  const mangas = await manganato.search('One Piece');
  const meta = await manganato.getMangaMeta(mangas[0].url);
  console.log(meta.chapters);
})();
```
which outputs…
```js
[
  {
    name: 'Chapter 1007',
    url: 'https://readmanganato.com/manga-aa951409/chapter-1007',
    views: '730,899',
    uploadDate: 2021-03-12T07:00:00.000Z
  },
  {
    name: 'Chapter 1006',
    url: 'https://readmanganato.com/manga-aa951409/chapter-1006',
    views: '364,964',
    uploadDate: 2021-03-05T07:00:00.000Z
  },
  ... and more items
]
```
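As a small follow-up, here is one way the chapter list could be post-processed. The sample data is copied from the output above; `parseViews` is a hypothetical helper, and the sketch assumes `uploadDate` is a `Date` object and `views` is a comma-separated string, as the printed output suggests.

```javascript
// Sample shaped like the output above (the url field is omitted for brevity).
const chapters = [
  { name: 'Chapter 1007', views: '730,899', uploadDate: new Date('2021-03-12T07:00:00.000Z') },
  { name: 'Chapter 1006', views: '364,964', uploadDate: new Date('2021-03-05T07:00:00.000Z') },
];

// Hypothetical helper: turn a string such as '730,899' into the number 730899.
function parseViews(views) {
  return Number(views.replace(/,/g, ''));
}

// Oldest chapter first, without mutating the original array.
const oldestFirst = [...chapters].sort((a, b) => a.uploadDate - b.uploadDate);

// Sum of the view counts across all chapters in the sample.
const totalViews = chapters.reduce((sum, chapter) => sum + parseViews(chapter.views), 0);

console.log(oldestFirst.map((chapter) => chapter.name)); // [ 'Chapter 1006', 'Chapter 1007' ]
console.log(totalViews); // 1095863
```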
If you already have an existing puppeteer endpoint, mangascraper can connect to that endpoint instead and perform faster concurrent operations.
Mangascraper also includes its own puppeteer launch arguments, and it is recommended to use them for scraping to go smoothly.
```js
import puppeteer from 'puppeteer';
import { initPuppeteer, MangaSee } from '@specify_/mangascraper';

(async () => {
  // Launch a browser with mangascraper's recommended launch arguments
  const browser = await puppeteer.launch({ ...initPuppeteer });
  const endpoint = browser.wsEndpoint();
  browser.disconnect();

  // Let mangascraper connect to the running browser through its WebSocket endpoint
  const mangasee = new MangaSee({ puppeteerInstance: { instance: 'endpoint', wsEndpoint: endpoint } });
  const mangas = await mangasee.search('Haikyu!');
})();
```
Since you are using your own puppeteer package, mangascraper cannot make any modifications to the browser, such as adding a proxy.

```js
const browser = await puppeteer.launch();
const mangapark = new MangaPark({
  proxy: { host: '127.0.0.1', port: 8080 },
  puppeteerInstance: { instance: 'custom', browser },
}); // ❌ Mangascraper cannot include proxy
```

```js
const browser = await puppeteer.launch({ args: ['--proxy-server=127.0.0.1:8080'] });
const mangapark = new MangaPark({ puppeteerInstance: { instance: 'custom', browser } }); // ✔️ Our own browser instance will launch with a proxy
```
Because mangascraper is connecting to an existing endpoint, you must configure all browser arguments outside of mangascraper.
To override the launch arguments mangascraper uses, pass a `launch` object to any manga class, such as `MangaSee`, as long as you are using the `default` instance. With any other instance type, you must supply your own launch options or reuse mangascraper's puppeteer options by spreading `initPuppeteer`.

```js
const mangasee = new MangaSee({ puppeteerInstance: { instance: 'default', launch: { ...myCustomLaunchOptions } } });
```
If you pass a `proxy` option while using the `default` instance, mangascraper will automatically add it to the launch arguments.

```js
const mangahasu = new Mangahasu({
  proxy: { host: 'proxy_host', port: 8080 },
  puppeteerInstance: { instance: 'default' },
});
```
Passing an existing puppeteer browser from your app lets mangascraper reuse one browser instead of opening a new one for each operation. It also lets mangascraper scrape manga concurrently. This puts less load on Chromium and can save a lot of time if you run many scraping operations. It is the best approach if you do not want to connect to an existing endpoint.
However, you must have puppeteer already installed.
This is the most basic setup:
```js
import puppeteer from 'puppeteer';
import { MangaPark, initPuppeteer } from '@specify_/mangascraper';

(async () => {
  const browser = await puppeteer.launch(initPuppeteer);
  const mangapark = new MangaPark({ puppeteerInstance: { instance: 'custom', browser } });
})();
```
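Because one shared browser allows concurrent scraping, it can help to cap how many operations run at once. The following is a minimal sketch of that idea, not something mangascraper provides: `mapInBatches` is a hypothetical helper, and the batch size of 2 in the demo is arbitrary.

```javascript
// Hypothetical helper: run `task` over `items`, at most `batchSize` at a time.
async function mapInBatches(items, batchSize, task) {
  const results = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Tasks inside a batch run concurrently; batches run one after another.
    results.push(...(await Promise.all(batch.map(task))));
  }
  return results;
}

// Demo with a stand-in task; a real task could be
// (manga) => mangapark.getMangaMeta(manga.url)
mapInBatches([1, 2, 3, 4, 5], 2, async (n) => n * 2)
  .then((doubled) => console.log(doubled)); // [ 2, 4, 6, 8, 10 ]
```

Results come back in the same order as the input, so they can be matched to the original search results by index.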
Since you are using your own puppeteer package, mangascraper cannot make any modifications to the browser, such as adding a proxy.

```js
const browser = await puppeteer.launch();
const mangapark = new MangaPark({
  proxy: { host: '127.0.0.1', port: 8080 },
  puppeteerInstance: { instance: 'custom', browser },
}); // ❌ Mangascraper cannot include a proxy
```

```js
const browser = await puppeteer.launch({ args: ['--proxy-server=127.0.0.1:8080'] });
const mangapark = new MangaPark({ puppeteerInstance: { instance: 'custom', browser } }); // ✔️ Our own browser instance will launch with a proxy
```
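If your proxy settings already live in a `{ host, port }` object, the launch argument can be built from it rather than written by hand. This is a small sketch with a hypothetical helper, `proxyLaunchArgs`, which is not part of mangascraper; it only produces the `--proxy-server` argument used in the example above.

```javascript
// Hypothetical helper: build the Chromium proxy argument from a { host, port } object.
function proxyLaunchArgs({ host, port }) {
  return [`--proxy-server=${host}:${port}`];
}

const launchOptions = { args: proxyLaunchArgs({ host: '127.0.0.1', port: 8080 }) };
console.log(launchOptions); // { args: [ '--proxy-server=127.0.0.1:8080' ] }
```

The resulting `launchOptions` object can then be passed to `puppeteer.launch()` before handing the browser to mangascraper.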
By default, mangascraper does not close the browser after an operation finishes. If you want the browser to close after each operation, add the following to `puppeteerInstance`:

```js
puppeteerInstance: {
  instance: 'custom',
  browser: browser,
  options: {
    closeAfterOperation: true, // After an operation is finished, close the browser
  },
}
```
However, this prevents mangascraper from proceeding to another operation after the first one finishes, as in this example:

```js
const mangapark = new MangaPark({
  puppeteerInstance: { instance: 'custom', browser, options: { closeAfterOperation: true } },
});

// ❌ The browser closes after search() has gathered the mangas that match the
// title Naruto, so the getMangaMeta() calls that follow cannot gather metadata.
await mangapark
  .search('Naruto', { orderBy: 'latest_updates' })
  .then((mangas) => Promise.all(mangas.map((manga) => mangapark.getMangaMeta(manga.url))));
```
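One way around this, sketched below, is to leave `closeAfterOperation` off and close the browser yourself once every chained operation has finished. `withBrowser` is a hypothetical helper, not part of mangascraper; it only assumes that the browser object has a `close()` method, as puppeteer's `Browser` does.

```javascript
// Hypothetical helper: run `work` with a browser and always close it afterwards.
async function withBrowser(launchBrowser, work) {
  const browser = await launchBrowser();
  try {
    return await work(browser);
  } finally {
    // Runs whether `work` succeeded or threw, so the browser is never left open.
    await browser.close();
  }
}

// Usage sketch (assumes puppeteer and mangascraper are installed and imported):
// const metas = await withBrowser(
//   () => puppeteer.launch(initPuppeteer),
//   async (browser) => {
//     const mangapark = new MangaPark({ puppeteerInstance: { instance: 'custom', browser } });
//     const mangas = await mangapark.search('Naruto', { orderBy: 'latest_updates' });
//     return Promise.all(mangas.map((manga) => mangapark.getMangaMeta(manga.url)));
//   },
// );
```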
```js
const mangas = await mangahasu.search('Fairytail');
console.log(mangas);
```

```js
mangahasu.search('Fairytail', null, (err, mangas) => {
  if (err) return console.error(err);
  console.log(mangas);
});
```
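The two snippets above run the same search in Promise style and in callback style. As a rough illustration of how a single function can serve both, here is a self-contained sketch. It is not mangascraper's actual implementation: `searchStub` is a hypothetical stand-in that returns canned data instead of scraping anything.

```javascript
// Illustrative stand-in, NOT mangascraper's real code: resolves with canned data.
function searchStub(title, filters, callback) {
  const work = Promise.resolve([{ title, url: 'https://example.com/manga/1' }]);

  // Promise style: no callback was given, so hand the Promise back to the caller.
  if (typeof callback !== 'function') return work;

  // Callback style: follow Node's error-first convention, (err, result).
  work.then(
    (mangas) => callback(null, mangas),
    (err) => callback(err),
  );
  return undefined;
}

searchStub('Fairytail').then((mangas) => console.log('promise:', mangas[0].title));

searchStub('Fairytail', null, (err, mangas) => {
  if (err) return console.error(err);
  console.log('callback:', mangas[0].title);
});
```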
```js
import { Mangakakalot } from '@specify_/mangascraper';

const mangakakalot = new Mangakakalot();

mangakakalot.search('Black Clover', function (err, mangas) {
  console.log(mangas);
});
```

```js
import { Mangakakalot } from '@specify_/mangascraper';

const mangakakalot = new Mangakakalot();

mangakakalot.getMangas({ genre: 'Isekai' }, function (err, mangas) {
  console.log(mangas);
});
```

```js
import { Mangakakalot } from '@specify_/mangascraper';

const mangakakalot = new Mangakakalot();

mangakakalot.getMangaMeta('https://mangakakalot.com/read-qt9nz158504844280', function (err, meta) {
  console.log(meta);
});
```
```js
import { Manganato } from '@specify_/mangascraper';

const manganato = new Manganato();

manganato.search('Naruto', null, function (err, mangas) {
  console.log(mangas);
});
```

```js
import { Manganato } from '@specify_/mangascraper';

const manganato = new Manganato();

// Search by genre only: no title, include Romance, exclude Drama
manganato.search(null, { genre: { include: ['Romance'], exclude: ['Drama'] } }, function (err, mangas) {
  console.log(mangas);
});
```

```js
import { Manganato } from '@specify_/mangascraper';

const manganato = new Manganato();

manganato.getMangaMeta('https://readmanganato.com/manga-dr980474', function (err, meta) {
  console.log(meta);
});
```
getMangas()
```js
import { MangaNato } from '@specify_/mangascraper';

const manganato = new MangaNato();

manganato.getMangasFromGenre('Comedy', {}, (err, mangas) => {
  console.log(mangas);
});
```
```js
import { Mangahasu } from '@specify_/mangascraper';

const mangahasu = new Mangahasu();

mangahasu.search(null, null, (err, mangas) => {
  console.log(mangas);
});
```

```js
import { Mangahasu } from '@specify_/mangascraper';

const mangahasu = new Mangahasu();

mangahasu.getMangaMeta('https://mangahasu.se/shingeki-no-kyojin-v6-p27286.html', (err, meta) => {
  console.log(meta);
});
```

```js
import { Mangahasu } from '@specify_/mangascraper';

const mangahasu = new Mangahasu();

(async () => {
  const mangas = await mangahasu.search('Attack on Titan');
  const meta = await mangahasu.getMangaMeta(mangas[0].url);
  const pages = await mangahasu.getPages(meta.chapters[0].url);
  console.log(pages);
})();
```
```js
import { MangaSee } from '@specify_/mangascraper';

const mangasee = new MangaSee({ debug: true }); // Opens puppeteer in headful mode

(async () => {
  const mangas = await mangasee.search('the melancholy of haruhi suzumiya');
  console.log(mangas);
})();
```

```js
import { MangaSee } from '@specify_/mangascraper';

const mangasee = new MangaSee();

(async () => {
  const mangas = await mangasee.directory();
  console.log(mangas);
})();
```

```js
import { MangaSee } from '@specify_/mangascraper';

const mangasee = new MangaSee();

(async () => {
  const berserk = await mangasee.getMangaMeta('https://mangasee123.com/manga/Berserk');
  console.log(berserk);
})();
```

```js
import { MangaSee } from '@specify_/mangascraper';

const mangasee = new MangaSee();

(async () => {
  const chapter363 = await mangasee.getPages('https://mangasee123.com/read-online/Berserk-chapter-363-index-2.html');
  console.log(chapter363);
})();
```
```js
import puppeteer from 'puppeteer'; // Needed for puppeteer.launch() below
import { MangaPark, initPuppeteer } from '@specify_/mangascraper';

(async () => {
  const browser = await puppeteer.launch(initPuppeteer);
  const mangapark = new MangaPark({ puppeteerInstance: { instance: 'custom', browser } });
  const mangas = await mangapark.search('noragami');
  const meta = await mangapark.getMangaMeta(mangas[0].url);
  const pages = await mangapark.getPages(meta.chapters[meta.chapters.recentlyUpdated][0].pages);
  console.log(pages);
})();
```
```js
import { ReadMng } from '@specify_/mangascraper';

(async () => {
  const readmng = new ReadMng();
  const mangas = await readmng.search();
  console.log(mangas);
})();
```
The `getMangaMeta` method of this class requires puppeteer, so if you want to get the manga meta, consider fetching it from a custom API that uses the mangascraper package.

```tsx
import React from 'react';
import { MangaBox } from '@specify_/mangascraper';

const mangabox = new MangaBox();

const App: React.FC = () => {
  const [pages, setPages] = React.useState<string[]>([]);

  // Fetch the page image URLs once, when the component mounts
  React.useEffect(() => {
    mangabox
      .getPages('https://mangabox.org/manga/solo-leveling-manhua-manga/chapter-159/')
      .then((pages) => setPages(pages))
      .catch((e) => console.error(e));
  }, []);

  return (
    <div>
      {pages.map((page) => (
        <img key={page} src={page} alt="" />
      ))}
    </div>
  );
};

export default App;
```
Distributed under MIT © Joseph Marbella